-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Elastic Agent] Proposal: Change structure of Elastic Agent config #18758
Comments
Pinging @elastic/ingest-management (Team:Ingest Management) |
@ruflin Thanks for working to simplify things! I think this is a step in the right direction. There are still a few things that are unclear to me. Perhaps is just my lack of familiarity with how the config works, but it might illustrate the questions others have seeing it the first time. I see your example has I'm also wondering what is the difference between the input Lastly, in a separate discussion we discussed calling this "Data Kind" Would it make sense to align our name here making it |
The main reason was for disambiguation and try to hide implementation details of the running beats. We could either make it two keys or have map that correlates type to processes. |
I love the addition of simplicity. |
Why I also agree with @mostlyjason on the Example:
|
There are two distinct things:
One can't imply the other as you get logs or metrics from files on S3. On the input level we have common configuration for all the streams using the same input. For example hosts, username, password so these don't have to be specified for each stream:
The @blakerouse For the |
Yes, we did multiple iterations for the output definition. So we move to |
Yeah let's move the As for my example config, I am just coming in with a new set of eyes and have barely used the current configuration. I am completely missing all back conversations on the current design, which is either a good thing or a bad thing (teaching the new guy), ;-). Being that when a user uses Fleet they really don't care what the configuration looks like, really the only users that care are those in standalone mode. To me I think it would be great if the configuration could align the same user experience writing configuration as selecting integrations in Fleet. That is why my example places the different types in the Really with my design you could think that actually the top level key of
I understand it's hidden by metricbeat but I think with the current design what draws my eye is:
|
Currently the config between UI and Config is aligned, that is one of the reasons we have datasources. By design, the agent does not know about integrations, only inputs. The part you are getting at that nginx/metrics is just a http request is correct and our end goal. I hope that one day the config for nginx/metrics will actually use the type: http instead but were are not there yet. |
In the meantime I had a conversation with @blakerouse the clarify the above. One thing we discussed was that in some cases getting started with nginx in standalone will become harder then at the moment and we discussed some ideas on how to improve this in the future if we want. I'll leave this open for later. Working on configs for autodiscovery I realised also not having the datasource part makes autodiscovery simpler (more on that in an other issue). To get this change in on time for the next release, I suggest we move forward on this. If someone has concerns about the above, please raise them in the next 48h. I filed a meta issue to track all the work needed to get us to the new config. #19082 |
Closing this issue as most of the work in #19082 is done. Further tracking happens there. |
Details around datasource config
The Agent config today consists of three level: Datasources -> Inputs -> Streams. The data sources are a group of inputs and allow configuration of output or namespace on the level above inputs. Initially we only had a list of inputs in the agent config but we introduced the data source level for a few reasons:
In the following I want to dive into each point to see if this actually still matters.
Align with the UI
Having to think about the same concept when creating a config manually or through the UI is powerful to not have two different concepts. At the same time, it seems to be ok to have a more convenient way to configure groupings in the UI then on the agent side.
This is especially true as the grouping of inputs is related to packages which are not available on the agent side. If a users configures nginx he needs to manually specify a logs and metrics input anyway and the datasource grouping will not help him much.
Importing configs
Initially the idea was that data sources would make the importing of configs possible to map it to the UI. This works in case the config imported actually matches 1-1 to a package. But if the user specifies his own inputs and groups them together in some way, this will not work anymore. The solution on our end when importing would either be putting all inputs into 1 data source or create 1 data source per input. 1 data source per input would be more likely as otherwise the UI gets more complex (many inputs in 1 data source). With this, I think the import argument is not valid anymore.
Better error reporting
This still holds but I would argue we can solve it differently through metadata on each input. Each input should support additional metadata where we can add names and ids to have a better error reporting. So when an error is reported to Fleet, we know which data sources with inputs and streams inside it belongs to. Already with datasources we would need this as reporting an error just on a data source is not enough. Having this generic meta concept allows also better error reporting in the standalone case.
Higher Level Configs
It is convenient to configure namespace and output on the data source level. We should still allow this on the UI side but is not required on the agent config level. My assumption is that most users to get started use the default namespace and the default output, this means nothing has to be configured.
Users with more complex configs are likely to use some automation to build the configs in which case specifying the output and namespace more often should not be a problem.
Summary
I think on the agent side, the arguments around having the data source object do not hold up anymore.
Proposed new config
Based on the above, I suggest we remove datasources from the agent config but add namespace, output and meta information support on the input level:
The part under meta is not understood by the agent but logged/shipped in case of an error.
This new config combined with the proposed changed for the fields used for the indexing strategy (elastic/package-registry#482) solves also the problem that there was no good way to set a different
dataset.type
for an input. For example thelog
input could also generate metrics. The config below will send data tometrics-foo-prod
:Removing the datasource part also makes getting started easier for manual configuration. The simplest config now looks as following:
It is expect that the Ingest Manager in Kibana still has a grouping of inputs available but will flatten it before shipped to the Agent.
The text was updated successfully, but these errors were encountered: