
[Elastic Agent] Proposal: Change structure of Elastic Agent config #18758

Closed
ruflin opened this issue May 27, 2020 · 11 comments

@ruflin
Member

ruflin commented May 27, 2020

Details around datasource config

The Agent config today consists of three levels: Datasources -> Inputs -> Streams. A data source is a group of inputs and allows settings such as the output or namespace to be configured one level above the inputs. Initially we only had a list of inputs in the agent config, but we introduced the data source level for a few reasons (a sketch of today's layout follows the list):

  • Align with the UI
  • Make importing of configs possible
  • Better error reporting to the UI
  • Higher level configs to be configured for multiple inputs like output, namespace, constraints
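
For reference, here is a minimal sketch of today's three-level layout (reconstructed from this proposal, so treat the exact keys as assumptions; they mirror the new examples further down):

datasources:
  - namespace: default
    use_output: default
    inputs:
      - type: system/metrics
        streams:
          - metricset: cpu
            dataset: system.cpu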

In the following I want to dive into each point to see whether it still matters.

Align with the UI

Having users think about the same concepts whether they create a config manually or through the UI is powerful, as it avoids two different mental models. At the same time, it seems acceptable for the UI to offer a more convenient way to configure groupings than the agent side does.

This is especially true as the grouping of inputs is related to packages, which are not available on the agent side. If a user configures nginx, they need to manually specify a logs and a metrics input anyway, so the datasource grouping will not help them much.

Importing configs

Initially the idea was that data sources would make importing configs possible by mapping them to the UI. This works if the imported config actually matches 1-1 to a package. But if the user specifies their own inputs and groups them together in some way, this no longer works. When importing, we would either put all inputs into 1 data source or create 1 data source per input. 1 data source per input is the more likely choice, as otherwise the UI gets more complex (many inputs in 1 data source). With this, I think the import argument is no longer valid; see the sketch below.
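
To illustrate, a hedged sketch of what an importer would have to generate with the 1-data-source-per-input option (the wrapper keys are assumptions):

# imported standalone config
inputs:
  - type: logs
    streams:
      - paths: /var/log/foo.log
  - type: system/metrics
    streams:
      - metricset: cpu

# generated config, one data source per input
datasources:
  - inputs:
      - type: logs
        streams:
          - paths: /var/log/foo.log
  - inputs:
      - type: system/metrics
        streams:
          - metricset: cpu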

Better error reporting

This still holds, but I would argue we can solve it differently through metadata on each input. Each input should support additional metadata where we can add names and ids for better error reporting. When an error is reported to Fleet, we then know which data source the inputs and streams belong to. Even with datasources we would need this, as reporting an error only at the data source level is not enough. Having this generic meta concept also allows better error reporting in the standalone case.

Higher Level Configs

It is convenient to configure the namespace and output at the data source level. We should still allow this on the UI side, but it is not required at the agent config level. My assumption is that most users getting started use the default namespace and the default output, which means nothing has to be configured.

Users with more complex configs are likely to use some automation to build them, in which case specifying the output and namespace more often should not be a problem.

Summary

I think on the agent side, the arguments for having the data source object no longer hold up.

Proposed new config

Based on the above, I suggest we remove datasources from the agent config but add support for namespace, output, and meta information at the input level:

inputs:
  - type: system/metrics
    namespace: default
    use_output: default
    meta:
      package.name: bar
      settings.id: foo
      hello: world
    streams:
      - metricset: cpu
        dataset: system.cpu

The part under meta is not understood by the agent but is logged/shipped in case of an error.
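
Purely as an illustration of the idea (the actual error format is not defined in this issue), an error shipped to Fleet could carry the meta block along, for example:

error:
  message: failed to start stream
  input.type: system/metrics
  meta:
    package.name: bar
    settings.id: foo
    hello: world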

This new config, combined with the proposed changes to the fields used for the indexing strategy (elastic/package-registry#482), also solves the problem that there was no good way to set a different dataset.type for an input. For example, the log input could also generate metrics. The config below will send data to metrics-foo-prod:

inputs:
  - type: logs
    dataset.type: metrics
    dataset.namespace: prod
    streams:
      - paths: /var/log/foo.log
        dataset.name: foo
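
The target index is composed from the three dataset values; a worked example for the config above (the composition pattern is my reading of elastic/package-registry#482):

# <dataset.type>-<dataset.name>-<dataset.namespace>
#   metrics - foo - prod  =>  metrics-foo-prod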

Removing the datasource part also makes getting started with manual configuration easier. The simplest config now looks as follows:

inputs:
  - type: logs
    streams:
      - paths: /var/log/foo.log

It is expected that the Ingest Manager in Kibana will still have a grouping of inputs available, but it will flatten that grouping before shipping the config to the Agent.
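
To sketch that flattening (the grouped Kibana-side structure below is an assumption, purely for illustration):

# Ingest Manager side, grouped
datasources:
  - namespace: prod
    inputs:
      - type: logs
        streams:
          - paths: /var/log/foo.log

# what the Agent receives, flattened
inputs:
  - type: logs
    namespace: prod
    streams:
      - paths: /var/log/foo.log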

@ruflin ruflin self-assigned this May 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@mostlyjason

mostlyjason commented May 27, 2020

@ruflin Thanks for working to simplify things! I think this is a step in the right direction. There are still a few things that are unclear to me. Perhaps it is just my lack of familiarity with how the config works, but it might illustrate the questions others will have when seeing it for the first time.

I see your example has type: system/metrics. Can you explain again what the purpose of the system prefix is? Also, I've seen other examples with type nginx/metrics. Is there a reason we don't break these into two separate keys?

I'm also wondering what the difference is between the input type and dataset.type? It'd be simpler if one implied the other. What's the use case for having a logs input produce metrics? How does this input type map to the child process that the input is applied to? Are you suggesting that the log input type is applied to Filebeat, which then produces metrics?

Lastly, in a separate discussion we talked about calling this "Data Kind". Would it make sense to align our naming here by making it dataset.kind?

@ph
Contributor

ph commented May 27, 2020

I see your example has type: system/metrics. Can you explain again what the purpose of the system prefix is? Also, I've seen other examples with type nginx/metrics. Is there a reason we don't break these into two separate keys?

The main reason was disambiguation and to try to hide implementation details of the running Beats. We could either make it two keys or have a map that correlates types to processes.

@EricDavisX
Contributor

I love the added simplicity.

@blakerouse
Contributor

Why use_output instead of just output? Given that an output must always be defined, either explicitly or implicitly (using default when not defined), the use_ prefix seems weird to me.

I also agree with @mostlyjason on the system/ prefix. What about shortening the type to just system, given that it only does metrics anyway? Following that logic, could it be possible to use just nginx and then define both metrics and logs inside the same input, running only whichever ones are defined?

Example:

inputs:
  - type: nginx
    streams:
      - type: logs
        paths: /var/log/nginx/access.log
      - type: metrics
        http: http://localhost/nginx_status

@ruflin
Member Author

ruflin commented May 28, 2020

There are two distinct things:

  • The input type to be run: this can be syslog, log, nginx/metrics, S3, system/metrics, redis (for Redis slow logs), etc.
  • The type/kind of data it produces: logs, metrics

One can't imply the other, as you can get either logs or metrics from files on S3.

At the input level we have configuration common to all the streams using the same input, for example hosts, username, and password, so these don't have to be specified for each stream:

inputs:
  - type: logs
    streams:
      - paths: /var/log/nginx/access.log
        dataset.name: nginx.access
      - paths: /var/log/nginx/error.log
        dataset.name: nginx.error
  - type: nginx/metrics
    hosts: https://localhost:8080
    username: foo
    password: bar
    streams:
      - metricset: stubstatus
      - metricset: hello-world
        period: 20s

The input.type references a unique input in one of the processes the agent runs. I'm not sure I follow what the benefit of two keys would be, as it cannot solve the problem of implying dataset.type from input.type.

@blakerouse For the use_ prefix I have no strong opinion, but I think the history here is that we wanted to make it clear that you can only reference global outputs and not define one locally. I suggest we take the use_output discussion to another issue.

@ph
Contributor

ph commented May 28, 2020

@blakerouse For the use_ prefix I have no strong opinion, but I think the history here is that we wanted to make it clear that you can only reference global outputs and not define one locally. I suggest we take the use_output discussion to another issue.

Yes, we did multiple iterations on the output definition. Initially, we even allowed users to create an output inside the input, but with the number of levels in the YAML this was not clear.

So we moved to use_output to clearly differentiate the two; we can create an issue to discuss it.

@blakerouse
Contributor

Yeah, let's move the use_output discussion to another issue (or if you already feel it's correct, feel free to ignore me ;-) ). It only caught my eye because of the _ in the key name, which is the only key written that way.

As for my example config, I am just coming in with a new set of eyes and have barely used the current configuration. I am completely missing all the background conversations on the current design, which is either a good thing or a bad thing (teaching the new guy) ;-).

Given that users of Fleet really don't care what the configuration looks like, the only users that care are those in standalone mode. To me it would be great if writing the configuration offered the same user experience as selecting integrations in Fleet. That is why my example places the different types in the streams and the integration name at the input level.

In fact, with my design you could argue that the top-level inputs key should be changed to integrations, more in the format of:

integrations:
  - integration: nginx
    inputs:
      - type: logs
        paths: /var/log/nginx/access.log
        dataset: access
      - type: logs
        paths: /var/log/nginx/error.log
        dataset: error
      - type: http
        http: http://localhost/nginx_status

I understand it's hidden by Metricbeat, but nginx/metrics is really just an HTTP request that knows how to process the data, which is why I was suggesting to highlight the input type as http.
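
For example, something like this is what I have in mind (hypothetical, since no plain http input exists in this form today):

inputs:
  - type: http
    http: http://localhost/nginx_status
    streams:
      - dataset: nginx.stubstatus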

I think with the current design what draws my eye is type: logs vs. type: nginx/metrics, as it just looks and feels inconsistent.

@ruflin
Member Author

ruflin commented May 28, 2020

Currently the config is aligned between the UI and the agent; that is one of the reasons we have datasources. By design, the agent does not know about integrations, only inputs.

The point you are getting at, that nginx/metrics is just an HTTP request, is correct and is our end goal. I hope that one day the config for nginx/metrics will actually use type: http instead, but we are not there yet.

@ruflin
Member Author

ruflin commented Jun 9, 2020

In the meantime I had a conversation with @blakerouse to clarify the above. One thing we discussed is that in some cases getting started with nginx in standalone mode will become harder than it is at the moment, and we talked through some ideas on how to improve this in the future if we want to. I'll leave this open for later.

While working on configs for autodiscovery, I also realised that not having the datasource part makes autodiscovery simpler (more on that in another issue).

To get this change in on time for the next release, I suggest we move forward on this. If anyone has concerns about the above, please raise them in the next 48h. I filed a meta issue, #19082, to track all the work needed to get us to the new config.

@ruflin
Member Author

ruflin commented Jun 23, 2020

Closing this issue as most of the work in #19082 is done. Further tracking happens there.

@ruflin ruflin closed this as completed Jun 23, 2020