[filebeat] Fix ingest pipeline overwriting module field values #33236

Merged

@crespocarlos crespocarlos commented Sep 30, 2022

What does this PR do?

This PR fixes a problem with the ingest pipeline not fully considering the fields included in the module configuration.

Notes

According to the Filebeat docs, users can add new fields to the output, but the docs don't mention anything about overwriting a log entry's existing field values.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  • Start a local Kibana and add the following logging config to kibana.yml:
logging:
  appenders:
    console:
      type: console
      layout:
        type: pattern
        highlight: true
    file:
      type: file
      fileName: ./logs/kibana-json.log
      layout:
        type: json
  root:
    appenders: [default, console, file]
    level: debug

  • Pull this branch and start Filebeat from source: https://github.com/elastic/kibana/blob/main/x-pack/plugins/monitoring/dev_docs/how_to/running_components_from_source.md#filebeat

  • In filebeat.yml, enable the Kibana module:
- module: kibana
  log:
    enabled: true
    var.paths:
      - PATH_TO_KIBANA_LOG
    input:
      fields:
        ecs.version: "9.0.0" # existing field
        log.level: "TEST" # existing field
        service.name: "Kibana" # new field
        cloud.availability_zone: "danger-zone" # new field
      fields_under_root: true

Note that if fields_under_root is omitted or false, these custom fields appear in the log under the fields key, for example fields.ecs.version and fields.service.name. When it is true, they are placed at the root of the event, where they can also overwrite existing log entry fields.
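To make the effect of fields_under_root concrete, here is a minimal Python sketch of where the custom fields land. It is a simplified model for illustration, not Filebeat's implementation: the helper names (expand_dotted, deep_update, add_custom_fields) and the sample event are assumptions.

```python
from copy import deepcopy


def expand_dotted(fields):
    """Turn {"a.b": 1} into {"a": {"b": 1}}."""
    out = {}
    for dotted, value in fields.items():
        node = out
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return out


def deep_update(target, source):
    """Recursively copy source into target; source wins on leaf conflicts."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_update(target[key], value)
        else:
            target[key] = deepcopy(value)
    return target


def add_custom_fields(event, fields, fields_under_root=False):
    """Hypothetical model of where an input's custom fields end up."""
    result = deepcopy(event)
    nested = expand_dotted(fields)
    if fields_under_root:
        # Custom fields go to the event root and may overwrite existing keys.
        deep_update(result, nested)
    else:
        # Custom fields are grouped under the "fields" key.
        deep_update(result.setdefault("fields", {}), nested)
    return result


# Hypothetical event that already carries a log.level value.
event = {"log": {"level": "INFO"}}
custom = {"log.level": "TEST", "service.name": "Kibana"}

print(add_custom_fields(event, custom))
print(add_custom_fields(event, custom, fields_under_root=True))
```

In the first call the event keeps log.level as INFO and gains fields.log.level and fields.service.name; in the second, log.level at the root becomes TEST and service.name is added at the root.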

image

Expected result: cloud.availability_zone and service.name are added to the ingested log; the existing service.* fields are not overwritten by service.name; and ecs.version and log.level keep their original values.
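When testing locally, the expected result can be checked mechanically once an ingested document and the matching raw log entry are at hand. The following Python sketch is one way to do that; get_path and check_ingested are hypothetical helpers, and the sample documents are made up for illustration.

```python
def get_path(doc, dotted):
    """Look up a dotted path such as "log.level" in a nested dict."""
    node = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_ingested(ingested, original, custom_fields):
    """Return a list of problems; an empty list means expectations hold."""
    problems = []
    for path, value in custom_fields.items():
        original_value = get_path(original, path)
        if original_value is not None:
            # The field already existed in the log entry: it must be kept.
            if get_path(ingested, path) != original_value:
                problems.append(f"{path} was overwritten")
        elif get_path(ingested, path) != value:
            # The field is new: it must show up with the configured value.
            problems.append(f"{path} was not added")
    return problems


custom_fields = {
    "ecs.version": "9.0.0",
    "log.level": "TEST",
    "service.name": "Kibana",
    "cloud.availability_zone": "danger-zone",
}
# Hypothetical raw Kibana log entry and the document ingested from it.
original = {"ecs": {"version": "8.0.0"}, "log": {"level": "INFO"}}
ingested = {
    "ecs": {"version": "8.0.0"},
    "log": {"level": "INFO"},
    "service": {"name": "Kibana"},
    "cloud": {"availability_zone": "danger-zone"},
}

print(check_ingested(ingested, original, custom_fields))
```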

Related issues

Closes #32665

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 30, 2022
mergify bot commented Sep 30, 2022

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @crespocarlos? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8.\d.0 is the label to automatically backport to the 8.\d branch. \d is the digit

@crespocarlos crespocarlos added bug Module:kibana Kibana Beats modules Team:Infra Monitoring UI - DEPRECATED Infrastructure Monitoring UI team - DEPRECATED - Use Team:Monitoring labels Sep 30, 2022
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 30, 2022
@crespocarlos crespocarlos added the backport-v8.4.0 Automated backport with mergify label Sep 30, 2022
@crespocarlos crespocarlos changed the title from "Drop support for non-ecs kibana logs" to "[filebeat] Fix ingest pipeline overwriting module custom field values and drop support for non-ecs logs" Sep 30, 2022
@crespocarlos crespocarlos marked this pull request as ready for review September 30, 2022 08:43
@crespocarlos crespocarlos requested a review from a team as a code owner September 30, 2022 08:43
@crespocarlos crespocarlos added the backport-v8.5.0 Automated backport with mergify label Sep 30, 2022
elasticmachine commented Sep 30, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-10-04T09:52:05.256+0000

  • Duration: 70 min 52 sec

Test stats 🧪

Test Results
Failed 0
Passed 6899
Skipped 737
Total 7636

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@crespocarlos crespocarlos marked this pull request as draft September 30, 2022 16:25
@matschaffer

it also drops support for non-ecs compliant logs

Did you see a way to avoid doing this? Not sure but I'm guessing if we drop support we can't consider this a minor change.

@crespocarlos

it also drops support for non-ecs compliant logs

Did you see a way to avoid doing this? Not sure but I'm guessing if we drop support we can't consider this a minor change.

Yeah. This makes sense. Better address that in another ticket.

@crespocarlos crespocarlos changed the title from "[filebeat] Fix ingest pipeline overwriting module custom field values and drop support for non-ecs logs" to "[filebeat] Fix ingest pipeline overwriting module field values" Oct 3, 2022
@@ -5,10 +5,20 @@ paths:
{{ end }}
exclude_files: [".gz$"]

json.keys_under_root: false
@crespocarlos crespocarlos Oct 4, 2022

When this is set to true, Filebeat overwrites all fields correctly, but it also replaces the log entry's @timestamp with Filebeat's own, which would make the ingested data inconsistent.

@crespocarlos crespocarlos marked this pull request as ready for review October 4, 2022 08:18
@crespocarlos crespocarlos marked this pull request as draft October 4, 2022 08:46
    inline: 'ctx.json.keySet().each (key -> ctx[key] = ctx.json.get(key))'
- remove:
    field: json
- rename:
@crespocarlos crespocarlos Oct 4, 2022

This is consistent with the integration packages. These lines are what make the fix work:

add_to_root: true
add_to_root_conflict_strategy: merge
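As a rough illustration of why the conflict strategy matters, the Python sketch below contrasts a top-level copy (the shape of the script shown earlier in the diff, `ctx[key] = ctx.json.get(key)`) with a recursive merge. It is a simplified model, not the Elasticsearch implementation. It assumes that on a leaf conflict the parsed log value wins, which matches the outcome described in this PR. The function names and sample documents are hypothetical.

```python
from copy import deepcopy


def add_to_root_replace(doc, parsed):
    """Copy each top-level key of the parsed log entry over the document."""
    result = deepcopy(doc)
    for key, value in parsed.items():
        # A whole object such as "service" is replaced in one go.
        result[key] = deepcopy(value)
    return result


def add_to_root_merge(doc, parsed):
    """Recursively merge the parsed log entry into the document.

    Objects are merged key by key; on a leaf conflict the parsed value wins
    (an assumption of this sketch).
    """
    result = deepcopy(doc)
    for key, value in parsed.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = add_to_root_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


# Document reaching the pipeline, with the custom fields already at the root.
doc = {
    "ecs": {"version": "9.0.0"},
    "log": {"level": "TEST"},
    "service": {"name": "Kibana"},
    "cloud": {"availability_zone": "danger-zone"},
}
# Hypothetical parsed Kibana log entry.
parsed = {
    "ecs": {"version": "8.0.0"},
    "log": {"level": "INFO"},
    "service": {"node": {"roles": ["ui"]}},
}

replaced = add_to_root_replace(doc, parsed)
merged = add_to_root_merge(doc, parsed)
print(replaced["service"])
print(merged["service"])
```

In this model both strategies keep the log entry's ecs.version and log.level, but only the merge keeps the custom service.name next to the service.* fields that come from the log.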

@crespocarlos crespocarlos marked this pull request as ready for review October 4, 2022 10:06
@matschaffer matschaffer left a comment

LGTM, will approve after some exploratory testing.

I was wondering if maybe there's a way to override the configuration used in testing so we could inject some override fields. This would help protect against future regression. But if there's nothing we can easily use, I think it'd be beyond the scope of this PR to add it.

@matschaffer matschaffer left a comment

Cool. I was able to see the new fields. I was intuitively expecting the fields to override things originally found in the logs, but it doesn't look like that's what's being asked for in the original issue.

@crespocarlos

Cool. I was able to see the new fields. I was intuitively expecting the fields to override things originally found in the logs, but it doesn't look like that's what's being asked for in the original issue.

That is possible using the decode_json_fields processor. It's the easiest way to both add new fields and override original values from the logs. But since the docs don't mention anything about the latter, I decided not to enable this behaviour.

@crespocarlos crespocarlos merged commit 4b4bfc4 into elastic:main Oct 5, 2022
mergify bot pushed a commit that referenced this pull request Oct 5, 2022
* Fix ingest pipeline, allowing field value override

* Fix ecs and non-ecs pipelines

* Fix pipeline description

* Revert all changes on pipeline.yml

* Allow only adding fields to the output; revert possibility of overwritting existing log entry field values

(cherry picked from commit 4b4bfc4)
mergify bot pushed a commit that referenced this pull request Oct 5, 2022
* Fix ingest pipeline, allowing field value override

* Fix ecs and non-ecs pipelines

* Fix pipeline description

* Revert all changes on pipeline.yml

* Allow only adding fields to the output; revert possibility of overwritting existing log entry field values

(cherry picked from commit 4b4bfc4)
crespocarlos added a commit that referenced this pull request Oct 10, 2022
… (#33256)

* Fix ingest pipeline, allowing field value override

* Fix ecs and non-ecs pipelines

* Fix pipeline description

* Revert all changes on pipeline.yml

* Allow only adding fields to the output; revert possibility of overwritting existing log entry field values

(cherry picked from commit 4b4bfc4)

Co-authored-by: Carlos Crespo <crespocarlos@users.noreply.github.com>
crespocarlos added a commit that referenced this pull request Oct 10, 2022
… (#33255)

* Fix ingest pipeline, allowing field value override

* Fix ecs and non-ecs pipelines

* Fix pipeline description

* Revert all changes on pipeline.yml

* Allow only adding fields to the output; revert possibility of overwritting existing log entry field values

(cherry picked from commit 4b4bfc4)

Co-authored-by: Carlos Crespo <crespocarlos@users.noreply.github.com>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* Fix ingest pipeline, allowing field value override

* Fix ecs and non-ecs pipelines

* Fix pipeline description

* Revert all changes on pipeline.yml

* Allow only adding fields to the output; revert possibility of overwritting existing log entry field values
Labels
backport-v8.4.0 Automated backport with mergify backport-v8.5.0 Automated backport with mergify bug Module:kibana Kibana Beats modules Team:Infra Monitoring UI - DEPRECATED Infrastructure Monitoring UI team - DEPRECATED - Use Team:Monitoring
Development

Successfully merging this pull request may close these issues.

Kibana ingest pipeline overwrites root fields set in Filebeat config
3 participants