[libbeat] Allow per beat.Client control of event normalization #33657

@andrewkroh andrewkroh commented Nov 13, 2022

What does this PR do?

Control over the addition of the "generalizeEvent" processor to the publishing pipeline was previously available only at the Beat level. This PR adds a new option that inputs can set when they create their beat.Client.

This allows inputs to override the Beat's default behavior. My expected use case is to disable event normalization for inputs that are known to produce only beat.Events containing the standard data types expected by the processors and outputs (i.e. map[string]interface{} containing primitives, slices, or other map[string]interface{}).

Inputs would want to disable the event normalization processor if they can because it adds unnecessary processing (recurses over the fields and often allocates).
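
To make the cost concrete, here is a minimal sketch of the kind of recursive pass a normalization step performs. It is not libbeat's implementation: the `normalize` function and the conversions it applies are hypothetical and only illustrate why such a pass recurses over every field and allocates new maps and slices.

```go
package main

import "fmt"

// normalize is a hypothetical stand-in for an event normalization pass.
// It walks the value recursively and returns a copy built only from
// map[string]interface{}, []interface{}, and primitive values.
func normalize(v interface{}) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		out := make(map[string]interface{}, len(t)) // one allocation per map
		for k, val := range t {
			out[k] = normalize(val)
		}
		return out
	case []interface{}:
		out := make([]interface{}, len(t)) // one allocation per slice
		for i, val := range t {
			out[i] = normalize(val)
		}
		return out
	case []string:
		// Typed slices are converted to the generic slice form.
		out := make([]interface{}, len(t))
		for i, s := range t {
			out[i] = s
		}
		return out
	case map[string]string:
		// Typed maps are converted to the generic map form.
		out := make(map[string]interface{}, len(t))
		for k, s := range t {
			out[k] = s
		}
		return out
	default:
		// Primitives (and anything this sketch does not know) pass through.
		return t
	}
}

func main() {
	event := map[string]interface{}{
		"message": "hello",
		"tags":    []string{"a", "b"},
		"labels":  map[string]string{"env": "prod"},
	}
	out := normalize(event).(map[string]interface{})
	fmt.Printf("%T %T\n", out["tags"], out["labels"])
}
```

An input whose events already have the generic shape gains nothing from this walk, which is the case the new option targets.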

An example usage:

	// Create client for publishing events and receive notification of their ACKs.
	client, err := pipeline.ConnectWith(beat.ClientConfig{
		CloseRef:   inputContext.Cancelation,
		ACKHandler: awscommon.NewEventACKHandler(),
		Processing: beat.ProcessingConfig{
			EventNormalization: boolPtr(false),
		},
	})
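
The example relies on a `boolPtr` helper and on a pointer-typed field, so that "not set" can be told apart from an explicit `false`. The following is a self-contained sketch of that pattern. The `processingConfig` type and the `normalizationEnabled` function are simplified, hypothetical stand-ins, and the resolution rule (an explicit per-client value wins, otherwise the Beat default applies) is an assumption based on the description above, not a copy of the libbeat code.

```go
package main

import "fmt"

// processingConfig is a simplified, hypothetical stand-in for
// beat.ProcessingConfig that keeps only the field used in the example.
type processingConfig struct {
	// EventNormalization is nil when the input expresses no preference.
	EventNormalization *bool
}

// boolPtr returns a pointer to b so a literal can be assigned to a *bool field.
func boolPtr(b bool) *bool { return &b }

// normalizationEnabled resolves the effective setting (assumed rule):
// an explicit per-client value overrides the Beat-level default.
func normalizationEnabled(beatDefault bool, cfg processingConfig) bool {
	if cfg.EventNormalization != nil {
		return *cfg.EventNormalization
	}
	return beatDefault
}

func main() {
	// No preference: the Beat default applies.
	fmt.Println(normalizationEnabled(true, processingConfig{}))
	// Explicit override: the input disables normalization for its client.
	fmt.Println(normalizationEnabled(true, processingConfig{EventNormalization: boolPtr(false)}))
}
```

A plain `bool` field could not express the "no preference" case, because its zero value is indistinguishable from an explicit `false`.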

Why is it important?

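It allows input authors to skip the normalization pass, and the recursion and allocations it incurs, when their events already contain only the standard data types.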

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Screenshots

[Screenshot: Screen Shot 2022-11-13 at 11 18 45]

@andrewkroh andrewkroh added the enhancement, libbeat, and backport-skip labels Nov 13, 2022
@botelastic botelastic bot added the needs_team label Nov 13, 2022
@andrewkroh andrewkroh force-pushed the feature/libbeat/event-normalization-override branch from ccc2fcb to 118bc11 Compare November 13, 2022 16:02
elasticmachine commented Nov 13, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-11-14T19:53:14.775+0000

  • Duration: 77 min 7 sec

Test stats 🧪

Test Results
  • Failed: 0
  • Passed: 23781
  • Skipped: 1951
  • Total: 25732

💚 Flaky test report

Tests succeeded.

@andrewkroh andrewkroh marked this pull request as ready for review November 13, 2022 20:21
@andrewkroh andrewkroh requested a review from a team as a code owner November 13, 2022 20:21
@andrewkroh andrewkroh requested review from belimawr and faec and removed request for a team November 13, 2022 20:21
@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team label Nov 13, 2022
Review thread on libbeat/publisher/processing/default_test.go (outdated, resolved)
@andrewkroh andrewkroh merged commit 00f57f3 into elastic:main Nov 14, 2022
andrewkroh added a commit to andrewkroh/beats that referenced this pull request Nov 14, 2022
Disable event normalization for the aws-s3 input to reduce allocations when processing events.
The input only produces basic types in its events. Either it puts a string into the `message` field
or it decodes JSON into a map[string]interface{} with encoding/json. Both of those should be fine
for the downstream processors and outputs.

Relates elastic#33657
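
The claim that JSON decoded with encoding/json contains only basic types can be checked mechanically. The sketch below is illustrative and is not taken from the aws-s3 input: `decode` and `onlyBasicTypes` are hypothetical helpers. It relies on the documented behavior that `json.Unmarshal` into an interface value yields only `nil`, `bool`, `float64`, `string`, `[]interface{}`, and `map[string]interface{}`. A decoder configured with `UseNumber` would instead produce `json.Number` values, which this check would reject.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// onlyBasicTypes reports whether v consists solely of the types that
// encoding/json produces when decoding into interface{} values.
func onlyBasicTypes(v interface{}) bool {
	switch t := v.(type) {
	case nil, bool, float64, string:
		return true
	case []interface{}:
		for _, e := range t {
			if !onlyBasicTypes(e) {
				return false
			}
		}
		return true
	case map[string]interface{}:
		for _, e := range t {
			if !onlyBasicTypes(e) {
				return false
			}
		}
		return true
	default:
		return false
	}
}

// decode unmarshals a JSON object into a generic map (hypothetical helper).
func decode(data []byte) (map[string]interface{}, error) {
	var fields map[string]interface{}
	if err := json.Unmarshal(data, &fields); err != nil {
		return nil, fmt.Errorf("decoding event: %w", err)
	}
	return fields, nil
}

func main() {
	fields, err := decode([]byte(`{"a": 1, "b": [true, null, "x"], "c": {"d": 2.5}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(onlyBasicTypes(fields))
}
```

A check like this is better suited to a test than to the hot path, since running it per event would reintroduce the traversal that disabling normalization avoids.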
andrewkroh added a commit that referenced this pull request Nov 15, 2022
Disable event normalization for the aws-s3 input to reduce allocations when processing events.
The input only produces basic types in its events. Either it puts a string into the `message` field
or it decodes JSON into a map[string]interface{} with encoding/json. Both of those should be fine
for the downstream processors and outputs.

Relates #33657
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
Control over the addition of the "generalizeEvent" processor to the publishing pipeline was
previously available only at the Beat level. This adds a new option that inputs can set when they
create their beat.Client.

This allows inputs to override the Beat's default behavior. My expected use case is to disable
event normalization for inputs that are known to produce only beat.Events containing the
standard data types expected by the processors and outputs (i.e. map[string]interface{}
containing primitives, slices, or other map[string]interface{}).

Inputs would want to disable the event normalization processor if they can because it adds
unnecessary processing (recurses over the fields and often allocates).

* lint / misspell - fix spelling
* lint / unused - remove `drop` field
* lint / errorlint - wrap error in fmt.Errorf
* lint / errcheck - add missing checks
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
Disable event normalization for the aws-s3 input to reduce allocations when processing events.
The input only produces basic types in its events. Either it puts a string into the `message` field
or it decodes JSON into a map[string]interface{} with encoding/json. Both of those should be fine
for the downstream processors and outputs.

Relates #33657
Labels
backport-skip, enhancement, libbeat, Team:Elastic-Agent-Data-Plane