Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce casting between []byte and string in CRI log processing #8424

Merged
merged 3 commits into from
Oct 23, 2018

Conversation

jareksm
Copy link
Contributor

@jareksm jareksm commented Sep 25, 2018

  • Reduce casting from []byte to string and []byte again.
  • If criflags are set to true, default to CRI log processing.
  • Refactor CRI and json-file detection.

@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening this, it's looking good 🎉

Do you have any benchmarks? I'm wondering about the performance improvement

l := len(message.Content)
if l > 0 && message.Content[l] == '\n' || message.Content[l] == '\r' {
message.Content = message.Content[:l-1]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition is not doing the same thing as TrimRightFunc, as it would not trim \r\n

Copy link
Contributor Author

@jareksm jareksm Sep 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on a 100% linux systems with no \r. Is that for windows? In this case the more optimal solution would be to determine if filebeat is running on windows or POSIX system and then trim the line correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was more defensive code, I guess we can live with this and see if the problem is there for other implementations

@@ -144,7 +144,7 @@ func (p *DockerJSONReader) parseDockerJSONLog(message *reader.Message, msg *logL
}

func (p *DockerJSONReader) parseLine(message *reader.Message, msg *logLine) error {
if strings.HasPrefix(string(message.Content), "{") {
if !p.criflags || len(message.Content) > 0 && message.Content[0] == '{' {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

criflags is used to enable parsing CRI flags within the CRI format, mainly intended for backwards compatibility, but this can still be CRI with it set to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption here is, if criflags are enabled, we are dealing with CRI log. Otherwise check the first byte if it is '{' to determine the log format. Although a much better solution would be to detect log format at the start of the file and process it as either json-file or CRI log without doing this test for every line in the log.

As for your initial question, I don't have any proper benchmarks comparing old code to new code. I do have micro benchmarks comparing specific changes. I didn't benchmark []byte to string and then to []byte casting, since I'm dealing with very large volume of multi-part messages, that's 2 extra memory allocations per every entry and it adds up very quickly. It saves on memory and GC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to avoid overloading this parameter and use something different to force the expected format. I plan to enable CRI flags parsing in our default manifests, as most users will benefit from them, but it should still work with docker logs.

I suggest we add a new force_format: docker-json | cri parameter. It can also be done in a following PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot better option then making assumptions. I will update this code in the new few days.

Copy link
Contributor Author

@jareksm jareksm Oct 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@exekias After thinking this through, I've kept the current way of auto-detecting. But I've added force_cri_logs option to force CRI log parsing. Also I've moved new line removal to OS specific files, as such the correct function will be chosen at compile time.

@jareksm jareksm force-pushed the master branch 2 times, most recently from 303e346 to a1cb2cd Compare October 20, 2018 15:54
* Move new line removal to OS specific functions.
* Add option to force CRI log parsing, otherwise autodetect log type.
* Fixed new line removal
* Added unit tests to check Forced CRI flag
Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for updating this, this is almost ready to go 🎉

Can you please add a CHANGELOG.asciidoc entry?, also add the new parameter to filebeat/docs/inputs/input-docker.asciidoc

@exekias
Copy link
Contributor

exekias commented Oct 21, 2018

jenkins, test this please

@exekias exekias added Filebeat Filebeat needs_backport PR is waiting to be backported to other branches. containers Related to containers use case v6.5.0 labels Oct 21, 2018
@exekias
Copy link
Contributor

exekias commented Oct 23, 2018

I'll do it in my side, as I want to push this to 6.5 👍

@exekias exekias merged commit 3f7d6a6 into elastic:master Oct 23, 2018
@exekias
Copy link
Contributor

exekias commented Oct 23, 2018

Thank you so much for your contribution @jareksm 🎉 🌮 ! Keep them coming 😄

exekias pushed a commit to exekias/beats that referenced this pull request Oct 23, 2018
exekias pushed a commit to exekias/beats that referenced this pull request Oct 23, 2018
…ic#8424)

* Reduce casting between []byte and string in CRI log processing

* Refactor CRI and json-file detection

* Improve handling of new lines in linux/windows CRI logs.

* Move new line removal to OS specific functions.
* Add option to force CRI log parsing, otherwise autodetect log type.

* Fixed unit tests and new line removal

* Fixed new line removal
* Added unit tests to check Forced CRI flag

(cherry picked from commit 3f7d6a6)
@exekias exekias removed the needs_backport PR is waiting to be backported to other branches. label Oct 23, 2018
exekias added a commit that referenced this pull request Oct 24, 2018
* Fix build for all OS after #8424

* Apply PR comments
exekias added a commit to exekias/beats that referenced this pull request Oct 24, 2018
* Fix build for all OS after elastic#8424

* Apply PR comments
@exekias exekias removed the v6.5.0 label Oct 24, 2018
exekias added a commit that referenced this pull request Oct 26, 2018
… CRI log processing (#8718)

* Reduce casting between []byte and string in CRI log processing (#8424)

* Reduce casting between []byte and string in CRI log processing

* Refactor CRI and json-file detection

* Improve handling of new lines in linux/windows CRI logs.

* Move new line removal to OS specific functions.
* Add option to force CRI log parsing, otherwise autodetect log type.

* Fixed unit tests and new line removal

* Fixed new line removal
* Added unit tests to check Forced CRI flag

(cherry picked from commit 3f7d6a6)

* Add CHANGELOG and docs for CRI force flag (#8699)

(cherry picked from commit 06ec3b9)

* Fix conflicting paths

* Fix build for all OS after #8424 (#8717)

* Fix build for all OS after #8424

* Apply PR comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
containers Related to containers use case Filebeat Filebeat
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants