Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLI flags for Kafka batching params #2047

Merged
merged 2 commits into from
Jan 31, 2020

Conversation

apm-opentt
Copy link
Contributor

@apm-opentt apm-opentt commented Jan 28, 2020

Resolves #2037
The default Samara Kafka client writes to Kafka broker for every single span received.
This is not efficient and caused high CPU utilization on the brokers as well as high network traffic.
Samara Kafka client has options to enable batching. These options needed to be exposed
to the command for performance tuning. The option names matches official Kafka documentation and
not specific to Samara.

Signed-off-by: albert chung alchung@us.ibm.com

Which problem is this PR solving?

Short description of the changes

  • Added 2 new command line parameters for Kafka producer

@codecov
Copy link

codecov bot commented Jan 28, 2020

Codecov Report

Merging #2047 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2047      +/-   ##
==========================================
- Coverage   97.41%   97.37%   -0.04%     
==========================================
  Files         207      207              
  Lines       10304    10304              
==========================================
- Hits        10038    10034       -4     
- Misses        223      225       +2     
- Partials       43       45       +2
Impacted Files Coverage Δ
cmd/agent/app/processors/thrift_processor.go 100% <ø> (ø) ⬆️
cmd/query/app/server.go 91.66% <0%> (-2.78%) ⬇️
cmd/query/app/static_handler.go 86.84% <0%> (-1.76%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e3b2e39...e85f3d2. Read the comment docs.

pkg/kafka/producer/config.go Outdated Show resolved Hide resolved
plugin/storage/kafka/options.go Outdated Show resolved Hide resolved
plugin/storage/kafka/options.go Outdated Show resolved Hide resolved
plugin/storage/kafka/options.go Outdated Show resolved Hide resolved
plugin/storage/kafka/options.go Outdated Show resolved Hide resolved
@yurishkuro yurishkuro changed the title ADD NEW KAFKA PARAM TO COMMANDLINE Add cli flags for Kafka batching params Jan 28, 2020
@vprithvi
Copy link
Contributor

I found this post to be a good read on the topic which contains experimental results on the effects of changing these parameters. (For instance, at high throughput they find that batch sizes up to 512KB increase total throughput)

apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: albert <alchung@us.ibm.com>
apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: albert <alchung@us.ibm.com>
apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: albert chung <alchung@us.ibm.com>
Signed-off-by: albert <alchung@us.ibm.com>
apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: albert <alchung@us.ibm.com>
Signed-off-by: albert chung <alchung@us.ibm.com>
apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: albert <alchung@us.ibm.com>
Signed-off-by: albert chung <alchung@us.ibm.com>
Signed-off-by: Albert Chung <alchung@us.ibm.com>
apm-opentt added a commit to apm-opentt/jaeger that referenced this pull request Jan 28, 2020
1. Setting proper defaults
2. Rename cli option name
3. Change cli option type

Signed-off-by: Albert Chung <alchung@us.ibm.com>
@yurishkuro
Copy link
Member

From my reading/grepping sarama code, I don't see Flush.MaxMessages, Flush.Bytes, or Flush.Frequency to be ever assigned any default values, which means they are 0 by default. And when they are zero, produceSet.readyToFlush() has this case:

	// If all three config values are 0, we always flush as-fast-as-possible
	case ps.parent.conf.Producer.Flush.Frequency == 0 && ps.parent.conf.Producer.Flush.Bytes == 0 && ps.parent.conf.Producer.Flush.Messages == 0:
		return true

So unless @vprithvi has some other thoughts, I think we have no batching by default, which means we can have 0s as default values for the new CLI flags to be backwards compatible.

@yurishkuro yurishkuro changed the title Add cli flags for Kafka batching params Add CLI flags for Kafka batching params Jan 29, 2020
@apm-opentt
Copy link
Contributor Author

For some reason DCO check is red (x). I do have a signed-off-by in the commit. Can someone help?

Address code review issues

1. Set batch properties default to 0. ie. no default batching
2. Prefix batch properties with batch prefix
3. Added batch max messages for completeness
Resolves #2037

Signed-off-by: Albert Chung <alchung@us.ibm.com>

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Wed Jan 29 16:24:38 2020 -0500
#
# On branch master
# Your branch is up to date with 'origin/master'.
#
# Changes to be committed:
#       modified:   pkg/kafka/producer/config.go
#       modified:   plugin/storage/kafka/options.go
#       modified:   plugin/storage/kafka/options_test.go
#

@apm-opentt apm-opentt closed this Jan 29, 2020
@yurishkuro
Copy link
Member

you need to ensure that each commit is signed. If you missed some, the easiest fix is to git reset --soft to a hash before your changes and push as a single signed commit.

@apm-opentt
Copy link
Contributor Author

@yurishkuro For some reason I am still having the DCO not passing. I did a git reset --soft then commit then push but it didn't work. Can you help provide the detail steps? Thanks!

@yurishkuro
Copy link
Member

It complains about this commit: 04da10c

You need to find the commit hash in the upstream master from which you started your work. In the future I recommend using a branch in your fork, rather than master directly. Once you find the appropriate hash N, you run git reset --soft N. After this all your changes will be staged on top of an official commit, so git commit -sm 'Add CLI flags for Kafka batching params'.

Signed-off-by: Albert Chung <alchung@us.ibm.com>
@yurishkuro yurishkuro merged commit 243b56f into jaegertracing:master Jan 31, 2020
@yurishkuro
Copy link
Member

Thanks!

bhiravabhatla added a commit to bhiravabhatla/jaeger that referenced this pull request Jan 25, 2021
…lasticsearch

Signed-off-by: santosh <bsantosh@thoughtworks.com>
bhiravabhatla added a commit to bhiravabhatla/jaeger that referenced this pull request Jan 25, 2021
…lasticsearch

Signed-off-by: santosh <bsantosh@thoughtworks.com>
bhiravabhatla added a commit to bhiravabhatla/jaeger that referenced this pull request Jan 25, 2021
…lateBuilder to GetMappingAsString

Signed-off-by: santosh <bsantosh@thoughtworks.com>
bhiravabhatla added a commit to bhiravabhatla/jaeger that referenced this pull request Jan 25, 2021
Signed-off-by: santosh <bsantosh@thoughtworks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose Samara Kafka producer/consumer configurations to command line.
4 participants