Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Cassandra reconnect interval #934

Merged

Conversation

nyanshak
Copy link
Contributor

Which problem is this PR solving?

Short description of the changes

  • Adds support for configuring gocql's ReconnectInterval setting. This will make gocql attempt to reconnect to downed Cassandra nodes every interval.

* Fix jaegertracing#767 by enabling gocql setting `ReconnectInterval` to reconnect to
down Cassandra hosts at a regular interval.

Signed-off-by: Brendan Shaklovitz <nyanshak@users.noreply.github.com>
@@ -7,6 +7,6 @@ import (
)

func init() {
data := "PK\x03\x04\x14\x00\x08\x00\x08\x00\xe7\x88\x83J\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00 \x00index.htmlUT\x05\x00\x01B\x81\xe2X\xd4W_s\xdb\xb8\x11\x7f\xd7\xa7\xd8\xc1\xa5%U\x99\xa4d\xd9\xb1#\x8b\xea\xb8V\xceq\xae9\xa7\xb2\x93\x9b\xf4&\x0f \xb0\"!\x93\x00\x03\x80\xb2u\x1e\x7f\xf7\x0eHJ\x96\xec\xb8s\x8f\x17\xceH\x02v\x17\xbb\xfb\xdb?Xj\x9c\xd9\"\x9ft\x00\xc6\x05Z\n,\xa3\xda\xa0\x8d\xc9\xc5\xd5ep||\xf8&\x18\x90G\xae\xa4\x05\xc6d)\xf0\xb6T\xda\x12`JZ\x946&\xb7\x82\xdb,\xe6\xb8\x14\x0c\x83z\xb3\x07B\n+h\x1e\x18Fs\x8c\x07a\x7f\x0f\nz'\x8a\xaa\xd8&U\x06u\xbd\xa7I\x8eq\xbf1\x96!\xe5n\x010\xb6\xc2\xe68y\xa7\xec\xecr\n\x01\xcc\x04G\x03\x97\x12\xa6XP\xc9\xc7Q\xc3od\x0d\xd3\xa2\xb4`4\x8bIfmiFQ\xc4\x14\xc7p\xf1\xadB\xbd\n\x99*\xa2f\x19\x0c\xc3A8\x08\x0b!\xc3\x85!\x93q\xd4\x1cm\xf5\xe4B\xde\x80\xc6<&\xc6\xaer4\x19\xa2%\x90i\x9c?\xea-\xe8\x1d\xe32L\x94\xb2\xc6jZ\xba\x8d\xd3\xbf!D\xc3p\x18\x1eE\xcc\x98GZm\x90\x19C@H\x8b\xa9\x16v\x15\x13\x93\xd1\xe1\xf1A\xf0\xaf\xcf_\x84\xb8\xba\xf8\x19\x7f\x19\xf0\xf3\xe2\xfd\xec\xf4f\xc5\xaaw\xa7\xeff\xe9p\xff\xb2\xf8\xc4no\x8f\x94\x1c\xce\xbe\xf0\xf4\xe03\xed},\xae\xae\xcd\x1f\xd1/\xaf\x8f\x97 \x7f\xbb\xc8\x0e*\x02L+c\x94\x16\xa9\x901\xa1R\xc9U\xa1*C\xfeOp\xfe,\x88\xc5S\x0c\x8b\xefB\xb8f\x87\x17\xff\x11I\x7f\xff\xe8\xdbr\xb5\xb8\xfa0\x7f\xb7\xb8\xfc@\xff}3\xaf~\xfb|\xf7\xdf\xbbO\x1f\xe5\xd9\xfb\xd3\xa3|\xbf8\xfb\xed\xd7\x8b\xf2\xfcMq~6=\xbe=\xff\xf5\x82}\x9c\x1e]\xdf\xd1\x97!<&\xa8\x05\xe3\xf22\xe9\x84U%8\xdcCAu*d`U9\x82\xc1ayw\x02\x0f\x9d0SV+\x1e$\x95\xb5J\xc2=\x94\x94s!\xd3\x11\xec\xf7\x9d\x04\xab\xb4Qz\x04\xa5r@\xf4\xc9\xae\x92\xfe\xf7\x94\x8c2\xb5D\x0d\xf7\xcf\xcf\xceEnQ\x8f \xd1\"\xcd\xacDc\xfc\xe3\xc3\xbfu\x9d\x8a\x9fZ\x15\xb9J_\xf0\xf4'+\xca\x17X5\xd8\xa8E\xeb:#Z\xb7\xc68Q|\xd5\xa6\x96\x8b%\xb0\x9c\x1a\x13\x13\xd7\x91TH\xd4m\xdaw\xb9u\xb8h\x8e\xda6\xdf\x81\x90s\xe5\xa2\xcb\xc5r#\xcf\xd0aZo]7\x0e\\\xff\xc1,\xbc\x0c\xa7\xe18\xca\x06\xdb\xbc\x83\xc9\x18\x8b\xc9\xb3\xb6\xc4b2\x8e\xb2\x83-\xc9-7\xb4\xba%\x8f\x9c\xe7\x10\xf2\xa0\xe0\xc1\x10\xdc\xc2\x14\xc1\xeb'\xb2\xb5\xbc)\xa9|FuO\xab$\xb1\x12\x12+k\x80\xf5\"\xc9\x15\xbb\x81\x9dt\x92\xef*\xe0\xd4\xd2\x80U\xc6\xaa\x02uL\x06\xfbC2\x99Q\x96a\xee\x19\xf89W\x9a\xe60E#Ri\xc6\x91s\xe3 \x92\xedX\xb6\xa4\xbf,\xb8\xe1\x9b}2\xb9\xd6\xaa\x80\xb3L1\x95S+P\xff\xf0\xa8\x8e\x86\x032yOK*\xd1\xa0\xcb\x15j\xfb\xe3\xe7\xea\xf0\xf5\x11\x99\x9c\x16\xf4\x0f!S8S\xf39\"\xcc\x145\x16\xf5\x9f\x01\xf7t\xebp\n\x1e\x13+J29\xcb\x05\xbb\x01%am\xae\x9e\xf4@\x13\xb5D\xb0\n\x94\xe6\xa8\x81\x02\xa3:|I\xd1\xe35G\xd6\xd8s\xa4\xfc\xe9\xed\x12m_/\x1b\xd68j\xae\xb3\xcefPM:\x9dy%\x99\x15J\xc2\\\xe9\x82\xdai\xa5\xa9\xdb\xfa\xbc]t\xe1\xbe\x03\xb0\xa4\x1a8\xc4\xb0\xa6B\x04\xfe\xa0_?\xf0\x0f\x184?\xaf\xfb\xdd\x93V\xb6\x92\xc2\x1a\x88\xc1+\x84\xf4\x1cQ\xa3\xad\xb4\x84\x0f\xd4f\xa1V\x95\xe4>\xefB\xaf\x91;\xe9<t:\xee\x14\xcb\x05J\xfb\xe9\xd3\xc5\x14\xe2m\xd1fI%W\x85\xdfm\xed9[\xeeLN\x8d\x9d\xe1\xb7\n\x8d\xad\x8f\xf5O:\x9dW>\xa9\x87\x16\xe9\x86\xee\xc5\xcb'_T\xa5\xe1\x16\x93\xd6\x82g@\xf0\x91\x1bpZ\xc9tB\xa0\xb7m\xba\x07\xc4M\x83\x86\xd5m\xd5\xed\x96R7d.\x99\xfe:x>.m\x13\xa8\x1dwz\xbdu<\xf4\x96\x87\xbb\xa6\x02g}\xe7\xd4\xfa\xcc\\\xa3\xc9\xce\xa8\x86\x18^\xf9\xaf|\xb25\xe3H7,5\x96(\xb9\xefm7S}$`T\x93z`L\x85)\xa9e\x99+\xe6\xa6\xae\xc2\xf0w\x8d\xdfF\xe0\xf56\x1e\xf5\xbc\xaf\xed$qe\xe2uC\x96\x89\x9ck\x94~\xf7\xf7\xfe\xd7MF7E\x1b\x03.mh\xa9N\xd1\x86\xae}\x0c\xdap\xcdu\xd2nx\xa2v\xd9\xbfo\xeb\xd1[PLQ\x07 MS\x9a\xa27\x02\xcf\xa01B\xc9\xd8{\x1a{oo\x1d\xac\x9a\xb7q\xb3\x03\xf0\xe0\xb43%\x8d\xca1\xccU\xea\xb7\x966>&8W\x1a!\x86)\xb5\x18Ju\xeb\xd7\xacW!]\xd0;\xdf\x8bx\x1b\x8f\x7fn\x1a\xbe6\xbfF\xd6\x03\xef\xefRI\x835y\xa7\xe8\xf6Z,\xad\xc5\xd1z\xb1WS\x0b\xb4\x99\xe2#\xf0\xce\xdf^{\x0d\xc9T\x8c\xa11#\xd8\x94\x88\x0b\xd5\x1eX\xbc\xb3W\x96\xda\xcat7\xe1q\xae\xd3\xb9\xadc\xbb\xeb\xb9{\xb6\x01;\x1d\x1bF\xdd\x93\xeb~\x8c\x9f\xf5/\xb54|{}\xba\x11_\x97S\xd3\x12^\xfb\xb2?N&\x0el-=\xd5b\xd9\x84a\x1c%\x13\xa0Z\x8b\xa5+\x1d!\xa1\x96Y\xdb\xea\x81\x07m\x19mg\xa8I^N-J\xb6jx~\x0d+h\xf2\xe2\xfa\xdd+\xccW\xafu\xe9\xc1E\xea\xa1{\xd2q\x9f\xfaz\xda\\J\xe3\xa8\xf9\xc3\xf4\xbf\x00\x00\x00\xff\xffPK\x07\x08\xb4\x93_{g\x05\x00\x008\x0d\x00\x00PK\x01\x02\x14\x03\x14\x00\x08\x00\x08\x00\xe7\x88\x83J\xb4\x93_{g\x05\x00\x008\x0d\x00\x00\n\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\xa4\x81\x00\x00\x00\x00index.htmlUT\x05\x00\x01B\x81\xe2XPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00A\x00\x00\x00\xa8\x05\x00\x00\x00\x00"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why is this changing? Can you exclude this change from the PR?

flagSet.Duration(
nsConfig.namespace+suffixReconnectInterval,
nsConfig.ReconnectInterval,
"Reconnect interval to retry connecting to downed hosts")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: what would be the behavior if this is set to 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current (master) jaeger behavior is equivalent to setting to 0. If hosts are marked as down, gocql will not try to reconnect without this set.

@nyanshak nyanshak force-pushed the cassandra-reconnect-interval branch from f903859 to d8e6faa Compare July 13, 2018 16:59
@codecov
Copy link

codecov bot commented Jul 13, 2018

Codecov Report

Merging #934 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master   #934   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files         126    126           
  Lines        6074   6080    +6     
=====================================
+ Hits         6074   6080    +6
Impacted Files Coverage Δ
plugin/storage/cassandra/options.go 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a0dc40e...67352af. Read the comment docs.

@@ -204,6 +210,7 @@ func (cfg *namespaceConfig) initFromViper(v *viper.Viper) {
cfg.ConnectionsPerHost = v.GetInt(cfg.namespace + suffixConnPerHost)
cfg.MaxRetryAttempts = v.GetInt(cfg.namespace + suffixMaxRetryAttempts)
cfg.Timeout = v.GetDuration(cfg.namespace + suffixTimeout)
cfg.ReconnectInterval = v.GetDuration(cfg.namespace + suffixReconnectInterval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@yurishkuro
Copy link
Member

looks like you missed signing one of the commits. You may have to squash them into one and sign that one.

Signed-off-by: Brendan Shaklovitz <nyanshak@users.noreply.github.com>
@nyanshak nyanshak force-pushed the cassandra-reconnect-interval branch from e8494d9 to 67352af Compare July 13, 2018 17:26
@@ -31,6 +31,7 @@ type Configuration struct {
Keyspace string `validate:"nonzero"`
ConnectionsPerHost int `validate:"min=1" yaml:"connections_per_host"`
Timeout time.Duration `validate:"min=500"`
ReconnectInterval time.Duration `validate:"min=500" yaml:"reconnect_interval"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that the minimum of 500ns is a good value. Could you elaborate on how this was chosen?

(I think we need to revisit the Timeout as well separately)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I really understand the validate:"min=500" section. It seems that it wasn't actually enforced in my tests. (As in, I would not pass any value, resulting in 0, which is less than 500 and it would pass validation. Unless that's a special case?)

I copied Timeout value for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it looks like these validations would only be applied if this configuration file was validated with something like https://github.com/go-validator/validator

I'll make a separate ticket to address this

@yurishkuro yurishkuro merged commit 7919cd9 into jaegertracing:master Jul 14, 2018
@yurishkuro yurishkuro changed the title Cassandra reconnect interval Add support for Cassandra reconnect interval Jul 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants