BinaryLogClient race condition on reconnect causing duplicate events to be processed #113

Closed
andrewbudimanasana opened this issue Sep 19, 2016 · 2 comments

Comments

@andrewbudimanasana

Timeline:

  1. Thread A makes the initial connection
  2. The keep-alive thread detects a failure when the binlog position = X
  3. The keep-alive thread starts reconnect thread B at position X
  4. Thread A does not realize that it should stop processing events
  5. Threads A and B both process the same binlog events from position X onward

Currently, there are 2 ways thread A is supposed to notice the reconnect
and stop:

  1. The socket's input stream gets shut down
  2. It processes events only if the global connected state == true

From what I can tell, thread A was supposed to stop because the state
became disconnected, and then exit because the next read should
have yielded EOF.

This doesn't always work because:

  1. The input stream is buffered, so the EOF read is delayed
  2. Before thread A reaches the end of the buffer, thread B connects
  3. Thread A now processes the rest of its buffered events because thread
    B has set the global connected state back to true

The symptom of this is interleaved duplicate events, which especially
messes up the transaction logic.
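
To make the interleaving concrete, here is a minimal single-threaded sketch of the sequence described above. It is not the library's code and not the patch linked below; every name in it (`ReconnectSketch`, `Connection`, `drainWithGlobalFlag`, `drainWithOwnershipCheck`) is made up for illustration. It assumes thread A still holds three buffered events when thread B reconnects, and contrasts a check against a shared boolean with one possible alternative, a check that the draining connection is still the current one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class ReconnectSketch {

    /** Hypothetical stand-in for one socket plus the events already read into its buffer. */
    static final class Connection {
        final String name;
        final List<Integer> bufferedPositions;

        Connection(String name, List<Integer> bufferedPositions) {
            this.name = name;
            this.bufferedPositions = bufferedPositions;
        }
    }

    // Shared by all threads: the "global connected state" from the description.
    static volatile boolean connected;

    // Assumed alternative: remember WHICH connection is the live one.
    static final AtomicReference<Connection> current = new AtomicReference<>();

    /** Processes buffered events while the global flag is true. */
    static List<String> drainWithGlobalFlag(Connection c) {
        List<String> processed = new ArrayList<>();
        for (int position : c.bufferedPositions) {
            if (!connected) {
                break;
            }
            processed.add(c.name + ":" + position);
        }
        return processed;
    }

    /** Processes buffered events only while this connection is still the live one. */
    static List<String> drainWithOwnershipCheck(Connection c) {
        List<String> processed = new ArrayList<>();
        for (int position : c.bufferedPositions) {
            if (current.get() != c) {
                break;
            }
            processed.add(c.name + ":" + position);
        }
        return processed;
    }

    public static void main(String[] args) {
        Connection a = new Connection("A", Arrays.asList(10, 11, 12));
        Connection b = new Connection("B", Arrays.asList(10, 11, 12));

        // Keep-alive detects the failure and disconnects ...
        connected = false;
        // ... and thread B reconnects before thread A has drained its buffer.
        connected = true;
        current.set(b);

        System.out.println(drainWithGlobalFlag(a));     // [A:10, A:11, A:12]
        System.out.println(drainWithOwnershipCheck(a)); // []
        System.out.println(drainWithOwnershipCheck(b)); // [B:10, B:11, B:12]
    }
}
```

With the shared flag, the stale connection A still emits positions 10 to 12, which B will emit again. With the ownership check, A stops as soon as it is no longer the current connection.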

I'm using mypipe, which uses version 0.2.4, so I created a patch on top of that branch here: https://github.com/andrewbudiman/mysql-binlog-connector-java/commits/fix-reconnect-processing

I'd be happy to create a PR for the master branch. I was considering a more intrusive refactor to make things cleaner, but I didn't know if you already had ideas, so I tried to keep the patch simple.

@shyiko
Owner

shyiko commented Sep 20, 2016

@andrewbudimanasana Thank you so much! Fix is going to be released in 0.4.2 (as soon as I'm done testing changes).

shyiko added 6 commits that referenced this issue Sep 20, 2016
@shyiko
Owner

shyiko commented Sep 20, 2016

0.4.2 is out (Maven Central sync might take up to 2 hours).

@shyiko shyiko closed this as completed Sep 20, 2016