Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hbase10] Still too many threads #691. #692

Merged
merged 1 commit into from
Apr 11, 2016

Conversation

saintstack
Copy link
Contributor

Fix broken synchronization meant to guard against over-creation
of cluster connection instances on startup when multiple threads.

Fix broken synchronization meant to guard against over-creation
of cluster connection instances on startup when multiple threads.
@saintstack
Copy link
Contributor Author

#691 has write up on what this pull request addresses

@@ -166,7 +176,9 @@ public void init() throws DBException {
String table = com.yahoo.ycsb.workloads.CoreWorkload.table;
try {
final TableName tName = TableName.valueOf(table);
connection.getTable(tName).getTableDescriptor();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so getTable() isn't thread safe in that it could create more connection objects?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought Connection was supposed to be threadsafe? Is this essentially a bug in the implementation? Any idea if it still exists in later HBase client versions?

I'm happy to demonstrate a workaround other applications need to be aware of, but I'd like to make sure we document why we're being more conservative than the HBase docs indicate we need to be.

@saintstack
Copy link
Contributor Author

@manolama The root issue is that when lots of threads starting up, they all see the static 'connection' as null so each creates a new one but only one of which prevails... meantime a bunch of unreferenced connections are just hanging out (along w/ their zk client threads). The synchronization here on connection.getTable is just my being consistent synchronizing all use of connection so threads see the connection probably made by another and don't get a NPE on connection; probably gratuitous (getTable is safe).

@busbey connection is thread safe. This 'fix' is to ycsb HBaseClient trying to have a singleton instance of a connection. It was synchronizing on an Integer that it was then changing inside the synchronize block.... and the synchronization on the threadCount integer was supposed to be 'protecting' connection initialization. Its broken.

@busbey
Copy link
Collaborator

busbey commented Apr 10, 2016

oh! the autoboxing of the integer changed the reference. of course. /me slaps forehead.

guess next time I won't advocate moving to something "simpler" than the Atomic classes. :/

@saintstack
Copy link
Contributor Author

On Sat, Apr 9, 2016 at 10:40 PM, Sean Busbey notifications@github.com
wrote:

oh! the autoboxing of the integer changed the reference. of course. /me
slaps forehead.

guess next time I won't advocate moving to something "simpler" than the
Atomic classes. :/


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#692 (comment)

Looking at hbase098, it is synchronizing on the AtomicInteger which would
do-the-right-thing. Should I make this patch do same as it?

@busbey
Copy link
Collaborator

busbey commented Apr 10, 2016

The synchronization here on connection.getTable is just my being consistent synchronizing all use of connection so threads see the connection probably made by another and don't get a NPE on connection; probably gratuitous (getTable is safe).

This part shouldn't be needed, since we are synchronized within init and cleanup on changing the null/not-null state of connection.

If we leave the synchronized block in here, then we'll contend on the lock across all the threads whenever the table changes. With the current core workloads that will mean the first operation each thread does. Maybe not killer since just at startup, but it will be ugly since it'll be each thread's init + each threads first operation all hitting the same mutex.

// Must be an object for synchronization and tracking running thread counts.
private static Integer threadCount = 0;

private static AtomicInteger threadCount = new AtomicInteger(0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be final.

@busbey
Copy link
Collaborator

busbey commented Apr 10, 2016

Looking at hbase098, it is synchronizing on the AtomicInteger which would
do-the-right-thing. Should I make this patch do same as it?

That's what this client did before #651. I don't really mind either way. I suppose it's nice for the implementations to be similar?

@busbey busbey mentioned this pull request Apr 10, 2016
@cmatser
Copy link
Collaborator

cmatser commented Apr 10, 2016

👍

@manolama
Copy link
Collaborator

@saintstack Ah thank you! That damn autoboxing got me so it makes sense now.

@cmatser
Copy link
Collaborator

cmatser commented Apr 11, 2016

I'm tracking this issue for the release. Can you tell me what's the next step(s) before this can be merged? Thanks!

@saintstack
Copy link
Contributor Author

I'm running the patch locally. Works for me. It will be different locking from what is in the hbase098 and hbase094 clients but it was different before I showed up. There is one +1 above. Need more?

@busbey
Copy link
Collaborator

busbey commented Apr 11, 2016

the threadcount variable should be final. but I can clean that up and make things look like hbase098 in a follow-on.

@busbey busbey merged commit 604c50d into brianfrankcooper:master Apr 11, 2016
@busbey
Copy link
Collaborator

busbey commented Apr 11, 2016

shoot. just reread things more carefully and saw my comment about synchronizing around getTable.

probably fine? I'll add it to the follow-on. someone let me know if they think we should revert and fix now.

@saintstack
Copy link
Contributor Author

Fine I'd say @busbey ... just startup. And we're synchronizing anyways.

Thanks for merge.

cmatser pushed a commit that referenced this pull request Apr 11, 2016
Fix broken synchronization meant to guard against over-creation

Issue #692

of cluster connection instances on startup when multiple threads.
@ghost ghost mentioned this pull request May 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants