Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rocksdb] Added support for RocksDB Java API #1052

Merged
merged 4 commits into from
Jul 7, 2018

Conversation

adamretter
Copy link
Contributor

This adds support for RocksDB via RocksJava binding.

@adamretter adamretter force-pushed the rocks-java branch 2 times, most recently from 39702ad to e1c285d Compare October 20, 2017 20:05
@adamretter adamretter force-pushed the rocks-java branch 3 times, most recently from 65b98e3 to c0c6a93 Compare November 29, 2017 14:13
@adamretter
Copy link
Contributor Author

@risdenk I have bought this up to date and in line with the changes you requested in #876

Closes #876

@adamretter
Copy link
Contributor Author

@busbey Any chance of getting this one merged?

@adamretter
Copy link
Contributor Author

@busbey @manolama can someone please take a look?

Copy link
Contributor

@Vogel612 Vogel612 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like that you added thorough Unit-Tests. Thanks for that :)


## Quick Start

This section describes how to run YCSB on RocksDB running locally.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it, though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I believe so. Why do you think otherwise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no installation and setup instructions for RocksDB itself :/ As I understood it, these are also in the binding-related readme.md

Copy link
Contributor Author

@adamretter adamretter Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to install it. Everything you need is self-contained inside the rocksdbjni Jar that YCSB declares a dependency on. Note that RocksDB is an embedded database, not a database server.

@@ -0,0 +1,57 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2017 YCSB contributors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's 2018 by now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,387 @@
/*
* Copyright (c) 2017 YCSB contributors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


@GuardedBy("RocksDBClient.class") private static Path rocksdbDir = null;
@GuardedBy("RocksDBClient.class") private static RocksObject dbOptions = null;
@GuardedBy("RocksDBClient.class") private static RocksDB rocksDb = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this wasn't static, you'd not need all the synchronization.... What is the advantage of forcing the RocksDB object you're using to be a singleton?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to ensure that there was only a single instance of RocksDB running at any time within YCSB. Perhaps this is unnecessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that you want only a single instance of RocksDB. I don't understand why. For the general case, that is unnecessary, yes. It may be different for RocksDB though. I have never worked with it before...

So ... why do you want to ensure that there is only a single instance of RocksDB running at any time?
How do you propose dealing with multiple ycsb run invocations on the same host?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RocksDB is an embedded database (not a database server), i.e. it will be running inside the same JVM as YCSB itself. I wanted to ensure there was only one RocksDB instance per YCSB instance. Does that make sense?

public void init() throws DBException {
synchronized(RocksDBClient.class) {
if(rocksDb == null) {
rocksdbDir = Paths.get(getProperties().getProperty(PROP_ROCKSDB_DIR));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

casing inconsistency: rocksDbDir

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


final ColumnFamilyHandle cf = COLUMN_FAMILIES.get(table).getHandle();
try(final RocksIterator iterator = rocksDb.newIterator(cf)) {
int iterations = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're never using the value stored in iterations. Why have that variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used on the line below - iterations < recordcount

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right, I missed that. Please disregard that :)

final Path file = rocksdbDir.resolve(CF_NAMES_FILENAME);
if(Files.exists(file)) {
try (final LineNumberReader reader =
new LineNumberReader(new InputStreamReader(Files.newInputStream(file), UTF_8))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of new InputStreamReader(Files.newInputStream(file), UTF_8) you could use Files.newBufferedReader(file, UTF_8)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


private void saveColumnFamilyNames() throws IOException {
final Path file = rocksdbDir.resolve(CF_NAMES_FILENAME);
try(final PrintWriter writer = new PrintWriter(new OutputStreamWriter(Files.newOutputStream(file), UTF_8))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of new OutputStreamWriter(Files.newOutputStream(file), UTF_8) you could use Files.newBufferedWriter(file, UTF_8)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,22 @@
/*
* Copyright (c) 2017 YCSB contributors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,121 @@
/*
* Copyright (c) 2017 YCSB contributors. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@adamretter
Copy link
Contributor Author

@Vogel612 Thanks for the review. I have corrected everything as suggested (where possible).

The only outstanding question, is perhaps the static synchronization one. I would welcome a response to my comment on that, and then we can either fix, or leave as is...

*
* See {@code rocksdb/README.md} for details.
*
* @author <a href="mailto:adam@evolvedbinary.com">Adam Retter</a>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid @author lines. Instead we rely on git to show attribution.

Please leave this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay no problem, I have removed that now.

</dependency>
<dependency>
<groupId>net.jcip</groupId>
<artifactId>jcip-annotations</artifactId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is GuardedBy runtime scoped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, it is compile time.

@adamretter
Copy link
Contributor Author

@busbey If you are happy with this, I can rebase to squash down the fixup commits before you merge. Just let me know...

@busbey busbey mentioned this pull request May 19, 2018
@busbey
Copy link
Collaborator

busbey commented May 19, 2018

A preemptive rebase+squash is always helpful, but I haven't had a chance to dig in yet. Planning to once 0.14.0 is wrapped.

@adamretter
Copy link
Contributor Author

@busbey Sure. I have now squashed and rebased.

Copy link
Collaborator

@busbey busbey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some questions and a couple of touch-ups.


## Quick Start

This section describes how to run YCSB on RocksDB running locally.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth pointing out here that RocksDB is always embedded and so things like "How to run in parallel" don't make sense to do in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

rocksdb/pom.xml Outdated
<parent>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>binding-parent</artifactId>
<version>0.14.0-SNAPSHOT</version>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update this now that master is at 0.15.0-SNAPSHOT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

synchronized(RocksDBClient.class) {
if(rocksDb == null) {
rocksDbDir = Paths.get(getProperties().getProperty(PROPERTY_ROCKSDB_DIR));
System.out.println("RocksDB data dir: " + rocksDbDir);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use a logging framework instead of System.out for messages. preferably slf4j.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
}

private RocksDB initRocksDB() throws IOException, RocksDBException {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this has to be called within a lock on RocksDBClient.class, you should note that in javadoc on the method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

try {
if(!COLUMN_FAMILIES.containsKey(name)) {
final ColumnFamilyOptions cfOptions = new ColumnFamilyOptions().optimizeLevelStyleCompaction();
final ColumnFamilyHandle cfHandle = rocksDb.createColumnFamily(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to work through the reasoning around locking. is this call to rocksDb.createColumnFamily the non-threadsafe thing we're using the column family lock to guard?

Copy link
Contributor Author

@adamretter adamretter Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two different locking mechanisms at play here:

  1. It seemed to me that there could be multiple RocksDBClient object instances within a running instance of YCSB. However, as RocksDB itself is an embedded database, we can only open and close it once for the entire lifecycle of YCSB. The synchronization on RocksDBClient.class ensures that RocksDB will only be opened once on the first call, and then only closed on the last call (ensured via reference counting).

  2. A YCSB "table" is represented in RocksDB as a Column Family. We create these column families on demand, and we must only create them once (on their first use). The COLUMN_FAMILY_LOCKS are used to ensure that a column family is only created once no matter how many concurrent calls there are for an operation on the "table". Also note that, when you create a column family in RocksDB, the next time you open the database you have to tell it of all existing column families, otherwise it will error. This is why we also persist the column family names to a text file.

Hope that makes sense?

@busbey busbey changed the title Added support for RocksDB [rocksdb] Added support for RocksDB Java API Jul 7, 2018
@busbey busbey merged commit 563925e into brianfrankcooper:master Jul 7, 2018
@busbey
Copy link
Collaborator

busbey commented Jul 7, 2018

thanks for getting this together @adamretter!

@adamretter
Copy link
Contributor Author

@busbey Awesome thanks. We can now look at tuning the options and improving performance.

@adamretter adamretter deleted the rocks-java branch July 7, 2018 05:05
@busbey busbey mentioned this pull request Jul 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants