Merge pull request brianfrankcooper#214 from LatencyUtils:master
    [core] use HdrHistogram and fix coordinated omission

also closes brianfrankcooper#10
busbey committed Jun 17, 2015
2 parents 798f1c7 + 9120a70 commit f40a230
Showing 10 changed files with 809 additions and 173 deletions.
67 changes: 67 additions & 0 deletions core/CHANGES.md
@@ -0,0 +1,67 @@
When used as a latency-under-load benchmark, YCSB in its original form suffers from
Coordinated Omission[1] and related measurement issues:

* Load is controlled by response time
* Measurement does not account for missing time
* Measurement starts at beginning of request rather than at intended beginning
* Measurement is limited in scope as the histogram does not provide data on overflow values

To provide a minimal correction patch the following were implemented:

1. Replace internal histogram implementation with HdrHistogram[2]:
HdrHistogram offers a dynamic range of measurement at a given precision and will
improve the fidelity of reporting. It allows capturing a much wider range of latencies.
HdrHistogram also supports compressed lossless serialization, which enables capturing
snapshot histograms from which lower-resolution histograms can be constructed for plotting
latency over time. Snapshot interval histograms are serialized on status reporting, which
must be enabled using the '-s' option.
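The recording model described above can be sketched as follows. This is an illustrative snippet, not the YCSB code; it assumes the org.hdrhistogram:HdrHistogram artifact (version 2.1.4 as added to the pom below) is on the classpath, and the value range and sample data are made up:

```java
import java.util.concurrent.TimeUnit;

import org.HdrHistogram.Histogram;

/**
 * Minimal sketch of recording latencies into an HdrHistogram
 * (illustrative only, not the YCSB implementation).
 */
public class HdrSketch {

    /** Builds a histogram of 10,000 synthetic latencies between 1 and 500 us. */
    static Histogram buildSample() {
        // Track values from 1 ns up to 1 hour at 3 significant decimal digits.
        Histogram histogram = new Histogram(TimeUnit.HOURS.toNanos(1), 3);
        for (int i = 0; i < 10_000; i++) {
            histogram.recordValue(TimeUnit.MICROSECONDS.toNanos(i % 500 + 1));
        }
        return histogram;
    }

    public static void main(String[] args) {
        Histogram histogram = buildSample();
        // The wide dynamic range means no overflow bucket, and percentile
        // queries are available at the configured precision.
        System.out.println("count  = " + histogram.getTotalCount());
        System.out.println("p99 ns = " + histogram.getValueAtPercentile(99.0));
        System.out.println("max ns = " + histogram.getMaxValue());
    }
}
```

The constructor arguments are the key design point: the histogram covers nanoseconds up to an hour at fixed relative precision, so no latency is too large to record.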

2. Track intended operation start and report latencies from that point in time:
Assuming the benchmark sets a target schedule of execution in which every operation
is supposed to happen at a given time, the benchmark should measure the latency between
the intended start time and operation completion.
This required the introduction of a new measurement point and inevitably
includes measuring some of the internal preparation steps of the load generator.
This overhead should be negligible in the context of a network hop, but could
be corrected for by estimating the load-generator overhead (e.g. by measuring a
no-op DB, or by measuring the setup time for an operation and deducting it from the total).
This intended measurement point is only used when there is a target load (specified by
the -target parameter).
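The difference between the two measurement points can be sketched with plain-JDK code (hypothetical names and values, not the YCSB implementation). Each operation has an intended start time on a fixed schedule; an operation delayed by a slow predecessor is charged for its wait via intendedLatency = completion - intendedStart:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

/** Sketch of op-interval vs. intended-interval latency measurement. */
public class IntendedLatencySketch {

    /** Simulated operation: parks for roughly opDurationNs. */
    static void doOperation(long opDurationNs) {
        long deadline = System.nanoTime() + opDurationNs;
        long now;
        while ((now = System.nanoTime()) < deadline) {
            LockSupport.parkNanos(deadline - now);
        }
    }

    /** Runs ops operations; returns {opLatency, intendedLatency} of the last one. */
    static long[] measure(long intervalNs, long opDurationNs, int ops) {
        long benchStart = System.nanoTime();
        long opLatency = 0, intendedLatency = 0;
        for (int op = 0; op < ops; op++) {
            long intendedStart = benchStart + op * intervalNs;
            long now;
            while ((now = System.nanoTime()) < intendedStart) {
                LockSupport.parkNanos(intendedStart - now); // wait for the schedule
            }
            long actualStart = System.nanoTime();
            doOperation(opDurationNs);
            long done = System.nanoTime();
            opLatency = done - actualStart;         // the old measurement
            intendedLatency = done - intendedStart; // includes the queueing delay
        }
        return new long[] {opLatency, intendedLatency};
    }

    public static void main(String[] args) {
        // 100 ops/s target, but each op takes 25 ms: the schedule falls behind,
        // so intended latency grows while op latency stays flat.
        long[] last = measure(TimeUnit.MILLISECONDS.toNanos(10),
                              TimeUnit.MILLISECONDS.toNanos(25), 3);
        System.out.println("op latency (us):       " + last[0] / 1000);
        System.out.println("intended latency (us): " + last[1] / 1000);
    }
}
```

When the system under test keeps up, the two numbers coincide; when it stalls, only the intended latency reflects what a client on a fixed arrival schedule would actually experience.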

This branch supports the following new options:

* -p measurementtype=[histogram|hdrhistogram|hdrhistogram+histogram|timeseries] (default=histogram)
The new measurement types are hdrhistogram and hdrhistogram+histogram. The default is still
histogram, which is the old histogram. Ultimately we would remove the old measurement types
and use only HdrHistogram, but the old measurement is left in for comparison's sake.

* -p measurement.interval=[op|intended|both] (default=op)
This new option differentiates between measured intervals and adds the intended interval (as described
above), plus the option to record both the op and intended intervals for comparison.

* -p hdrhistogram.fileoutput=[true|false] (default=false)
This new option will enable periodic writes of the interval histogram to an output file. The path can be set using '-p hdrhistogram.output.path=<PATH>'.

Example parameters:
-target 1000 -s -p workload=com.yahoo.ycsb.workloads.CoreWorkload -p basicdb.verbose=false -p basicdb.simulatedelay=4 -p measurement.interval=both -p measurementtype=hdrhistogram -p hdrhistogram.fileoutput=true -p maxexecutiontime=60

Further changes made:

* -p status.interval=<number of seconds> (default=10)
Controls the number of seconds between status reports and therefore between HdrHistogram snapshots reported.

* -p basicdb.randomizedelay=[true|false] (default=true)
Controls whether the delay simulated by the mock DB is uniformly random or fixed.

Further suggestions:

1. Corrected load control: currently, after a pause the load generator will issue
operations back to back to catch up; this leads to a flat-out throughput mode
of testing as opposed to controlled load.

2. Move to an async model: scenarios where ops have no dependency could delegate
op execution to a thread pool and thus separate request-rate control from the
synchronous execution of ops. Measurement would start on queuing for execution.
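The catch-up behavior described in suggestion 1 can be illustrated with a small scheduling sketch (hypothetical, not YCSB code). A fixed schedule keeps intended start times anchored to the original timeline, while catch-up scheduling issues delayed operations back to back after a stall:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch contrasting fixed-schedule and catch-up load control. */
public class ScheduleSketch {

    /** Intended start times on a fixed schedule: op i starts at i * intervalNs. */
    static List<Long> fixedSchedule(int ops, long intervalNs) {
        List<Long> starts = new ArrayList<>();
        for (int i = 0; i < ops; i++) {
            starts.add(i * intervalNs);
        }
        return starts;
    }

    /**
     * Catch-up start times with a stall of pauseNs while op stallAfter runs:
     * delayed ops start immediately, i.e. back to back at full throughput.
     */
    static List<Long> catchUpSchedule(int ops, long intervalNs,
                                      long pauseNs, int stallAfter) {
        List<Long> starts = new ArrayList<>();
        long now = 0;
        for (int i = 0; i < ops; i++) {
            long start = Math.max(now, i * intervalNs);
            starts.add(start);
            now = start;
            if (i == stallAfter) {
                now += pauseNs; // the stall delays everything that follows
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        long intervalNs = 1_000_000L; // 1000 ops/s target
        System.out.println("fixed:    " + fixedSchedule(10, intervalNs));
        // With a 10 ms stall after op 4, ops 5..9 all start at the same instant.
        System.out.println("catch-up: " + catchUpSchedule(10, intervalNs, 10_000_000L, 4));
    }
}
```

In the catch-up output, every operation delayed by the stall shares one start time, which is exactly the burst of back-to-back operations the suggestion warns about.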

1. https://groups.google.com/forum/#!msg/mechanical-sympathy/icNZJejUHfE/BfDekfBEs_sJ
2. https://github.com/HdrHistogram/HdrHistogram
31 changes: 29 additions & 2 deletions core/pom.xml
@@ -32,6 +32,33 @@
<version>6.1.1</version>
<scope>test</scope>
</dependency>
</dependencies>

<dependency>
<groupId>org.hdrhistogram</groupId>
<artifactId>HdrHistogram</artifactId>
<version>2.1.4</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>${maven.assembly.version}</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
40 changes: 27 additions & 13 deletions core/src/main/java/com/yahoo/ycsb/BasicDB.java
@@ -22,6 +22,8 @@
import java.util.Set;
import java.util.Enumeration;
import java.util.Vector;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;


/**
@@ -32,11 +32,15 @@ public class BasicDB extends DB
public static final String VERBOSE="basicdb.verbose";
public static final String VERBOSE_DEFAULT="true";

public static final String SIMULATE_DELAY="basicdb.simulatedelay";
public static final String SIMULATE_DELAY_DEFAULT="0";
public static final String SIMULATE_DELAY="basicdb.simulatedelay";
public static final String SIMULATE_DELAY_DEFAULT="0";

public static final String RANDOMIZE_DELAY="basicdb.randomizedelay";
public static final String RANDOMIZE_DELAY_DEFAULT="true";


boolean verbose;
boolean verbose;
boolean randomizedelay;
int todelay;

public BasicDB()
@@ -49,14 +49,22 @@ void delay()
{
if (todelay>0)
{
try
{
Thread.sleep((long)Utils.random().nextInt(todelay));
}
catch (InterruptedException e)
{
//do nothing
}
long delayNs;
if (randomizedelay) {
delayNs = TimeUnit.MILLISECONDS.toNanos(Utils.random().nextInt(todelay));
if (delayNs == 0) {
return;
}
}
else {
delayNs = TimeUnit.MILLISECONDS.toNanos(todelay);
}

long now = System.nanoTime();
final long deadline = now + delayNs;
do {
LockSupport.parkNanos(deadline - now);
} while ((now = System.nanoTime()) < deadline && !Thread.interrupted());
}
}

@@ -69,7 +69,7 @@ public void init()
{
verbose=Boolean.parseBoolean(getProperties().getProperty(VERBOSE, VERBOSE_DEFAULT));
todelay=Integer.parseInt(getProperties().getProperty(SIMULATE_DELAY, SIMULATE_DELAY_DEFAULT));

randomizedelay=Boolean.parseBoolean(getProperties().getProperty(RANDOMIZE_DELAY, RANDOMIZE_DELAY_DEFAULT));
if (verbose)
{
System.out.println("***************** properties *****************");
