
Performance measurement by executing synthetic or historical workloads


One standard way to compare performance is to execute the same workload on each system being measured, compute one or more performance metrics from the observed behavior, and then assess the pros and cons of each system.

SWIM allows realistic and representative workloads to be used to drive such performance comparisons.

Step 1. Generate scripts to execute the synthetic workload

This step converts the output from Analyze historical cluster traces and synthesize representative workload into scripts that call stub Hadoop jobs to reproduce activity in the workload.

GenerateReplayScript.java

Usage:

   javac GenerateReplayScript.java

   java GenerateReplayScript
       [path to synthetic workload file]
       [number of machines in the original production cluster]
       [number of machines in the cluster where the workload will be run]
       [size of each input partition in bytes]
       [number of input partitions]
       [output directory for the scripts]
       [HDFS directory for the input data]
       [prefix to workload output in HDFS]
       [amount of data per reduce task in bytes]
       [workload stdout stderr output dir]
       [hadoop command]
       [path to WorkGen.jar]
       [path to workGenKeyValue_conf.xsl] 

Generates a folder of shell scripts to execute the synthetic workload.

Make sure to chmod +x the generated shell scripts.

Arguments in the above:

  • [path to synthetic workload file] e.g., samples_24_times_1hr_0.tsv, or, for testing, samples_24_times_1hr_0_first50jobs.tsv
  • [number of machines in the original production cluster] e.g., 600 for the Facebook trace
  • [number of machines in the cluster where the workload will be run] e.g., 10 machines, small test cluster
  • [size of each input partition in bytes] Should be roughly the same as HDFS block size, e.g., 67108864
  • [number of input partitions] The total input data size (partition size times number of partitions) needs to be >= the maximum input size in the synthetic workload. Try a number; the program will check whether it is large enough. e.g., 10 for the workload in samples_24_times_1hr_0_first50jobs.tsv.
  • [output directory for the scripts] e.g., scriptsTest, or, to not overwrite the files in that directory, scriptsTest2
  • [HDFS directory for the input data] e.g., workGenInput. The input data will be generated into this directory later (Step 4).
  • [prefix to workload output in HDFS] e.g., workGenOutputTest. The HDFS output dir will have the format $prefix-$jobIndex.
  • [amount of data per reduce task in bytes] Should be roughly the same as HDFS block size, e.g., 67108864
  • [workload stdout stderr output dir] Directory for the per-job stdout/stderr log files, e.g., /home/USER/swimOutput.
  • [hadoop command] Command to invoke Hadoop on the targeted system, e.g., $HADOOP_HOME/bin/hadoop
  • [path to WorkGen.jar] Path to WorkGen.jar on the targeted system, e.g., $HADOOP_HOME/WorkGen.jar
  • [path to workGenKeyValue_conf.xsl] Path to workGenKeyValue_conf.xsl on the targeted system, e.g., $HADOOP_HOME/conf/workGenKeyValue_conf.xsl

For example, the scripts in scriptsTest/ were created by

  java GenerateReplayScript 
       FB-2009_samples_24_times_1hr_0_first50jobs.tsv 
       600 
       10 
       67108864 
       10
       scriptsTest
       workGenInput
       workGenOutputTest
       67108864
       workGenLogs
       hadoop
       WorkGen.jar
       '/usr/lib/hadoop-0.20.2/conf/workGenKeyValue_conf.xsl'

Those scripts launch the first 50 jobs of the day-long Facebook-like workload, pre-synthesized for testing the cluster setup. The primary script is run-jobs-all.sh, which in turn calls each run-jobs-$i.sh at the appropriate time.
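
To give a concrete sense of the structure, here is a hypothetical sketch of what run-jobs-all.sh amounts to; the actual scripts emitted by GenerateReplayScript.java may differ in detail:

   # Hypothetical sketch only; not the literal generated script.
   ./run-jobs-0.sh &    # submit job 0 immediately
   sleep 31             # illustrative inter-arrival gap from the synthetic workload
   ./run-jobs-1.sh &    # submit job 1 after the gap
   sleep 8
   ./run-jobs-2.sh &
   # ... one launch per job, spaced by the synthesized submission times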

Make sure to "chmod +x" the shell scripts generated.

Step 2. Prepare Hadoop cluster

Install Hadoop and set up a cluster.

Copy randomwriter_conf.xsl and workGenKeyValue_conf.xsl to the ${HADOOP_HOME}/conf/ folder, or somewhere else appropriate for your setup.
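
For example, assuming both .xsl files are in the current (SWIM) directory:

cp randomwriter_conf.xsl workGenKeyValue_conf.xsl ${HADOOP_HOME}/conf/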

Step 3. Compile MapReduce jobs needed for workload execution

Compile the MapReduce job used to write the input data set.

mkdir hdfsWrite
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d hdfsWrite HDFSWrite.java
jar -cvf HDFSWrite.jar -C hdfsWrite/ .

Note: if you are using MapReduce 2 or above (aka YARN), you should substitute the following javac command:

javac -classpath ${HADOOP_HOME}/share/hadoop/common/\*:${HADOOP_HOME}/share/hadoop/mapreduce/\*:${HADOOP_HOME}/share/hadoop/mapreduce/lib/\* -d hdfsWrite HDFSWrite.java

Compile the MapReduce job used to read/shuffle/write data with the prescribed data ratios.

mkdir workGen
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d workGen WorkGen.java
jar -cvf WorkGen.jar -C workGen/ .

Again, when using MapReduce 2, substitute the following javac command:

javac -classpath ${HADOOP_HOME}/share/hadoop/common/\*:${HADOOP_HOME}/share/hadoop/mapreduce/\*:${HADOOP_HOME}/share/hadoop/mapreduce/lib/\* -d workGen WorkGen.java

Step 4. Execute workload

Start a Hadoop cluster with the same number of machines as was used to generate the scripts in Step 1.
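
For example, on a Hadoop 0.20.x tarball install, the cluster can be started with the standard scripts (assuming HDFS has already been formatted; use your distribution's equivalent, or the sbin/ scripts plus start-yarn.sh on MapReduce 2):

cd ${HADOOP_HOME}
bin/start-dfs.sh
bin/start-mapred.sh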

Write input data:

Edit conf/randomwriter_conf.xsl on the cluster. Make sure the configuration parameter test.randomwrite.bytes_per_map is the same as the value used for [size of each input partition in bytes] for java GenerateReplayScript. Also set the configuration parameter test.randomwrite.total_bytes to be the product of [size of each input partition in bytes] and [number of input partitions]. These steps help ensure that HDFS is populated with sufficient input data to run the workload.
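
For example, with the scriptsTest example above (10 input partitions of 67108864 bytes each), the values would be:

test.randomwrite.bytes_per_map = 67108864
test.randomwrite.total_bytes   = 671088640   (i.e., 10 * 67108864)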

Once the parameters in conf/randomwriter_conf.xsl have been set, run the following:

bin/hadoop jar HDFSWrite.jar org.apache.hadoop.examples.HDFSWrite -conf conf/randomwriter_conf.xsl workGenInput
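
To sanity-check that the input data landed in HDFS, you can list the input directory, e.g.:

bin/hadoop fs -ls workGenInput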

Now the cluster is ready to run the actual workload:

cp -r scriptsTest ${HADOOP_HOME}
cd ${HADOOP_HOME}/scriptsTest
./run-jobs-all.sh &

The workload will then run in the background until complete.

If you get a permissions error, make sure you have added execution permission to the workload scripts.

Experienced users:

The scripts currently assume that they are run from a directory one level below ${HADOOP_HOME}. The relative path of the hadoop binary and the output paths can be modified by editing GenerateReplayScript.java.

If you want to add other commands after each job, you can also do so by editing GenerateReplayScript.java.

Step 5. Workload output and post processing

Basic post processing

Each run-jobs-$i.sh pipes the job's stdout and stderr (System.out and System.err) to job-$i.txt in the [workload stdout stderr output dir] specified in Step 1 (${HADOOP_HOME}/output/ in the example below).

The files contain the screen output generated by Hadoop during each job's execution.

A quick way to extract the duration of each job:

END=49   # index of the last job, e.g., 49 for the 50-job test workload
for i in $(seq 0 $END); do
  grep "The job took" ${HADOOP_HOME}/output/job-$i.txt | awk '{print $4}' >> all-jobs-duration.txt
done

The per-job duration data is now ready for further performance analysis.
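
From there, for example, the mean job duration in seconds can be computed with awk:

awk '{ sum += $1; n++ } END { if (n > 0) print sum / n }' all-jobs-duration.txt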

A more extensive tool

A more sophisticated analysis tool is parse-hadoop-jobhistory.pl.

This is the same tool as that used in Step 1 of Analyze historical cluster traces and synthesize representative workload.

This tool parses Hadoop job history logs created in the default Hadoop logging format.

Usage:

perl parse-hadoop-jobhistory.pl [job history dir] > outputFile.tsv

This script prints its output to STDOUT, which can then be piped to a file or consumed for further analysis. The output format is tab-separated values (.tsv), one row per job. The output fields are:

   1.  unique_job_id
   2.  job_name
   3.  map_input_bytes
   4.  shuffle_bytes
   5.  reduce_output_bytes
   6.  submit_time_seconds (epoch format)
   7.  duration_seconds
   8.  map_time_task_seconds (2 tasks of 10 seconds = 20 task-seconds)
   9.  red_time_task_seconds
   10. total_time_task_seconds
   11. map_tasks_count     
   12. reduce_tasks_count
   13. hdfs_input_path (if available in history log directory)
   14. hdfs_output_path (if available in history log directory)

For example, one could parse the Facebook history logs by calling

perl parse-hadoop-jobhistory.pl FB-2009_LogRepository > FacebookTrace.tsv 

The resulting output can be fed directly into Step 2 of Analyze historical cluster traces and synthesize representative workload. Alternatively, human analysts can use the output for more extensive analysis.
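
As a small example of such analysis, the total map task-seconds across all jobs (field 8 above) can be summed directly from the .tsv output:

awk -F'\t' '{ sum += $8 } END { print sum }' FacebookTrace.tsv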