
Commit

Update README after changes in previous commits
After the extensive changes made in the previous commits, the README is
now updated to reflect this.
JelteF committed Dec 20, 2021
1 parent d7ec696 commit 4bd106e
Showing 1 changed file with 37 additions and 19 deletions.
56 changes: 37 additions & 19 deletions README.md
@@ -1,6 +1,7 @@
# HammerDB TPROC-C and CH benchmarking tool for Citus and PostgreSQL

This repository contains utility scripts/files to run [HammerDB](https://github.com/TPC-Council/HammerDB) and the [CH-benCHmark](https://db.in.tum.de/research/projects/CHbenCHmark/) on Citus and regular PostgreSQL.
This repository contains utility scripts/files to run [HammerDB][hammerdb] and
the [CH-benCHmark][ch] on Citus and regular PostgreSQL.

# Preparation

@@ -31,43 +32,60 @@ cd ch-benchmark

# Running HammerDB TPROC-C with CH-benCHmark support

Generate a patched HammerDB version with:
```bash
./generate-hammerdb.sh 4.0
```

`build-and-run.sh` is the driver script and can be run using:

```bash
./build-and-run.sh <version 3.3 or 4.0> <prefix> <is_tpcc> <is_ch>
./build-and-run.sh [--hammerdb-version[=]<version>] [--ch|--ch-queries-only] [--no-citus] [--name[=]name]
```

* prefix indicates the prefix used in result files
* if `is_tpcc` is `true`, then the transaction queries will be run.
* if `is_ch` is `true`, then the analytical queries will be run.

The script relies on libpq environment variables for connecting to the database.

Example usage:
Example usage that runs only HammerDB TPROC-C without CH-benCHmark queries:
```bash
export PGHOST=203.0.113.4
export PGUSER=citus
export PGDATABASE=citus
export PGPASSWORD=
./build-and-run.sh tpcc-run true false
./build-and-run.sh
```

So if you want to run both tpcc and analytical queries concurrently, you should set both of them to true.
## Running CH-benCHmark queries

When running `build-and-run.sh` with the default flags, only HammerDB TPROC-C
will be run. To also run the CH-benCHmark analytical queries, specify the
`--ch` flag, which runs TPROC-C and the CH-benCHmark queries at the same time.
To run only the CH analytical queries without TPROC-C, specify the
`--ch-queries-only` flag.
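
As a rough sketch of the three modes described above, the snippet below maps a workload name to the matching flag. The `benchmark_flags` helper and its workload names (`tprocc`, `both`, `ch-only`) are hypothetical and not part of this repository; only the `--ch` and `--ch-queries-only` flags come from `build-and-run.sh`. The snippet only prints the command instead of running it.

```shell
# Hypothetical helper (not part of this repository): map a workload name
# to the build-and-run.sh flag described in this section.
benchmark_flags() {
  case "$1" in
    tprocc)  echo "" ;;                   # default: only HammerDB TPROC-C
    both)    echo "--ch" ;;               # TPROC-C and CH-benCHmark queries together
    ch-only) echo "--ch-queries-only" ;;  # only the CH analytical queries
    *)       echo "unknown workload: $1" >&2; return 1 ;;
  esac
}

# Print (rather than execute) the command for the combined workload.
echo "./build-and-run.sh $(benchmark_flags both)"
```

Printing the command first makes it easy to check which mode will be used before starting a long benchmark run.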

## Changing HammerDB configurations

`build.tcl` is used to build hammerdb tables and `run.tcl` is used to run the test.
You can change hammerdb configurations from those files.

*pg_count_ware/pg_num_vu* should be at least 4. https://www.hammerdb.com/blog/uncategorized/how-many-warehouses-for-the-hammerdb-tpc-c-test/

`ch_benchmark.py` is a utility script to send the extra 22 queries (analytical queries). By default one thread is used for sending the analytical queries. The start index for each thread is randomly chosen with a fixed seed so that it will be the same across different platforms.
## Running against plain Postgres (without Citus)

If you want to run the benchmarks against standard Postgres you need to do two
things:
1. Change `pg_cituscompat` to `false` in `build.tcl`.
2. Run `build-and-run.sh` with the `--no-citus` flag (only needed if you
enable CH queries).
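
The first step can be scripted. The sketch below assumes that `build.tcl` sets the option on a line containing `pg_cituscompat true` (for example a HammerDB `diset` line); check the file before relying on it. The `disable_citus_compat` function name is hypothetical.

```shell
# Hypothetical helper: flip pg_cituscompat from true to false in a Tcl file.
# Assumption: the setting appears as "pg_cituscompat true" on a single line.
disable_citus_compat() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Rewrite into a temp file, then copy back so the original file keeps its permissions.
  sed 's/\(pg_cituscompat[[:space:]][[:space:]]*\)true/\1false/' "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage, from the repository root:
#   disable_citus_compat build.tcl
#   ./build-and-run.sh --ch --no-citus
```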


# Checklist for running the benchmark
- [ ] Make sure that worker node count is a divisor of the value of
`--shard-count`, otherwise some nodes will have more shards and the load will
not be distributed evenly.
- [ ] Make sure that `max_connections` is high enough based on `vuset vu` in
`run.tcl`. `max_connections` should be at least 150 more than the value given
to `vuset vu`.
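
Both checklist items reduce to simple arithmetic, sketched below. The function names and the example numbers (4 workers, 48 shards, 250 virtual users, `max_connections` of 400) are illustrative assumptions, not values taken from this repository.

```shell
# Succeeds when the shard count divides evenly over the worker nodes.
shards_evenly_distributed() {
  workers="$1"
  shard_count="$2"
  [ "$workers" -gt 0 ] && [ $((shard_count % workers)) -eq 0 ]
}

# Succeeds when max_connections is at least 150 above the virtual user count.
max_connections_sufficient() {
  max_connections="$1"
  virtual_users="$2"
  [ "$max_connections" -ge $((virtual_users + 150)) ]
}

# Illustrative values only. The real max_connections can be read with:
#   psql -Atc 'SHOW max_connections'
if shards_evenly_distributed 4 48; then
  echo "shards are spread evenly over the workers"
fi
if max_connections_sufficient 400 250; then
  echo "max_connections leaves enough headroom"
fi
```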

Checklist for running benchmark:
# Implementation details
`ch_benchmark.py` is a utility script to send the extra 22 queries (analytical
queries). By default one thread is used for sending the analytical queries. The
start index for each thread is randomly chosen with a fixed seed so that it will
be the same across different platforms.

* Make sure that node count is a divisor of shard count, otherwise some nodes will have more shards and the load will not be distributed evenly.
* Make sure that max_connections is high enough based on #vuuser. max_connections should be at least 150 more than #vuuser.
* Make sure that you do a checkpoint before starting the test (`build-and-run.sh` already does this). Otherwise the timing of a checkpoint can affect the results for short tests.
[hammerdb]: https://github.com/TPC-Council/HammerDB
[ch]: https://db.in.tum.de/research/projects/CHbenCHmark/
