Feat: Support for Distributions in Templated Workloads #385

Merged: 36 commits, Mar 25, 2024

@ETHenzlere ETHenzlere commented Oct 31, 2023

This PR adds support for several numerical distributions in templated benchmarks:

  • Uniform
  • Binomial
  • Zipfian
  • Scrambled (Zipfian)

The distributions work not only for integers but also for other types, such as timestamps, longs, and floats.
The supported combinations are listed in the templated benchmarks README (X = supported, - = not supported):

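| Type             | uniform | binomial | zipfian | scrambled (zipfian) |
|------------------|---------|----------|---------|---------------------|
| INTEGER          | X       | X        | X       | X                   |
| FLOAT / REAL     | X       | X        | -       | -                   |
| BIGINT           | X       | X        | X       | X                   |
| VARCHAR / STRING | X       | -        | -       | -                   |
| TIMESTAMP        | X       | X        | X       | X                   |
| DATE             | X       | X        | X       | X                   |
| TIME             | X       | X        | X       | X                   |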

Usage example:

<templates>
    <template name="MyTemplate">
        <query><![CDATA[SELECT * FROM MyTable WHERE id = ?]]></query>
        <types>
            <type>INTEGER</type>
        </types>
        <values>
            <value dist="uniform" min="0" max="1000" seed="1"/>
        </values>
        <values>
            <value dist="zipf" min="1" max="1000" seed="2"/>
        </values>
    </template>
    <!-- ... -->
</templates>

This PR is not a breaking change: static values can still be used in templated queries, for example:
<value>10</value>
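
To illustrate the backward compatibility described above, here is a minimal sketch of a template that mixes a static value with a distribution-backed one. The template name `LookupById` and the table `MyTable` are placeholders, and it is an assumption (not confirmed in this thread) that static and distribution-based `<values>` blocks may coexist in the same template.

```xml
<templates>
    <!-- Hypothetical template and table names, for illustration only. -->
    <template name="LookupById">
        <query><![CDATA[SELECT * FROM MyTable WHERE id = ?]]></query>
        <types>
            <type>INTEGER</type>
        </types>
        <!-- Static value, in the form that worked before this PR. -->
        <values>
            <value>10</value>
        </values>
        <!-- Distribution-backed value, in the form added by this PR.
             Assumption: both kinds of <values> blocks can be combined. -->
        <values>
            <value dist="uniform" min="0" max="1000" seed="1"/>
        </values>
    </template>
</templates>
```

Each `<values>` block supplies one parameter set for the single `?` placeholder, so the number of `<value>` entries per block matches the number of `<type>` entries.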

In the future, I could see a breaking change that adds the data type directly to each value, e.g. <value dist="uniform" type="integer">, so that TemplatedValue can handle types itself. This would simplify type checking and remove the need to store the original min/max values as strings for all data types.
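
As a rough sketch of what that proposed (and currently unsupported) form might look like, the fragment below moves the type onto the value itself. Only the `dist` and `type` attributes come from the proposal above. Combining them with `min`, `max`, and `seed`, and dropping the separate `<types>` block, are assumptions made here for illustration.

```xml
<templates>
    <template name="MyTemplate">
        <query><![CDATA[SELECT * FROM MyTable WHERE id = ?]]></query>
        <!-- HYPOTHETICAL: the type attribute is only a proposal and is
             not supported by this PR. Omitting <types> is an assumption. -->
        <values>
            <value dist="uniform" type="integer" min="0" max="1000" seed="1"/>
        </values>
    </template>
</templates>
```

Carrying the type on each value would let the bounds be parsed once, at load time, into the right data type instead of being kept as strings.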


bpkroth commented Jan 24, 2024

Looking pretty good. Just a few cosmetic polish things left:

  • document, or improve the handling of, the timestamps-as-longs issue in the configs
  • enums instead of static strings
  • expanded test coverage for a few cases

@bpkroth bpkroth left a comment

LGTM. Thanks!

@bpkroth bpkroth enabled auto-merge (squash) March 25, 2024 16:16
@bpkroth bpkroth merged commit cc2cfa5 into cmu-db:main Mar 25, 2024
130 checks passed