Many statistical tests, like Student's t-test or Z-test, don't need granular data…

**tea-tasting** assumes that:

- Data is grouped by randomization units, such as individual users.
- There is a column indicating the variant of the A/B test (typically labeled as A, B, etc.).
- All necessary columns for metric calculations (like the number of orders, revenue, etc.) are included in the table.
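
For example, the sample dataset used later in this guide follows this layout. A minimal sketch, assuming that `make_users_data` returns one row per user with a `variant` column and metric columns such as `sessions`, `orders`, and `revenue`:

```python
import tea_tasting as tt


# One row per randomization unit (user), with a "variant" column
# and per-user metric values such as "sessions", "orders", and "revenue".
data = tt.make_users_data(seed=42)
print(data)
```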

### A/B test definition

The [`Experiment`](api/experiment.md) class defines the parameters of an A/B test: metrics and a variant column name. There are two ways to define metrics:

- Using keyword parameters, with metric names as parameter names and metric definitions as parameter values, as in the example above.
- Using the first argument `metrics`, which accepts metrics in the form of a dictionary with metric names as keys and metric definitions as values.

By default, **tea-tasting** assumes that the A/B test variant is stored in a column named `"variant"`. You can change it using the `variant` parameter of the `Experiment` class.

Example usage:
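
A minimal sketch of an experiment definition; the metric names (`orders_per_user`, `revenue_per_user`) and value columns are illustrative:

```python
import tea_tasting as tt


# Metrics defined as keyword parameters.
experiment = tt.Experiment(
    orders_per_user=tt.Mean("orders"),
    revenue_per_user=tt.Mean("revenue"),
)

# An equivalent definition using the first argument `metrics`,
# a dict of metric names and metric definitions,
# with an explicit variant column name.
experiment = tt.Experiment(
    {
        "orders_per_user": tt.Mean("orders"),
        "revenue_per_user": tt.Mean("revenue"),
    },
    variant="variant",
)
```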

Metrics are instances of metric classes which define how metrics are calculated.

Use the [`Mean`](api/metrics/mean.md#tea_tasting.metrics.mean.Mean) class to compare averages between variants of an A/B test. For example, the average number of orders per user, where the user is the randomization unit of the experiment. Specify the column containing the metric values using the first parameter, `value`.

Use the [`RatioOfMeans`](api/metrics/mean.md#tea_tasting.metrics.mean.RatioOfMeans) class to compare ratios of averages between variants of an A/B test. For example, the ratio of the average number of orders to the average number of sessions. Specify the columns containing the numerator and denominator values using the parameters `numer` and `denom`.

Use the following parameters of `Mean` and `RatioOfMeans` to customize the analysis:
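
As an illustration, here is a sketch customizing a few of these parameters (`alternative`, `equal_var`, and `use_t`, which also appear in the power analysis section below); the values shown are arbitrary, and the full parameter list is in the [`Mean`](api/metrics/mean.md#tea_tasting.metrics.mean.Mean) and [`RatioOfMeans`](api/metrics/mean.md#tea_tasting.metrics.mean.RatioOfMeans) reference:

```python
orders_per_user = tt.Mean(
    "orders",
    alternative="greater",  # one-sided alternative hypothesis (assumed value)
    equal_var=True,  # pooled-variance test instead of Welch's
    use_t=False,  # normal approximation instead of Student's t
)

# The same parameters apply to RatioOfMeans.
orders_per_session = tt.RatioOfMeans("orders", "sessions", use_t=False)
```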


Look for other supported metrics in the [Metrics](api/metrics/index.md) reference.

You can change the default values of these parameters using the [global settings](#global-settings).

### Analyzing and retrieving experiment results

After defining an experiment and metrics, you can analyze the experiment data using the `analyze` method:
```python
result = experiment.analyze(data)
```

By default, **tea-tasting** assumes that the variant with the lowest ID is the control. Change the default behavior using the `control` parameter:

```python
result = experiment.analyze(data, control=0)
```

```python
print(result["orders_per_user"])
#> ...
#> statistic=1.5647028839586694)
```

The fields in the result depend on the metric. For `Mean` and `RatioOfMeans`, the [fields include](api/metrics/mean.md#tea_tasting.metrics.mean.MeanResult):

- `metric`: Metric name.
- `control`: Mean or ratio of means in the control variant.
The [result](api/metrics/proportion.md#tea_tasting.metrics.proportion.SampleRatioResult) of the sample ratio check includes, among others, the following fields:
- `treatment`: Number of observations in treatment.
- `pvalue`: P-value.

### Power analysis

In **tea-tasting**, you can analyze statistical power for `Mean` and `RatioOfMeans` metrics. There are three possible options:

- Calculate the effect size, given statistical power and the total number of observations.
- Calculate the total number of observations, given statistical power and the effect size.
- Calculate statistical power, given the effect size and the total number of observations.

In the following example, **tea-tasting** calculates statistical power given the relative effect size and the number of observations:

```python
import tea_tasting as tt


data = tt.make_users_data(
    seed=42,
    sessions_uplift=0,
    orders_uplift=0,
    revenue_uplift=0,
    covariates=True,
)

orders_per_session = tt.RatioOfMeans("orders", "sessions", rel_effect_size=0.1)
print(orders_per_session.solve_power(data, "power"))
#> power effect_size rel_effect_size n_obs
#> 52% 0.0261 10% 4000
```

Besides `alternative`, `equal_var`, `use_t`, and covariates (CUPED), the following metric parameters impact the result:

- `alpha`: Significance level.
- `ratio`: Ratio of the number of observations in the treatment relative to the control.
- `power`: Statistical power.
- `effect_size` and `rel_effect_size`: Absolute and relative effect size. Only one of them can be defined.
- `n_obs`: Number of observations in the control and in the treatment together. If the number of observations is not set explicitly, it's inferred from the dataset.

You can change the default values of `alpha`, `ratio`, `power`, and `n_obs` using the [global settings](#global-settings).

**tea-tasting** can analyze power for several values of the parameters `effect_size`, `rel_effect_size`, or `n_obs`. Example:

```python
orders_per_user = tt.Mean("orders", alpha=0.1, power=0.7, n_obs=(10_000, 20_000))
print(orders_per_user.solve_power(data, "rel_effect_size"))
#> power effect_size rel_effect_size n_obs
#> 70% 0.0367 7.1% 10000
#> 70% 0.0260 5.0% 20000
```
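
The third option, solving for the number of observations, presumably works the same way. A sketch, continuing with `data` from the example above and assuming that `solve_power` accepts `"n_obs"` as the name of the parameter to solve for:

```python
orders_per_user = tt.Mean("orders", rel_effect_size=0.05)
print(orders_per_user.solve_power(data, "n_obs"))
```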

You can analyze power for all metrics in the experiment. Example:

```python
with tt.config_context(n_obs=(10_000, 20_000)):
    experiment = tt.Experiment(
        sessions_per_user=tt.Mean("sessions", "sessions_covariate"),
        orders_per_session=tt.RatioOfMeans(
            numer="orders",
            denom="sessions",
            numer_covariate="orders_covariate",
            denom_covariate="sessions_covariate",
        ),
        orders_per_user=tt.Mean("orders", "orders_covariate"),
        revenue_per_user=tt.Mean("revenue", "revenue_covariate"),
    )

power_result = experiment.solve_power(data)
print(power_result)
#> metric power effect_size rel_effect_size n_obs
#> sessions_per_user 80% 0.0458 2.3% 10000
#> sessions_per_user 80% 0.0324 1.6% 20000
#> orders_per_session 80% 0.0177 6.8% 10000
#> orders_per_session 80% 0.0125 4.8% 20000
#> orders_per_user 80% 0.0374 7.2% 10000
#> orders_per_user 80% 0.0264 5.1% 20000
#> revenue_per_user 80% 0.488 9.2% 10000
#> revenue_per_user 80% 0.345 6.5% 20000
```

In the example above, **tea-tasting** calculates the relative and absolute effect size for all metrics for two possible sample size values, `10_000` and `20_000`.

The `solve_power` methods of a [metric](api/metrics/mean.md#tea_tasting.metrics.mean.Mean.solve_power) and of an [experiment](api/experiment.md#tea_tasting.experiment.Experiment.solve_power) return instances of [`MeanPowerResult`](api/metrics/mean.md#tea_tasting.metrics.mean.MeanPowerResult) and [`ExperimentPowerResult`](api/experiment.md#tea_tasting.experiment.ExperimentPowerResult), respectively. These result classes provide serialization methods similar to those of the experiment result: `to_dicts`, `to_pandas`, `to_pretty`, `to_string`, `to_html`.
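
For instance, a sketch building on `power_result` from the example above, assuming that `to_pandas` returns a pandas DataFrame:

```python
# Render the power analysis result as a plain-text table.
print(power_result.to_string())

# Convert it to a pandas DataFrame for further processing.
power_df = power_result.to_pandas()
```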

### Global settings

In **tea-tasting**, you can change defaults for the following parameters:
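
As an illustration, a sketch that temporarily overrides a couple of defaults with `tt.config_context`, the context manager used in the power analysis example above; the parameters `alpha` and `power` are referenced in that section, and other parameters follow the same pattern:

```python
import tea_tasting as tt


# Overridden defaults apply to metrics defined inside the context,
# as in the power analysis example above.
with tt.config_context(alpha=0.1, power=0.7):
    orders_per_user = tt.Mean("orders")
```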
