Integrate CRUD statistics with metrics rock
If `metrics` [1] is found, you can use metrics collectors to store
statistics. `metrics >= 0.10.0` is required to use the metrics driver.
(`metrics >= 0.9.0` is required to use summary quantiles with
age buckets. `metrics >= 0.5.0, < 0.9.0` is unsupported
due to a quantile overflow bug [2]. `metrics == 0.9.0` has a bug that
does not permit creating a summary collector without quantiles [3].
In fact, a user may use `metrics >= 0.5.0`, `metrics != 0.9.0`
to get metrics without quantiles, and `metrics >= 0.9.0`
to get metrics with quantiles. But this is confusing,
so let's use a single restriction for both cases.)
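
The single restriction above boils down to one version comparison. A minimal sketch in plain Lua, assuming the installed version is available as a `major.minor.patch` string; `parse_version`, `version_at_least` and `is_metrics_supported` are hypothetical helper names for illustration, not part of `crud` or `metrics`.

```lua
-- Hypothetical helpers for illustration; not part of crud or metrics.

-- Split "major.minor.patch" into three numbers; returns nil on malformed input.
local function parse_version(version)
    local major, minor, patch = string.match(version, '^(%d+)%.(%d+)%.(%d+)')
    if major == nil then
        return nil
    end
    return { tonumber(major), tonumber(minor), tonumber(patch) }
end

-- True if `version` is greater than or equal to `minimum`.
local function version_at_least(version, minimum)
    local v, m = parse_version(version), parse_version(minimum)
    if v == nil or m == nil then
        return false
    end
    for i = 1, 3 do
        if v[i] ~= m[i] then
            return v[i] > m[i]
        end
    end
    return true
end

-- The single restriction from the commit message: metrics >= 0.10.0.
local function is_metrics_supported(version)
    return version_at_least(version, '0.10.0')
end

print(is_metrics_supported('0.9.0'))   -- false
print(is_metrics_supported('0.10.0'))  -- true
print(is_metrics_supported('0.12.0'))  -- true
```

Comparing components numerically matters here: a plain string comparison would rank `0.9.0` above `0.10.0`.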

The metrics are part of the global registry and can be exported together
(e.g. to Prometheus) with default tools without any additional
configuration. Disabling stats destroys the collectors.

Metrics collectors are used by default if supported. To explicitly set
the driver, call `crud.enable_stats{ driver = driver }` ('local' or
'metrics'). To enable quantiles, call
`crud.enable_stats{ driver = 'metrics', quantiles = true }`.
With quantiles, the `latency` statistic becomes the 0.99 quantile
of request execution time (with aging). Computing quantiles adds
roughly 10% of performance overhead when they are used in statistics.
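
The option rules described above (default driver, allowed driver names, quantiles only with `metrics`) can be sketched as a small validation function. This is an illustration in plain Lua under stated assumptions, not the actual `crud` implementation: `resolve_stats_opts` and its `metrics_supported` argument are hypothetical names.

```lua
-- Hypothetical sketch of option handling; not the actual crud code.
-- `metrics_supported` is assumed to be computed elsewhere (true if a
-- suitable metrics version is installed).
local function resolve_stats_opts(opts, metrics_supported)
    opts = opts or {}

    -- Metrics collectors are the default when they are supported.
    local driver = opts.driver
    if driver == nil then
        driver = metrics_supported and 'metrics' or 'local'
    end

    if driver ~= 'local' and driver ~= 'metrics' then
        error(("Unknown driver %q"):format(tostring(driver)))
    end
    if driver == 'metrics' and not metrics_supported then
        error("Driver 'metrics' is not supported")
    end

    -- Quantiles are available only with the metrics driver.
    local quantiles = opts.quantiles == true
    if quantiles and driver == 'local' then
        error("Quantiles are not supported for 'local' statistics registry")
    end

    return { driver = driver, quantiles = quantiles }
end

-- Usage: defaults, explicit quantiles, and a rejected combination.
local defaults = resolve_stats_opts(nil, true)
assert(defaults.driver == 'metrics' and defaults.quantiles == false)

local with_quantiles = resolve_stats_opts({ driver = 'metrics', quantiles = true }, true)
assert(with_quantiles.quantiles == true)

local ok = pcall(resolve_stats_opts, { driver = 'local', quantiles = true }, true)
assert(ok == false)
```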

Add CI matrix to run tests with `metrics` installed. To get full
coverage on coveralls, #248 must be resolved.

1. https://github.com/tarantool/metrics
2. tarantool/metrics#235
3. tarantool/metrics#262

Closes #224
DifferentialOrange committed Jan 28, 2022
1 parent 620994a commit bc7ef2b
Showing 9 changed files with 1,234 additions and 150 deletions.
21 changes: 20 additions & 1 deletion .github/workflows/test_on_push.yaml
@@ -13,13 +13,24 @@ jobs:
matrix:
# We need 1.10.6 here to check that module works with
# old Tarantool versions that don't have "tuple-keydef"/"tuple-merger" support.
tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7"]
tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8"]
metrics-version: [""]
remove-merger: [false]
perf-test: [false]
include:
- tarantool-version: "1.10"
metrics-version: "0.12.0"
perf-test: true
- tarantool-version: "2.7"
remove-merger: true
- tarantool-version: "2.8"
metrics-version: "0.1.8"
- tarantool-version: "2.8"
metrics-version: "0.10.0"
- tarantool-version: "2.8"
coveralls: true
metrics-version: "0.12.0"
perf-test: true
fail-fast: false
runs-on: [ubuntu-latest]
steps:
@@ -47,6 +58,10 @@ jobs:
tarantool --version
./deps.sh
- name: Install metrics
if: matrix.metrics-version != ''
run: tarantoolctl rocks install metrics ${{ matrix.metrics-version }}

- name: Remove external merger if needed
if: ${{ matrix.remove-merger }}
run: rm .rocks/lib/tarantool/tuple/merger.so
@@ -62,6 +77,10 @@ jobs:
- name: Run tests and code coverage analysis
run: make -C build coverage

- name: Run performance tests
run: make -C build performance
if: ${{ matrix.perf-test }}

- name: Send code coverage to coveralls.io
run: make -C build coveralls
if: ${{ matrix.coveralls }}
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

### Added
* Statistics for CRUD operations on router (#224).
* Integrate CRUD statistics with [`metrics`](https://github.com/tarantool/metrics) (#224).

### Changed

8 changes: 8 additions & 0 deletions CMakeLists.txt
@@ -36,6 +36,14 @@ add_custom_target(luatest
COMMENT "Run regression tests"
)

set(PERFORMANCE_TESTS_SUBDIR "test/performance")

add_custom_target(performance
COMMAND PERF_MODE_ON=true ${LUATEST} -v -c ${PERFORMANCE_TESTS_SUBDIR}
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMENT "Run performance tests"
)

add_custom_target(coverage
COMMAND ${LUACOV} ${PROJECT_SOURCE_DIR} && grep -A999 '^Summary' ${CODE_COVERAGE_REPORT}
DEPENDS ${CODE_COVERAGE_STATS}
63 changes: 61 additions & 2 deletions README.md
@@ -606,8 +606,24 @@ crud.disable_stats()
crud.reset_stats()
```

Format is as follows.
If [`metrics`](https://github.com/tarantool/metrics) `0.10.0` or greater
is found, metrics collectors are used by default to store statistics
instead of local collectors. You can choose the driver manually if needed.
```lua
-- Use metrics collectors. (Default if metrics found).
crud.enable_stats({ driver = 'metrics' })

-- Use metrics collectors with 0.99 quantile.
crud.enable_stats({ driver = 'metrics', quantiles = true })

-- Use simple local collectors.
crud.enable_stats({ driver = 'local' })
```
Performance overhead is 3-7% in case of `local` driver and
5-10% in case of `metrics` driver, up to 20% for `metrics` with quantiles.

Format is as follows.
```
crud.stats()
---
- spaces:
@@ -657,9 +673,44 @@ Possible statistics operation labels are
Each operation section consists of different collectors
for successful calls and error (both error throw and `nil, err`)
returns. `count` is total requests count since instance start
or stats restart. `latency` is average time of requests execution,
or stats restart. `latency` is 0.99 quantile of request execution
time if `metrics` driver used and quantiles enabled,
otherwise `latency` is total average.
`time` is total time of requests execution.

In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
registry statistics are stored as `tnt_crud_stats` metrics
with `operation`, `status` and `name` labels. Collector
`tnt_crud_space_not_found` stores count of calls to unknown spaces,
`tnt_crud_schema_reloads` stores count of schema reloads in calls.
```
metrics:collect()
---
- - label_pairs:
status: ok
operation: insert
name: customers
value: 221411
metric_name: tnt_crud_stats_count
- label_pairs:
status: ok
operation: insert
name: customers
value: 10.49834896344692
metric_name: tnt_crud_stats_sum
- label_pairs:
status: ok
operation: insert
name: customers
quantile: 0.99
value: 0.00023606420935973
metric_name: tnt_crud_stats
- label_pairs: []
value: 3
metric_name: tnt_crud_space_not_found
...
```
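The `_count` and `_sum` observations are enough to recover the total average latency, which can then be compared with the 0.99 quantile. A minimal sketch in plain Lua over a hand-copied subset of the output above; `find_value` and `labels_match` are hypothetical helpers, and the table only mimics the shape of the collected observations.

```lua
-- Sample observations copied from the output above (illustrative subset).
local observations = {
    { metric_name = 'tnt_crud_stats_count', value = 221411,
      label_pairs = { status = 'ok', operation = 'insert', name = 'customers' } },
    { metric_name = 'tnt_crud_stats_sum', value = 10.49834896344692,
      label_pairs = { status = 'ok', operation = 'insert', name = 'customers' } },
    { metric_name = 'tnt_crud_stats', value = 0.00023606420935973,
      label_pairs = { status = 'ok', operation = 'insert', name = 'customers',
                      quantile = 0.99 } },
}

-- True if every wanted label is present with the same value.
local function labels_match(label_pairs, wanted)
    for key, value in pairs(wanted) do
        if label_pairs[key] ~= value then
            return false
        end
    end
    return true
end

-- Return the value of the first observation with the given name and labels.
local function find_value(obs_list, metric_name, wanted)
    for _, obs in ipairs(obs_list) do
        if obs.metric_name == metric_name
        and labels_match(obs.label_pairs, wanted) then
            return obs.value
        end
    end
    return nil
end

local labels = { status = 'ok', operation = 'insert', name = 'customers' }
local count = find_value(observations, 'tnt_crud_stats_count', labels)
local sum = find_value(observations, 'tnt_crud_stats_sum', labels)
local p99 = find_value(observations, 'tnt_crud_stats', labels)

-- Total average is the accumulated time divided by the number of requests.
local average = sum / count
print(average, p99)
```

For this sample the average comes out around 4.7e-05, several times lower than the 0.99 quantile, which is why the two meanings of `latency` should not be compared directly.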

`select` section additionally contains `details` collectors.
```lua
crud.stats('my_space').select.details
@@ -674,9 +725,17 @@ crud.stats('my_space').select.details
is a count of tuples fetched from storages during execution,
`tuples_lookup` is a count of tuples looked up on storages
while collecting response for call.
In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
registry they are stored as `tnt_crud_map_reduces`,
`tnt_crud_tuples_fetched` and `tnt_crud_tuples_lookup` metrics
with `{ operation = 'select', name = space_name }` labels.

Statistics are preserved between package reloads or [Tarantool Cartridge
role reloads](https://www.tarantool.io/en/doc/latest/book/cartridge/cartridge_api/modules/cartridge.roles/#reload).
Beware that `metrics` 0.12.0 and below do not support
preserving stats across role reloads
(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
so this feature is unsupported for the `metrics` driver.

## Cartridge roles

17 changes: 15 additions & 2 deletions crud/stats/local_registry.lua
@@ -2,13 +2,16 @@
-- @module crud.stats.local_registry
--

local errors = require('errors')

local dev_checks = require('crud.common.dev_checks')
local op_module = require('crud.stats.operation')
local registry_common = require('crud.stats.registry_common')
local stash = require('crud.stats.stash')

local registry = {}
local internal = stash.get('local_registry')
local StatsLocalError = errors.new_class('StatsLocalError', {capture_stack = false})

--- Initialize local metrics registry.
--
@@ -17,9 +20,19 @@ local internal = stash.get('local_registry')
--
-- @function init
--
-- @treturn boolean Returns true.
-- @tab opts
--
function registry.init()
-- @bool opts.quantiles
-- Quantiles is not supported for local, only `false` is valid.
--
-- @treturn boolean Returns `true`.
--
function registry.init(opts)
dev_checks({ quantiles = 'boolean' })

StatsLocalError:assert(opts.quantiles == false,
"Quantiles are not supported for 'local' statistics registry")

internal.registry = {}
internal.registry.spaces = {}
internal.registry.space_not_found = 0