Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vparquet4 encoding #3368

Merged
merged 38 commits into from
May 7, 2024
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
444a6fc
Copy vparquet3 to vparquet4 folder
stoewer Feb 6, 2024
31140a4
Create initial vParquet4 encoding
stoewer Feb 6, 2024
9775501
vParquet4: handle unsupported / dropped attributes (#3123)
stoewer Feb 6, 2024
41dc8eb
Store per-trace service statistics in vparquet4 blocks (#2941)
andreasgerstmayr Feb 6, 2024
30e79e1
Add support for attributes with array values (#3221)
stoewer Feb 6, 2024
794fb93
links and events schema changes (#3163)
ie-pham Feb 6, 2024
23a260e
Add attributes to instrumentation scope (#3322)
stoewer Feb 6, 2024
a2e10bb
Fix typo in vParquet4 event name enconding (#3336)
stoewer Feb 6, 2024
9024d10
Precalculate and reuse the vParquet4 schema before opening blocks
stoewer Feb 6, 2024
06521ed
Address linter warnings
stoewer Feb 6, 2024
1f46601
Convert block in vparquet4/test-data directory
stoewer Feb 6, 2024
580d50f
CHANGELOG.md
stoewer Feb 7, 2024
16432c9
Update test cases in BenchmarkBackendBlockTraceQL
stoewer Feb 8, 2024
156ba47
Unsupported attribute values are no longer dropped
stoewer Feb 14, 2024
1f6708b
Merge remote-tracking branch 'upstream/main' into add-vparquet4-encoding
stoewer Feb 15, 2024
8ec97c9
Backport: Update traceql metrics to use the trace-level timestamp col…
stoewer Feb 15, 2024
91b1236
Add ServiceStats data to existing TraceQL engine tests (#2)
andreasgerstmayr Feb 16, 2024
9393edf
Fix combining service stats by using max() (#1)
andreasgerstmayr Feb 16, 2024
d79cea6
Recalculate ServiceStats in combiner by moving it to separate functio…
andreasgerstmayr Apr 10, 2024
9ab5514
Move serviceStats iterator to second pass (#4)
andreasgerstmayr Apr 10, 2024
9bef5e8
Merge remote-tracking branch 'upstream/main' into add-vparquet4-encoding
stoewer Apr 15, 2024
c89e606
Backport: Better trace ID sharding (#3399)
stoewer Apr 16, 2024
9b81152
Backport: Add string interning to SyncIterator (#3411)
stoewer Apr 16, 2024
66a54e8
Backport: Traceql better pooling (#3432)
stoewer Apr 16, 2024
ff97bb8
Backport: New (unsafe) query hints (#3396)
stoewer Apr 16, 2024
dc10143
Backport: Redo 8-byte trace ID sharding to reduce number of row groups
stoewer Apr 16, 2024
0587418
Backport: TraceQL nested set intrinsics (#3497)
stoewer Apr 16, 2024
ad936e3
Backport: Parquetquery iterator less pooling (#3508)
stoewer Apr 16, 2024
665b765
Remove searchProgress handler
stoewer Apr 16, 2024
3627f3d
Fix tests: Unset service stats in TraceSearchMeta
stoewer Apr 16, 2024
0c139e9
Backport: Fix pool leak when iter exhausted and async pages for bette…
stoewer Apr 19, 2024
19dbeab
make ServiceStats an intrinsic (#5)
andreasgerstmayr Apr 28, 2024
2ef85b3
Replace ValueType column with IsArray column
stoewer Apr 18, 2024
d290290
Skip service stats map allocation when none are present
stoewer Apr 30, 2024
a8abcb5
Fix performance regression in existing parquet encodings
stoewer May 3, 2024
8108e03
Merge remote-tracking branch 'upstream/main' into add-vparquet4-encoding
stoewer May 6, 2024
393b2a7
Merge remote-tracking branch 'upstream/main' into add-vparquet4-encoding
stoewer May 6, 2024
8c71ae5
Use same queries in BackendBlockTraceQL
stoewer May 6, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .golangci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,8 @@ linters-settings:
exclude: ./.errcheck-exclude.txt
errorlint:
errorf: false # don't check whether fmt.Errorf uses %w to wrap errors
goconst:
ignore-tests: true
joe-elliott marked this conversation as resolved.
Show resolved Hide resolved

issues:
exclude:
Expand Down
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
## main / unreleased

* [ENHANCEMENT] Add tracing integration to profiling endpoints [#3276](https://github.com/grafana/tempo/pull/3276) (@cyriltovena)
* [FEATURE] New block encoding vParquet4 with support for links, events, and arrays [#3368](https://github.com/grafana/tempo/pull/3368) (@stoewer @ie-pham @andreasgerstmayr)
joe-elliott marked this conversation as resolved.
Show resolved Hide resolved
* [FEATURE] Add configuration on tempo-query plugin for fetch services older than complete_block_timeout [#3262](https://github.com/grafana/tempo/pull/3350) (@rubenvp8510)
* [FEATURE] Add support for multi-tenant queries in streaming search [#3262](https://github.com/grafana/tempo/pull/3262) (@electron0zero)
* [FEATURE] TraceQL metrics queries [#3227](https://github.com/grafana/tempo/pull/3227) [#3252](https://github.com/grafana/tempo/pull/3252) [#3258](https://github.com/grafana/tempo/pull/3258) (@mdisibio @zalegrala)
Expand Down
11 changes: 11 additions & 0 deletions modules/frontend/search_progress.go
Original file line number Diff line number Diff line change
Expand Up @@ -159,6 +159,17 @@ func (r *searchProgress) result() *shardedSearchResults {
SpanSet: copySpanset(m.SpanSet),
}

// copy service stats
if len(m.ServiceStats) > 0 {
joe-elliott marked this conversation as resolved.
Show resolved Hide resolved
mCopy.ServiceStats = make(map[string]*tempopb.ServiceStats, len(m.ServiceStats))
for service, stat := range m.ServiceStats {
mCopy.ServiceStats[service] = &tempopb.ServiceStats{
SpanCount: stat.SpanCount,
ErrorCount: stat.ErrorCount,
}
}
}

// now copy spansets
if len(m.SpanSets) > 0 {
mCopy.SpanSets = make([]*tempopb.SpanSet, 0, len(m.SpanSets))
Expand Down
785 changes: 578 additions & 207 deletions pkg/tempopb/tempo.pb.go

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions pkg/tempopb/tempo.proto
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,12 @@ message TraceSearchMetadata {
uint32 durationMs = 5;
SpanSet spanSet = 6; // deprecated. use SpanSets field below
repeated SpanSet spanSets = 7;
map<string, ServiceStats> serviceStats = 8;
}

message ServiceStats {
uint32 spanCount = 1;
uint32 errorCount = 2;
}

message SpanSet {
Expand Down
11 changes: 11 additions & 0 deletions pkg/traceql/combine.go
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,17 @@ func combineSearchResults(existing *tempopb.TraceSearchMetadata, incoming *tempo
existing.DurationMs = incoming.DurationMs
}

// Merge service stats
for service, incomingStats := range incoming.ServiceStats {
existingStats, ok := existing.ServiceStats[service]
stoewer marked this conversation as resolved.
Show resolved Hide resolved
if !ok {
existingStats = &tempopb.ServiceStats{}
existing.ServiceStats[service] = existingStats
}
existingStats.SpanCount += incomingStats.SpanCount
existingStats.ErrorCount += incomingStats.ErrorCount
}

// make a map of existing Spansets
existingSS := make(map[string]*tempopb.SpanSet)
for _, ss := range existing.SpanSets {
Expand Down
8 changes: 8 additions & 0 deletions pkg/traceql/engine.go
Original file line number Diff line number Diff line change
Expand Up @@ -231,9 +231,17 @@ func (e *Engine) asTraceSearchMetadata(spanset *Spanset) *tempopb.TraceSearchMet
RootTraceName: spanset.RootSpanName,
StartTimeUnixNano: spanset.StartTimeUnixNanos,
DurationMs: uint32(spanset.DurationNanos / 1_000_000),
ServiceStats: make(map[string]*tempopb.ServiceStats, len(spanset.ServiceStats)),
SpanSet: &tempopb.SpanSet{},
}

for service, stats := range spanset.ServiceStats {
stoewer marked this conversation as resolved.
Show resolved Hide resolved
metadata.ServiceStats[service] = &tempopb.ServiceStats{
SpanCount: stats.SpanCount,
ErrorCount: stats.ErrorCount,
}
}

for _, span := range spanset.Spans {
tempopbSpan := &tempopb.Span{
SpanID: util.SpanIDToHexString(span.ID()),
Expand Down
2 changes: 2 additions & 0 deletions pkg/traceql/engine_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,7 @@ func TestEngine_Execute(t *testing.T) {
TraceID: "1",
RootServiceName: "my-service",
RootTraceName: "HTTP GET",
ServiceStats: map[string]*tempopb.ServiceStats{},
SpanSet: expectedSpanset,
SpanSets: []*tempopb.SpanSet{expectedSpanset},
},
Expand Down Expand Up @@ -321,6 +322,7 @@ func TestEngine_asTraceSearchMetadata(t *testing.T) {
RootTraceName: "HTTP GET",
StartTimeUnixNano: 1000,
DurationMs: uint32(time.Second.Milliseconds()),
ServiceStats: map[string]*tempopb.ServiceStats{},
SpanSet: expectedSpanset,
SpanSets: []*tempopb.SpanSet{expectedSpanset},
}
Expand Down
6 changes: 6 additions & 0 deletions pkg/traceql/storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,11 @@ type SpansetAttribute struct {
Val Static
}

type ServiceStats struct {
SpanCount uint32
ErrorCount uint32
}

type Spanset struct {
// these fields are actually used by the engine to evaluate queries
Scalar Static
Expand All @@ -141,6 +146,7 @@ type Spanset struct {
RootServiceName string
StartTimeUnixNanos uint64
DurationNanos uint64
ServiceStats map[string]ServiceStats
Attributes []*SpansetAttribute

// Set this function to provide upstream callers with a method to
Expand Down
6 changes: 5 additions & 1 deletion tempodb/encoding/versioned.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ import (
"github.com/grafana/tempo/tempodb/encoding/vparquet"
"github.com/grafana/tempo/tempodb/encoding/vparquet2"
"github.com/grafana/tempo/tempodb/encoding/vparquet3"
"github.com/grafana/tempo/tempodb/encoding/vparquet4"
)

// VersionedEncoding represents a backend block version, and the methods to
Expand Down Expand Up @@ -66,6 +67,8 @@ func FromVersion(v string) (VersionedEncoding, error) {
return vparquet2.Encoding{}, nil
case vparquet3.VersionString:
return vparquet3.Encoding{}, nil
case vparquet4.VersionString:
return vparquet4.Encoding{}, nil
default:
return nil, fmt.Errorf("%s is not a valid block version", v)
}
Expand All @@ -78,7 +81,7 @@ func DefaultEncoding() VersionedEncoding {

// LatestEncoding returns the most recent encoding.
func LatestEncoding() VersionedEncoding {
return vparquet3.Encoding{}
return vparquet4.Encoding{}
joe-elliott marked this conversation as resolved.
Show resolved Hide resolved
}

// AllEncodings returns all encodings
Expand All @@ -88,6 +91,7 @@ func AllEncodings() []VersionedEncoding {
vparquet.Encoding{},
vparquet2.Encoding{},
vparquet3.Encoding{},
vparquet4.Encoding{},
}
}

Expand Down
24 changes: 14 additions & 10 deletions tempodb/encoding/vparquet3/block_traceql_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -564,22 +564,26 @@ func BenchmarkBackendBlockTraceQL(b *testing.B) {
query string
}{
// span
{"spanAttValNoMatch", "{ span.bloom = `bar` }"},
{"spanAttIntrinsicNoMatch", "{ name = `asdfasdf` }"},
{"spanAttValMatch", "{ span.component = `net/http` }"},
{"spanAttValNoMatch", "{ span.bloom = `does-not-exit-6c2408325a45` }"},
{"spanAttIntrinsicMatch", "{ name = `/cortex.Ingester/Push` }"},
{"spanAttIntrinsicNoMatch", "{ name = `does-not-exit-6c2408325a45` }"},

// resource
{"resourceAttValNoMatch", "{ resource.module.path = `bar` }"},
{"resourceAttIntrinsicMatch", "{ resource.service.name = `tempo-query-frontend` }"},
{"resourceAttValMatch", "{ resource.opencensus.exporterversion = `Jaeger-Go-2.30.0` }"},
{"resourceAttValNoMatch", "{ resource.module.path = `does-not-exit-6c2408325a45` }"},
{"resourceAttIntrinsicMatch", "{ resource.service.name = `tempo-gateway` }"},
{"resourceAttIntrinsicMatch", "{ resource.service.name = `does-not-exit-6c2408325a45` }"},

// mixed
{"mixedValNoMatch", "{ .bloom = `bar` }"},
{"mixedValNoMatch", "{ .bloom = `does-not-exit-6c2408325a45` }"},
{"mixedValMixedMatchAnd", "{ resource.foo = `bar` && name = `gcs.ReadRange` }"},
{"mixedValMixedMatchOr", "{ resource.foo = `bar` || name = `gcs.ReadRange` }"},

{"count", "{ } | count() > 1"},
{"struct", "{ resource.service.name != `loki-querier` } >> { resource.service.name = `loki-querier` && status = error }"},
{"||", "{ resource.service.name = `loki-querier` } || { resource.service.name = `loki-ingester` }"},
{"mixed", `{resource.namespace!="" && resource.service.name="loki-distributor" && duration>2s && resource.cluster=~"prod.*"}`},
{"struct", "{ resource.service.name != `loki-querier` } >> { resource.service.name = `loki-gateway` && status = error }"},
{"||", "{ resource.service.name = `loki-querier` } || { resource.service.name = `loki-gateway` }"},
{"mixed", `{resource.namespace!="" && resource.service.name="cortex-gateway" && duration>50ms && resource.cluster=~"prod.*"}`},
}

ctx := context.TODO()
Expand All @@ -598,8 +602,8 @@ func BenchmarkBackendBlockTraceQL(b *testing.B) {
require.NoError(b, err)

opts := common.DefaultSearchOptions()
opts.StartPage = 10
opts.TotalPages = 1
opts.StartPage = 3
opts.TotalPages = 2

block := newBackendBlock(meta, rr)
_, _, err = block.openForSearch(ctx, opts)
Expand Down
32 changes: 32 additions & 0 deletions tempodb/encoding/vparquet4/block.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
package vparquet4

import (
"sync"

"github.com/grafana/tempo/tempodb/backend"
"github.com/grafana/tempo/tempodb/encoding/common"
)

const (
DataFileName = "data.parquet"
)

type backendBlock struct {
meta *backend.BlockMeta
r backend.Reader

openMtx sync.Mutex
}

var _ common.BackendBlock = (*backendBlock)(nil)

func newBackendBlock(meta *backend.BlockMeta, r backend.Reader) *backendBlock {
return &backendBlock{
meta: meta,
r: r,
}
}

func (b *backendBlock) BlockMeta() *backend.BlockMeta {
return b.meta
}
Loading
Loading