
Implementation of span2 format #1644

Closed
20 of 22 tasks
codefromthecrypt opened this issue Jul 4, 2017 · 14 comments
Labels: enhancement, model (Modeling of traces)

Comments


codefromthecrypt commented Jul 4, 2017

This is the implementation of #1499. Not all steps are sequential.

Overview

This work supports a new data type which represents a single-host view of an operation. It will be represented in JSON, with an optional proto3 mapping.

Full description is in #1499, but the format is below:

{
  "traceId": 32lowerHexCharacters,
  "parentId": 16lowerHexCharactersOrAbsentForRoot,
  "id": 16lowerHexCharacters,
  "kind": enum(CLIENT|SERVER|PRODUCER|CONSUMER|Absent),
  "name": stringOrAbsent,
  "timestamp": uint53EpochMicrosOrAbsentIfIncomplete,
  "duration": uint53MicrosOrAbsentIfIncomplete,
  "localEndpoint": existingEndpointTypeOrAbsent,
  "remoteEndpoint": existingEndpointTypeOrAbsent,
  "annotations": [
    {"timestamp": uint53EpochMicros, "value": string},
    ...
  ],
  "tags": {
    string: string,
    ...
  },
  "debug": trueOrAbsent,
  "shared": trueOrAbsent
}
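For concreteness, here is a minimal sketch of a span that satisfies the constraints above. This is illustrative Python only; `check_span2` is a hypothetical helper, not part of any Zipkin library.

```python
import re

# Hypothetical helper that spot-checks a few of the structural rules above.
# Illustrative only; not part of any Zipkin library.
def check_span2(span):
    assert re.fullmatch(r"[0-9a-f]{32}", span["traceId"])      # 32 lower-hex chars
    assert re.fullmatch(r"[0-9a-f]{16}", span["id"])           # 16 lower-hex chars
    if "parentId" in span:                                     # absent for a root span
        assert re.fullmatch(r"[0-9a-f]{16}", span["parentId"])
    if "kind" in span:                                         # absent when unknown
        assert span["kind"] in {"CLIENT", "SERVER", "PRODUCER", "CONSUMER"}
    for a in span.get("annotations", []):
        assert isinstance(a["timestamp"], int) and isinstance(a["value"], str)
    for k, v in span.get("tags", {}).items():
        assert isinstance(k, str) and isinstance(v, str)
    return True

root_span = {
    "traceId": "86154a4ba6e913854615f4a4ba6e9138",
    "id": "4615f4a4ba6e9138",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1499155200000000,  # uint53 epoch micros
    "duration": 207000,             # micros
    "localEndpoint": {"serviceName": "frontend", "ipv4": "10.0.0.1"},
    "tags": {"http.path": "/api"},
}
check_span2(root_span)
```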

Implementation overview

As we are working in a brownfield, the various phases can proceed at their own pace. Tracer libraries should be able to opt into this without a breaking change to user APIs. Any storage or transport work should anticipate mixed formats, but allow the ability to strictly accept only the new one. The UI is completely decoupled from this, as all read APIs still use the old format. We may revisit that later.

Basic library support

Since the server code is written in Java, a first step could be to write the Java library.

Upload Api Definition

While new servers can detect the format based on content, we should make it possible for new sites to strictly use the new type.

Storage integration

Once the type + codec library is in, we can start using it on the backend even if the incoming data is in the old format.

Transport integration

Once the type + codec library is in, we can start using it for transport even if the storage layer uses the old model.

Dependency linking integration

The Span2 type is a better fit for dependency links and can replace the internal type currently used.
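As a sketch of why it is a better fit: because each span2 record carries its own localEndpoint, a caller-to-callee service edge can be read from a child span and its parent directly, with no client/server annotation merging first. The following is hypothetical Python, not Zipkin's actual linker.

```python
from collections import Counter

# Hypothetical sketch of dependency linking over span2 records.
# Each span names its own local service, so a parent->child edge falls
# out of a simple id lookup. Not Zipkin's actual linker.
def dependency_links(spans):
    by_id = {s["id"]: s for s in spans}
    links = Counter()
    for s in spans:
        parent = by_id.get(s.get("parentId"))
        if parent is None:
            continue  # root span, or parent not in this batch
        caller = parent.get("localEndpoint", {}).get("serviceName")
        callee = s.get("localEndpoint", {}).get("serviceName")
        if caller and callee and caller != callee:
            links[(caller, callee)] += 1
    return dict(links)

spans = [
    {"traceId": "a" * 32, "id": "1" * 16,
     "kind": "SERVER", "localEndpoint": {"serviceName": "frontend"}},
    {"traceId": "a" * 32, "id": "2" * 16, "parentId": "1" * 16,
     "kind": "CLIENT", "localEndpoint": {"serviceName": "frontend"},
     "remoteEndpoint": {"serviceName": "backend"}},
    {"traceId": "a" * 32, "id": "2" * 16, "parentId": "1" * 16,
     "kind": "SERVER", "shared": True,
     "localEndpoint": {"serviceName": "backend"}},
]
```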

@codefromthecrypt (Member Author)

Here's the first work on the Java binding and codec: #1651

codefromthecrypt pushed a commit that referenced this issue Jul 21, 2017
There were tests accidentally assuming merge semantics when testing
unrelated things. This scrubs some of the tests to focus on what they
are testing, and in doing so help pave forward single-host span storage.

See #1644
@codefromthecrypt (Member Author)

Update: all conversion tests work. I'm going to do a POC on Elasticsearch to make sure it works. Once it does, I'll chop up #1651 into several smaller pieces of work.

@codefromthecrypt codefromthecrypt changed the title Implementation of simple json format Implementation of span2 format Jul 27, 2017
@codefromthecrypt (Member Author)

@xeraa quick question: is there an option besides Spark to do a data migration?

For example, I'd like to walk through documents grouped by trace ID, then write them back modified to a different index.


xeraa commented Aug 2, 2017

@adriancole multiple options:

  1. Reindex API: As long as you are using Elasticsearch 5.0+ and can use a query to address the right documents, that's the easiest solution. You can also use scripting to change the documents:
POST _reindex
{
  "source": {
    "index": "source_index",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "destination_index"
  },
  "script": {
    "inline": "if (ctx._source.foo == 'bar') {ctx._version++; ctx._source.remove('foo')}",
    "lang": "painless"
  }
}
  2. If you need something more complex, we normally rely on Logstash with Elasticsearch as both the input and output, and a filter with the selection and transformation in the middle.
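A Logstash pipeline for that shape might look like the sketch below. The index names and the `foo` field mirror the reindex example above and are hypothetical, not tied to Zipkin's actual indexes.

```conf
input {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "source_index"          # hypothetical index name
    query => '{ "query": { "match": { "foo": "bar" } } }'
  }
}
filter {
  # example transformation, matching the reindex script above
  mutate { remove_field => ["foo"] }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "destination_index"     # hypothetical index name
  }
}
```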

@codefromthecrypt (Member Author)

Thanks for the help, @xeraa

codefromthecrypt pushed a commit that referenced this issue Aug 6, 2017
This adds Elasticsearch 6.x support via single-type indexes:

* zipkin:span-2017-08-05 - span2 (single endpoint) format
* zipkin:dependency-2017-08-05 - dependency links in existing format

This indexing model will be available in the next minor release of
Zipkin, particularly for Elasticsearch 2.4+. If you aren't running
Elasticsearch 2.4+ yet, please upgrade.

Those wishing to experiment with this format before the next minor
release can set `ES_EXPERIMENTAL_SPAN2=true` to use this style now.
When set, writes will use the above scheme, but both the former and new
indexes will be read.

Fixes #1676
See #1644 for the new span2 model
See #1679 for the dual-read approach, which this is similar to
codefromthecrypt pushed a commit that referenced this issue Aug 10, 2017
This accepts the JSON format from #1499 on current transports. It does
so by generalizing format detection from the two Kafka libraries, and
adding a new `SpanDecoder` interface. Types are still internal, but this
allows us to proceed with other work in #1644, including implementing
reporters in any language.

Concretely, you can send a JSON list of span2-format spans as a Kafka or
HTTP message. If using HTTP, use the /api/v2/spans endpoint like so:

```bash
$ curl -X POST -s localhost:9411/api/v2/spans -H'Content-Type: application/json' -d'[{
  "timestamp_millis": 1502101460678,
  "traceId": "9032b04972e475c5",
  "id": "9032b04972e475c5",
  "kind": "SERVER",
  "name": "get",
  "timestamp": 1502101460678880,
  "duration": 612898,
  "localEndpoint": {
    "serviceName": "brave-webmvc-example",
    "ipv4": "192.168.1.113"
  },
  "remoteEndpoint": {
    "serviceName": "",
    "ipv4": "127.0.0.1",
    "port": 60149
  },
  "tags": {
    "error": "500 Internal Server Error",
    "http.path": "/a"
  }
}]'
```
@codefromthecrypt (Member Author)

#1700 makes the write API span2-native. I'll make a later pull request for the read side. Notably, I'm not going to carry over the merging/clock-skew adjustment logic currently copy-pasted into every implementation. This simplifies the storage contract, which is raw by default now. The merging/clock-skew adjustment can be done in an API decorator and/or JavaScript.
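The merge in question collapses a client span and a shared server span with the same (traceId, id) into one two-sided view. The sketch below is hypothetical Python to show the semantics being moved out of storage; it is not Zipkin's code, and the precedence rules here are simplified.

```python
# Illustrative sketch of the kind of merge that used to be copy-pasted
# into every storage implementation: a client span and a shared server
# span with the same (traceId, id) collapse into one two-sided view.
def merge_shared(spans):
    merged = {}
    for s in spans:
        key = (s["traceId"], s["id"])
        if key not in merged:
            merged[key] = dict(s)
            continue
        a = merged[key]
        # The span without "shared" is the client side; prefer its timing,
        # then fold the server side's name and tags in.
        client, server = (a, s) if not a.get("shared") else (s, a)
        out = dict(client)
        out.setdefault("name", server.get("name"))
        out["tags"] = {**server.get("tags", {}), **client.get("tags", {})}
        merged[key] = out
    return list(merged.values())

client = {"traceId": "a" * 32, "id": "b" * 16, "kind": "CLIENT",
          "timestamp": 1502101460678000, "duration": 612898,
          "tags": {"http.path": "/a"}}
server = {"traceId": "a" * 32, "id": "b" * 16, "kind": "SERVER",
          "shared": True, "name": "get", "tags": {"error": "500"}}
two_sided = merge_shared([client, server])
```

Keeping storage raw means this logic runs once, in a read-side decorator, instead of in every backend.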

@codefromthecrypt (Member Author)

#1705 hones the way we'll address callbacks (e.g. span consumption) and synchronous calls (e.g. span name lists). This will reduce the effort of implementing v2 interfaces.

@codefromthecrypt (Member Author)

Draft of the v2 Java read API: #1709

@codefromthecrypt (Member Author)

#1711 << Elasticsearch uses the spanstore2 API
#1710 << expose the spanstore2 HTTP API

@codefromthecrypt (Member Author)

#1726 << finishes the v2 model (and the codec operations required). This doesn't make a v2 "storage component" type yet, as that can be done later.

@codefromthecrypt (Member Author)

#1729 << adds the v2 storage component

@codefromthecrypt (Member Author)

openzipkin/zipkin-api#47 starts on the proto3 encoding

@codefromthecrypt (Member Author)

I think we're all done.

@codefromthecrypt codefromthecrypt added model Modeling of traces enhancement labels Oct 23, 2018
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019