[Transform] use ISO dates in output instead of epoch millis #65584

Merged
merged 16 commits into from
Dec 7, 2020

@hendrikmuhs hendrikmuhs commented Nov 30, 2020

Transform currently writes dates as epoch millis. For historic data this fails in some cases or is unsupported, so dates should be written as dates. With this PR, transform writes dates in ISO format. Because existing transforms might rely on the old format, the PR keeps backwards compatibility for old jobs and adds a setting to write dates as epoch millis.

fixes #63787

Example

The following example illustrates the change. It uses this pivot config (dataset: kibana_sample_data_logs):

  "pivot": {
    "group_by": {
      "@timestamp": {
        "date_histogram": {
          "field": "@timestamp",
          "calendar_interval": "1m"
        }
      }
    },
    "aggregations": {
      "latest_timestamp": {
        "max": {
          "field": "@timestamp"
        }
      },
      "agent_dc": {
        "cardinality": {
          "field": "agent.keyword"
        }
      }
    }
  }
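
As a rough illustration of what the date_histogram group_by in this config does to a timestamp, the sketch below floors an epoch-millis value to the start of its minute. It assumes UTC and treats a calendar minute as a fixed 60,000 ms, which is a simplification; the function name is hypothetical and this is not the transform's actual implementation.

```python
MINUTE_MS = 60_000  # assumption: one calendar minute == 60,000 ms (UTC, no leap seconds)


def floor_to_minute(epoch_millis: int) -> int:
    """Return the start of the one-minute bucket containing epoch_millis.

    Floor division rounds toward negative infinity, so values before
    1970 (negative epoch millis) land in the correct bucket as well.
    """
    return (epoch_millis // MINUTE_MS) * MINUTE_MS


# 1606005542912 is 2020-11-22T00:39:02.912Z; its bucket starts at 00:39:00.000Z.
bucket_start = floor_to_minute(1606005542912)
print(bucket_start)
```

The bucket start is the value that ends up as the @timestamp group_by key, while latest_timestamp keeps the full millisecond precision of the max aggregation.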

The change affects how @timestamp is written in _source. Note that dates in the aggregations part are already written as ISO strings; with this change, dates in pivot.group_by and pivot.aggregations behave consistently:

Before

        "_index" : "t3",
        "_id" : "ANAnyse5SVpHo-UXL4f8598AAAAAAAAA",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : 1606005540000,
          "latest_timestamp" : "2020-11-22T00:39:02.912Z",
          "agent_dc" : 1
        }

After

        "_index" : "t4",
        "_id" : "ANAnyse5SVpHo-UXL4f8598AAAAAAAAA",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2020-11-22T00:39:00.000Z",
          "latest_timestamp" : "2020-11-22T00:39:02.912Z",
          "agent_dc" : 1
        }
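
For an application that consumed the old _source format, the relationship between the two representations can be sketched as below. This is a minimal illustration, assuming UTC and millisecond precision; the helper name is hypothetical and not part of the transform code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_iso(epoch_millis: int) -> str:
    """Format epoch millis as an ISO string with millisecond precision in UTC."""
    # Adding a timedelta (instead of using fromtimestamp) keeps the
    # arithmetic exact and also works for negative values.
    dt = EPOCH + timedelta(milliseconds=epoch_millis)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


# The "Before" value from the example maps to the "After" value.
print(epoch_millis_to_iso(1606005540000))  # 2020-11-22T00:39:00.000Z
```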

Doc values do not change, because either way the value is parsed as a date. The difference is that parsing can no longer fail for dates before 1970.
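
To make the pre-1970 case concrete: a date before the epoch has a negative epoch-millis value, whereas its ISO form is an ordinary date string. The sketch below only demonstrates that arithmetic; it makes no claim about which part of the old write path rejected such values, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch millis (negative before 1970)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


# One second before the epoch.
historic = datetime(1969, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
print(to_epoch_millis(historic))  # -1000
```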

If you rely on the old format (for example, because an application consumes the produced _source), you can get back the old style with:

  "settings": {
    "dates_as_epoch_millis": true
  }

When an old transform is updated using _update, this setting is automatically set to true. To migrate an old transform to the new style, use _update to explicitly set dates_as_epoch_millis to false or null (equivalent to the default).
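
A migration request could be assembled roughly as follows. This is a sketch under assumptions: the transform id "my-old-transform" and the helper name are hypothetical, the path follows the transform update API (POST _transform/<transform_id>/_update), and the code only builds the request without sending it.

```python
import json
from typing import Optional, Tuple


def build_update_request(transform_id: str, dates_as_epoch_millis: Optional[bool]) -> Tuple[str, str]:
    """Build the path and JSON body for a transform _update call.

    Passing None serializes to JSON null, which resets the setting to its default.
    """
    path = f"_transform/{transform_id}/_update"
    body = json.dumps({"settings": {"dates_as_epoch_millis": dates_as_epoch_millis}})
    return path, body


# Hypothetical transform id; switch an old transform to ISO date output.
path, body = build_update_request("my-old-transform", False)
print("POST", path)
print(body)
```

Sending the request is left out on purpose, since the client and authentication setup depend on the deployment.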

Docs preview for the added parameter: https://elasticsearch_65584.docs-preview.app.elstc.co/diff

Post PR actions:

  • document change in "Breaking changes" for 7.11
  • BWC

@hendrikmuhs hendrikmuhs marked this pull request as ready for review December 1, 2020 11:54
@hendrikmuhs

@elasticmachine update branch

@hendrikmuhs

@szabosteve Can you have a look at the doc changes: elasticsearch_65584.docs-preview.app.elstc.co/diff


@szabosteve szabosteve left a comment

Docs changes LGTM, thanks.
It's just a nit, so please take it or leave it: we try to order the parameters alphabetically in the API docs and in the common params file (I know, we are not always consistent...). If you don't have time for changing it, no problem, common-parms needs to be reviewed anyway soon.

@hendrikmuhs

It's just a nit, so please take it or leave it: we try to order the parameters alphabetically

I fixed it. I blame the last minute rename of the parameter, the 1st version was write_da....


@przemekwitek przemekwitek left a comment

LGTM

with just two small remarks

@hendrikmuhs hendrikmuhs merged commit 9b47889 into elastic:master Dec 7, 2020
@hendrikmuhs hendrikmuhs deleted the transform-date-not-epoch branch December 7, 2020 14:34
hendrikmuhs pushed a commit that referenced this pull request Dec 7, 2020
) (#65952)

fixes #63787
backport #65584
@elasticmachine

Pinging @elastic/ml-core (:ml/Transform)

Successfully merging this pull request may close these issues.

[Transform] Continous transform failure when grouping dates using terms agg
7 participants