Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix use of time zone in date_histogram rewrite #31407

Merged
merged 2 commits into from
Jun 20, 2018

Conversation

cbuescher
Copy link
Member

Currently, DateHistogramAggregationBuilder#rewriteTimeZone uses the aggregation
date math parser and time zone to check whether all values in a read have the
same timezone to speed up computation. However, the upper and lower bounds to
check are retrieved as longs in epoch time, so they don't need to get parsed
using a time zone or a parser other than "epoch_millis". This PR changes this
behaviour that was causing problems when the field type mapping was specifying
only "epoch_millis" as a format but a different time zone than UTC was used.

Closes #31392

Currently, DateHistogramAggregationBuilder#rewriteTimeZone uses the aggregation
date math parser and time zone to check whether all values in a read have the
same timezone to speed up computation. However, the upper and lower bounds to
check are retrieved as longs in epoch_millis, so they don't need to get parsed
using a time zone or a parser other than "epoch_millis". This changes this
behaviour that was causing problems when the field type mapping was specifying
only "epoch_millis" as a format but a different timezone than UTC was used.

Closes elastic#31392
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this was more sneaky than I expected.

final DocValueFormat format = ft.docValueFormat(null, null);
final Object formattedLow = format.format(low);
final Object formattedHigh = format.format(high);
final Object formattedLow = DocValueFormat.RAW.format(low);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the field is a date, it's a bit confusing to use the RAW format imo, maybe use a new DocValueFormat.Date(epoch_millis, DateTimeZone.UTC) instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was even thinking about just boxing the long to a Long because there really isn't much more to it in this case. Would that work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It relies on the assumption that we store millis internally, but I think that's fine. 👍

@@ -70,6 +72,7 @@
public class DateHistogramAggregationBuilder extends ValuesSourceAggregationBuilder<ValuesSource.Numeric, DateHistogramAggregationBuilder>
implements MultiBucketAggregationBuilder {
public static final String NAME = "date_histogram";
private static DateMathParser DEFAULT_DATE_PARSER = new DateMathParser(DateFieldMapper.DEFAULT_DATE_TIME_FORMATTER);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the epoch_millis parser specifically rather than a parser that understands epoch millis among other formats?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I did that, might have not pushed it though because I think it was a late addition. Will do.

@cbuescher
Copy link
Member Author

@jpountz I pushed an update with the changes you requested

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@cbuescher cbuescher merged commit df10704 into elastic:master Jun 20, 2018
cbuescher pushed a commit that referenced this pull request Jun 20, 2018
Currently, DateHistogramAggregationBuilder#rewriteTimeZone uses the aggregation
date math parser and time zone to check whether all values in a read have the
same timezone to speed up computation. However, the upper and lower bounds to
check are retrieved as longs in epoch_millis, so they don't need to get parsed
using a time zone or a parser other than "epoch_millis". This changes this
behaviour that was causing problems when the field type mapping was specifying
only "epoch_millis" as a format but a different timezone than UTC was used.

Closes #31392
@jpountz
Copy link
Contributor

jpountz commented Jun 20, 2018

@cbuescher Should we use the non-issue label rather than bug given that this bug is not released?

@cbuescher
Copy link
Member Author

@jpountz makes sense in terms of release notes I think

@cbuescher cbuescher added >non-issue and removed >bug labels Jun 20, 2018
dnhatn added a commit that referenced this pull request Jun 20, 2018
* 6.x:
  [DOCS] Omit shard failures assertion for incompatible responses  (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  backport of: add is-write-index flag to aliases (#30942) (#31412)
  backport of: Add rollover-creation-date setting to rolled over index (#31144) (#31413)
  [Docs] Extend Homebrew installation instructions (#28902)
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Revert "Mute DefaultShardsIT#testDefaultShards test"
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  Security: fix joining cluster with production license (#31341)
  [DOCS] Updated version in Info API example
  [DOCS] Moves the info API to docs (#31121)
  Revert "Increasing skip version for failing test on 6.x"
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Docs: Advice for reindexing many indices (#31279)
dnhatn added a commit that referenced this pull request Jun 20, 2018
* master:
  [DOCS] Omit shard failures assertion for incompatible responses  (#31430)
  [DOCS] Move licensing APIs to docs (#31445)
  Add Delete Snapshot High Level REST API
  Remove QueryCachingPolicy#ALWAYS_CACHE (#31451)
  [Docs] Extend Homebrew installation instructions (#28902)
  Choose JVM options ergonomically
  [Docs] Mention ip_range datatypes on ip type page (#31416)
  Multiplexing token filter (#31208)
  Fix use of time zone in date_histogram rewrite (#31407)
  Core: Remove index name resolver from base TransportAction (#31002)
  [DOCS] Fixes code snippet testing for machine learning (#31189)
  [DOCS] Removed  and  params from MLT. Closes #28128 (#31370)
  Security: fix joining cluster with production license (#31341)
  Unify http channels and exception handling (#31379)
  [DOCS] Moves the info API to docs (#31121)
  Preserve response headers on cluster update task (#31421)
  [DOCS] Add code snippet testing for more ML APIs (#31404)
  Do not preallocate bytes for channel buffer (#31400)
  Docs: Advice for reindexing many indices (#31279)
  Mute HttpExporterTests#testHttpExporterShutdown test Tracked by #31433
  Docs: Add note about removing prepareExecute from the java client (#31401)
  Make release notes ignore the `>test-failure` label. (#31309)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants