Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Add timezone key to time dimensions type_params #253

Open
3 tasks done
Jstein77 opened this issue Jan 24, 2024 · 1 comment
Open
3 tasks done

[Feature] Add timezone key to time dimensions type_params #253

Jstein77 opened this issue Jan 24, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@Jstein77
Copy link

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward change to existing dbt-semantic-interfaces functionality, rather than a Big Idea better suited to a discussion

Describe the feature

MetricFlow currently uses timezone-agnostic types for all date/time operations, but forcing people to coerce to UTC and then back-convert on render is cumbersome at best and impossible at worst. We have real-world examples of people running into date boundary issues where they have customer data stored in the customer-local timezone, and they want to compute daily customer-specific metrics with the boundaries set for that customer's local day (rather than UTC or whatever).

For situations like that it's natural to store the date/time information in local time with timezone annotations intact, as this makes common query types more natural and also allows for things like audits against local time values.

Currently , the only work-around is to normalize the time stamp in the warehouse (via the underlying dbt model) ahead of time, or else to use the expr field to do it on the fly when the semantic model is constructed. This is somewhat limiting, as it does not allow for re-use of the same measure against different time zones.

This issue will only become more common when we add hourly granularities, so it's something we'll need to address.

Note - the existing use case as described here requires us to aggregate against local time - if we do so against UTC the daily aggregations will be offset against user local time - so any strategy involving doing everything in UTC proper and then offsetting the result set in its entirety won't work, as different rows will need to be offset independently. Whether we provide support for this kind of approach via MetricFlow or some other mechanism involving dbt date/time plugins that allow MetricFlow to do simpler aggregations is an open question. dbt-labs/metricflow#733

The config changes need to keep the timezone annotation intact are relatively simple:

dimensions:
	- name: date_time
		type: time
		type_params:
			time_granularity: day
			timezone: pst # New

Describe alternatives you've considered

Manually convert timezones in dbt or via the expr parameter.

Who will this benefit?

Users who store time dimensions in local timezone.

Are you interested in contributing this feature?

No response

Anything else?

No response

@Jstein77 Jstein77 added the enhancement New feature or request label Jan 24, 2024
@dbeatty10
Copy link

As noted in dbt-labs/dbt-core#9263 (comment):

I'd advocate for using tz database names (like America/Boise) rather than non-standardized abbreviations (like MST).

The tz database is also known as:

  • tzdata
  • zoneinfo database
  • IANA time zone database
  • Olson database

One way to get lists of the valid time zones is to use the pytz Python library:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants