Climatological attributes table(s) #99

Open
rod-glover opened this issue May 27, 2021 · 0 comments

In revision 2914c6c8a7f9, we drop the table meta_climo_attrs because it is not in use and, if climatological attributes are needed later, it will likely be replaced by differently defined table(s). In this issue, we record information and discussions that will help when the climatological attributes table(s) are recreated.

The original table is created in the upgrade for revision 522eed334c85. It is also preserved in the downgrade for revision 2914c6c8a7f9.

The corresponding definition in the ORM (now removed along with the table) is:

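from sqlalchemy import Boolean, Column, ForeignKey, Index, Integer, String

# Base is the declarative base defined elsewhere in the ORM module.
class ClimatologyAttributes(Base):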
    __tablename__ = "meta_climo_attrs"
    vars_id = Column(Integer, ForeignKey("meta_vars.vars_id"), primary_key=True)
    station_id = Column(
        Integer, ForeignKey("meta_station.station_id"), primary_key=True
    )
    month = Column(Integer, primary_key=True)
    wmo_code = Column(String(1))
    adjusted = Column(Boolean)


Index(
    "meta_climo_attrs_idx",
    ClimatologyAttributes.vars_id,
    ClimatologyAttributes.station_id,
    ClimatologyAttributes.wmo_code,
    ClimatologyAttributes.month,
)

Discussions with @faronium included the following information:

This is how I'd design the table:

obs_raw_id (INT), adjusted (BOOL), nyears_total (INT), nyears_within_norm (INT).

Through obs_raw_id we can link back to all of the station and variable metadata associated with the climatology, which eliminates vars_id, station_id, month and normal period, since all of that can be recovered from the obs_raw entry. Adjusted needs to stay (ideally the three stations that were used to adjust would be stored in additional columns, but we'll ignore that for now). Two nyears entries are needed because a user should be able to differentiate between climatologies based on data within the climate normal period and those based on data outside that range. Adjusted meets that need, but doesn't tell us how long the records feeding into the climo are. With that information, wmo_code can also be dropped.
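For reference, the single-table design described above might look roughly like this in the ORM. This is only a sketch: the table name, the choice of obs_raw_id as primary key, and the stripped-down ObsRaw stand-in are assumptions made for illustration, not part of the proposal.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class ObsRaw(Base):
    """Minimal stand-in for the existing obs_raw table (assumption: the
    real table has many more columns; only the key is needed here)."""

    __tablename__ = "obs_raw"
    obs_raw_id = Column(Integer, primary_key=True)


class ClimatologyAttributes(Base):
    # Assumption: the old table name is reused and obs_raw_id is the key.
    __tablename__ = "meta_climo_attrs"
    obs_raw_id = Column(
        Integer, ForeignKey("obs_raw.obs_raw_id"), primary_key=True
    )
    adjusted = Column(Boolean)
    nyears_total = Column(Integer)
    nyears_within_norm = Column(Integer)


# Create both tables in an in-memory SQLite database to check the
# definitions are consistent.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```

Station, variable, month and normal period are deliberately absent, since they would be reached through the obs_raw row.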

@jameshiebert also commented:

I can envision a programmatic use case for it. E.g. a filter on the PCDS portal where you can select stations based on their quality. "Give me only the A rated stations". "Give me stations that are >= C rating", etc. There's currently no process available that actually updates (or calculates) those ratings, so it's an unimplemented feature at this point.
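A filter of this kind could be sketched as follows. It assumes ratings are single letters in which "A" is best and later letters are worse, so that ">= C" means A, B or C; the function name and the sample station ids and ratings are made up for illustration.

```python
def stations_rated_at_least(rating_by_station, minimum):
    """Return the ids of stations whose rating is `minimum` or better.

    Assumption: ratings are single uppercase letters and "A" is the best,
    so "better" means alphabetically earlier. Unrated stations (None) are
    excluded.
    """
    return sorted(
        station_id
        for station_id, rating in rating_by_station.items()
        if rating is not None and rating <= minimum
    )


# Hypothetical ratings keyed by station id.
ratings = {101: "A", 102: "C", 103: "D", 104: None, 105: "B"}

only_a = stations_rated_at_least(ratings, "A")
c_or_better = stations_rated_at_least(ratings, "C")
```

In the portal this would presumably be a database query rather than an in-memory filter; the comparison logic is the same.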

@rod-glover also remarks:

The redesigned table above may be better implemented as 2 tables:

  • a table, call it climo_attrs, with columns adjusted, nyears_total, nyears_within_norm and an id column
  • an association table containing columns climo_attrs_id and obs_raw_id, call it obs_raw_climo_attrs

This functions very much like the existing flags tables, and eliminates data duplication if many observations share the same climatological attributes. (If they don't, then this 2-table design has no utility.)
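A rough sketch of the 2-table variant, using the table and column names suggested above. The name of the id column (climo_attrs_id), the composite primary key on the association table, and the minimal ObsRaw stand-in are assumptions for illustration.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class ObsRaw(Base):
    """Minimal stand-in for the existing obs_raw table (assumption)."""

    __tablename__ = "obs_raw"
    obs_raw_id = Column(Integer, primary_key=True)


class ClimoAttrs(Base):
    __tablename__ = "climo_attrs"
    climo_attrs_id = Column(Integer, primary_key=True)  # assumed id name
    adjusted = Column(Boolean)
    nyears_total = Column(Integer)
    nyears_within_norm = Column(Integer)


class ObsRawClimoAttrs(Base):
    """Association table linking observations to shared attributes."""

    __tablename__ = "obs_raw_climo_attrs"
    climo_attrs_id = Column(
        Integer, ForeignKey("climo_attrs.climo_attrs_id"), primary_key=True
    )
    obs_raw_id = Column(
        Integer, ForeignKey("obs_raw.obs_raw_id"), primary_key=True
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Two observations that share one set of climatological attributes.
    session.add_all([ObsRaw(obs_raw_id=1), ObsRaw(obs_raw_id=2)])
    attrs = ClimoAttrs(adjusted=False, nyears_total=30, nyears_within_norm=25)
    session.add(attrs)
    session.flush()  # assigns attrs.climo_attrs_id
    session.add_all(
        ObsRawClimoAttrs(climo_attrs_id=attrs.climo_attrs_id, obs_raw_id=i)
        for i in (1, 2)
    )
    session.commit()
    n_attrs = session.query(ClimoAttrs).count()
    n_links = session.query(ObsRawClimoAttrs).count()
```

Here one climo_attrs row serves two observations, which is the deduplication the design is meant to provide.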
