Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-30439: Refactor DataCoordinate #769

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from
Draft

DM-30439: Refactor DataCoordinate #769

wants to merge 15 commits into from

Commits on Jan 12, 2023

  1. Configuration menu
    Copy the full SHA
    a2496b4 View commit details
    Browse the repository at this point in the history
  2. Fix Protocol inheritance ordering.

    In all of the PEP-544 examples, Protocols appear last, and the original
    order confuses pylance.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    aa3b4c7 View commit details
    Browse the repository at this point in the history
  3. Move DataCoordinate container classes to new modules.

    This starts a new _containers subpackage that will be extended to
    include other butler primitives, like DimensionRecords and DatasetRefs.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    5e08d90 View commit details
    Browse the repository at this point in the history
  4. Rework DataCoordinate custom containers.

    This renames some existing classes, so it is a breaking API change,
    but one whose effects should be minimal to nonexistent outside
    daf_butler.  Overall it aligns the hierarchy better with the built-in
    typing hierarchy, injecting some interfaces that had implementations
    but no ABCs, and it cleans up the handling of mutability by separating
    frozenset from set views.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    f348319 View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    bfcbd04 View commit details
    Browse the repository at this point in the history
  6. Add DimensionRecord cache containers.

    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    751731a View commit details
    Browse the repository at this point in the history
  7. Configuration menu
    Copy the full SHA
    eb8ab29 View commit details
    Browse the repository at this point in the history
  8. Stop trying to cache DimensionRecord lookup failures.

    We've historically treated DimensionRecord lookup failures as a
    non-error (we just don't return records we didn't find), at least
    mostly because we allow implied dimension dependencies to be None
    (e.g. an exposure need not have a physical_filter).  But that's
    actually broken because of DM-21840, and I now think the right
    solution there is to just make those foreign keys NOT NULL after all.
    
    That's a long way of saying that I think we could _probably_ start
    treating DimensionRecord lookup failures as errors, and simplify a lot
    of code in the process (though a lot of test would break).  I'm not
    confident enough to try that now and I don't want to deal with fixing
    the tests, but I am confident enough to say that lookup failures are
    so rare that we *can* simplify by not caching them, and that's what
    this commit does.
    
    This is relevant because I'm about to reimplement the caching entirely
    in terms of the new DimensionRecord containers, and I don't want those
    to have to worry about caching failed lookups.  But I wanted to make
    the (slight) behavioral change a separate commit (this one) from the
    refactoring one.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    fa5c728 View commit details
    Browse the repository at this point in the history
  9. Configuration menu
    Copy the full SHA
    f125bac View commit details
    Browse the repository at this point in the history
  10. Require more consistency in DimensionRecordStorage.fetch.

    This is now being used as a callback in the new cache object, and for
    consistent error-handling there we need these to behave more
    consistently.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    60ffbe7 View commit details
    Browse the repository at this point in the history
  11. Doc fixes for Registry.expandDataIds.

    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    4e9748f View commit details
    Browse the repository at this point in the history
  12. Stop testing behavior that DataCoordinate doesn't advertise.

    This was a test for some cleverness that avoids a copy, but I'm not
    sure all of that cleverness is really worth its weight, and I don't
    think we should be testing something the user is not supposed to care
    about.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    decbdc9 View commit details
    Browse the repository at this point in the history
  13. Move data ID expansion implementation to DataCoordinate.

    This adds a new method to Registry and SqlRegistry without a
    RemoteRegistry implementation, and that's why it isn't decorated with
    abstractmethod yet.  I'll fix that after I make the DimensionRecord
    containers serializable.
    
    This also adds a check to DataCoordinate.standardize that any passed
    keys are actually dimension names; this addresses a long-standing
    problem where users would get a keyword argument to some higher-level
    API wrong, and it would be forwarded as **kwargs down to
    DataCoordinate.standardize and then silently ignored.  And it turns
    out we were doing that even in our own test utility code!
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    20ae036 View commit details
    Browse the repository at this point in the history
  14. Configuration menu
    Copy the full SHA
    66c40c0 View commit details
    Browse the repository at this point in the history
  15. Add logic to recognize non-standard keys in DataCoordinate.

    This reimplements some of the special handling of non-standard keys
    from Butler._findDatasetRef in the hopes of being able to move it all
    down to Registry (and thus work on many more interfaces).  But it's
    just a start at that; I realized while trying to make
    Butler._findDatasetRef use the new code that we really need to make
    queryDatasets work on CALIBRATION collections first.  But I think what
    I've done so far will still be useful eventually, so I'm keeping it.
    TallJimbo authored and timj committed Jan 12, 2023
    Configuration menu
    Copy the full SHA
    4f55339 View commit details
    Browse the repository at this point in the history