Spec-breaking proposal to support N timestamps #1196
Closed
This is an incomplete draft proposal to add support for multiple (N) timestamps to mcap.
Today mcap supports recording exactly two timestamps per message: `log_time` and `publish_time`. Of these, only `log_time` is indexed, meaning it is impossible to seek to or play back records by `publish_time` without doing a full linear scan on the file.

While it is up to the user what to put in these timestamps, the original intention was that `log_time` would represent the time at which a given message was received and written by the logging node, and `publish_time` would represent when the message was originally sent by a publishing node.

However, this design has proven short-sighted, as:
`publish_time` is desirable

This PR is an incomplete draft proposal to see what it would look like to add support for an arbitrary number of named timestamps. The idea is that one would specify an arbitrary-length list of timestamp names (as strings) up front, then each message would supply exactly that list of timestamps.
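To make the idea concrete, here is a minimal in-memory sketch of a channel that declares its timestamp names up front and a message that supplies one value per name. The `Channel`, `Message`, and `named_timestamps` names and their fields are hypothetical illustrations, not the record layout proposed in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    """Hypothetical channel that declares its timestamp names up front."""
    id: int
    topic: str
    timestamp_names: tuple[str, ...]


@dataclass(frozen=True)
class Message:
    """Hypothetical message carrying one timestamp per declared name."""
    channel_id: int
    timestamps: tuple[int, ...]  # nanoseconds, same order as timestamp_names
    data: bytes


def named_timestamps(channel: Channel, message: Message) -> dict[str, int]:
    """Pair each timestamp with its name, rejecting a mismatched list."""
    if message.channel_id != channel.id:
        raise ValueError("message does not belong to this channel")
    if len(message.timestamps) != len(channel.timestamp_names):
        raise ValueError(
            f"expected {len(channel.timestamp_names)} timestamps, "
            f"got {len(message.timestamps)}"
        )
    return dict(zip(channel.timestamp_names, message.timestamps))


# Example: a channel with three named timestamps (names are made up).
imu = Channel(id=1, topic="/imu",
              timestamp_names=("log_time", "publish_time", "sensor_time"))
sample = Message(channel_id=1, timestamps=(300, 200, 100), data=b"")
lookup = named_timestamps(imu, sample)
```

Because the names live on the channel, a message only has to store the raw values in a fixed order.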
I can see two ways to do this: configure a file-global list of timestamps which all records in the file support, or configure a list of supported per-channel timestamps (each channel could in theory have different named timestamps). In this PR I opted to try the per-channel approach, which would make `mcap merge` operations easier (if merging two files with differently named timestamps, we don't want to write a bunch of zero timestamps to every single message, so keeping the timestamps per-channel helps here).

However, the per-channel approach leaves open the question of how to handle Attachment timestamps and Summary record timestamps. I have left that question unanswered.
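The merge concern can be illustrated with a small sketch of what a file-global name list would force on a merged file. The `widen_to_global` helper, the `sensor_time` name, and the sample values are assumptions made up for this example; this is not how `mcap merge` is implemented.

```python
def widen_to_global(names, timestamps, global_names):
    """Re-express one message's timestamps against a file-global name list.

    Names the source file never recorded are filled with 0.
    """
    by_name = dict(zip(names, timestamps))
    return [by_name.get(name, 0) for name in global_names]


# Two hypothetical input files with differently named timestamps.
names_a = ["log_time", "publish_time"]
names_b = ["log_time", "sensor_time"]

# File-global merge: the output list is the union, in first-seen order.
global_names = list(dict.fromkeys(names_a + names_b))

# Every merged message now carries a slot for every name.
msg_from_a = widen_to_global(names_a, [100, 90], global_names)
msg_from_b = widen_to_global(names_b, [200, 180], global_names)
```

Here `msg_from_a` gains a zero for `sensor_time` and `msg_from_b` gains a zero for `publish_time`. With per-channel lists, each channel keeps its original names and no padding is written.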
Another potential approach might be to introduce a file-global list of timestamps and give each timestamp a 4-byte identifier; each message could then provide a list of timestamp IDs and the corresponding timestamps. The downside of this approach is that in the Message record each timestamp would take up 12 instead of 8 bytes.
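A rough sketch of the size cost, assuming a 4-byte (uint32) identifier stored next to an 8-byte (uint64) nanosecond timestamp in little-endian order. The `encode_tagged` and `decode_tagged` helpers and the ID values are hypothetical, and a real record would also need a count or length prefix, which is omitted here.

```python
import struct

# Assumed layouts: "<" means little-endian with no padding.
TAGGED = struct.Struct("<IQ")  # uint32 timestamp ID + uint64 nanoseconds
PLAIN = struct.Struct("<Q")    # uint64 nanoseconds only


def encode_tagged(timestamps: dict[int, int]) -> bytes:
    """Serialize {timestamp_id: nanoseconds} as consecutive 12-byte entries."""
    return b"".join(TAGGED.pack(ts_id, ns) for ts_id, ns in timestamps.items())


def decode_tagged(buf: bytes) -> dict[int, int]:
    """Parse consecutive 12-byte entries back into {timestamp_id: nanoseconds}."""
    return {ts_id: ns for ts_id, ns in TAGGED.iter_unpack(buf)}


# A message that only supplies timestamps 0 and 2 from the file-global list.
encoded = encode_tagged({0: 1_700_000_000_000_000_000, 2: 1_700_000_000_500_000_000})
decoded = decode_tagged(encoded)
```

Each entry costs `TAGGED.size` (12) bytes against `PLAIN.size` (8) bytes, in exchange for letting a message omit timestamps it does not have.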
As currently proposed this would be a backwards-incompatible change, so it could only really be considered as part of an MCAPv2 spec. We could possibly solve only problem (a) in a backwards-compatible way, by introducing new Chunk Index records which index the `publish_time`: old readers would fall back to unindexed reading but would still be able to access all the file data. But I can't see any obvious way to retrofit N-timestamps without breaking existing readers, since timestamps are so core to how MCAP files work.
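As an illustration of the backwards-compatible option, the sketch below shows how a reader might use an extra per-chunk index over `publish_time` to skip chunks outside a requested range. The `PublishTimeChunkIndex` record and its fields are hypothetical; no such record exists in the current spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishTimeChunkIndex:
    """Hypothetical record: the publish_time range covered by one chunk."""
    chunk_start_offset: int
    publish_start_time: int
    publish_end_time: int


def chunks_overlapping(indexes, start, end):
    """Return offsets of chunks whose publish_time range intersects [start, end]."""
    return [
        ix.chunk_start_offset
        for ix in indexes
        if ix.publish_start_time <= end and ix.publish_end_time >= start
    ]


# Three made-up chunks; only the middle one overlaps the query window.
indexes = [
    PublishTimeChunkIndex(chunk_start_offset=0, publish_start_time=0, publish_end_time=99),
    PublishTimeChunkIndex(chunk_start_offset=4096, publish_start_time=100, publish_end_time=199),
    PublishTimeChunkIndex(chunk_start_offset=8192, publish_start_time=200, publish_end_time=299),
]
to_read = chunks_overlapping(indexes, start=120, end=150)
```

A reader that does not know about this record would simply ignore it and scan every chunk.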