Implement UUID version 7 #14

Closed
wants to merge 2 commits into from


@rdn32 rdn32 commented Aug 30, 2024

Implement a function to construct v7 UUIDs as defined by RFC 9562. These incorporate timestamps so that they sort by order of creation - although in the implementation, a timestamp value has to be supplied.
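For readers unfamiliar with the format, here is a minimal sketch of how RFC 9562 lays out a version 7 UUID: a 48-bit millisecond timestamp, the 4-bit version, 12 bits of `rand_a`, the 2-bit variant, then 62 bits of `rand_b`. It uses only the standard library; `v7_bytes` and `to_hex` are hypothetical names chosen for illustration and are not the functions added by this patch.

```ocaml
(* Illustrative only: [v7_bytes] and [to_hex] are hypothetical helpers,
   not part of Uuidm. *)
let v7_bytes ~(t_ms : int64) ~(rand_a : int) ~(rand_b : int64) : string =
  let b = Bytes.create 16 in
  (* Bytes 0-5: 48-bit big-endian Unix timestamp in milliseconds. *)
  let hi16 = Int64.to_int (Int64.shift_right_logical t_ms 32) land 0xFFFF in
  Bytes.set_uint16_be b 0 hi16;
  Bytes.set_int32_be b 2 (Int64.to_int32 t_ms);
  (* Bytes 6-7: 4-bit version (0b0111) followed by the 12 bits of rand_a. *)
  Bytes.set_uint16_be b 6 (0x7000 lor (rand_a land 0x0FFF));
  (* Bytes 8-15: 2-bit variant (0b10) followed by the low 62 bits of rand_b.
     Int64.min_int has only the top bit set, which yields the 0b10 prefix
     once the top two bits of rand_b have been masked off. *)
  let low62 = Int64.logand rand_b 0x3FFF_FFFF_FFFF_FFFFL in
  Bytes.set_int64_be b 8 (Int64.logor Int64.min_int low62);
  Bytes.to_string b

(* Lowercase hexadecimal rendering without dashes, for inspection. *)
let to_hex (s : string) : string =
  String.concat ""
    (List.map (fun c -> Printf.sprintf "%02x" (Char.code c))
       (List.of_seq (String.to_seq s)))
```

Because the timestamp occupies the most significant bytes, comparing two such strings byte by byte orders them by creation time first.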

I've not modifed the version type - a V7 variant could be added, but I'm not sure how much sense that would make.

@dbuenzli
Owner

I merged your patch as 413249e but I don't think we are there yet API wise. Let's use this PR for discussing this. In 3c6abae I renamed your version to v7_ns and added a base v7 in which you control all the fields as in the spec; this allows people to easily implement the sequencing schemes if they want to (we could also provide generators for these).

Now I think we still want at least:

  1. Generators for purely random based ones along the lines of Uuidm.v4_gen:
       val v7_gen : now_ms:(unit -> int64) -> Random.State.t -> (unit -> t)
       val v7_ns_gen : now_ns:(unit -> int64) -> Random.State.t -> (unit -> t)
  2. I would like to tweak your version of v7_ns to have the rand_b argument as an int64.
     Since 4.14 we have Random.bits64 () which makes using them much more convenient.
  3. Something that interoperates easily with Unix.gettimeofday. E.g.
       uint64_ms_of_float_s : float -> int64
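
As a rough sketch of the last item, the conversion could be as small as the function below. The body is an assumption about what such a helper would do, not code from the library: it expects a finite, non-negative POSIX time in seconds and truncates any sub-millisecond remainder.

```ocaml
(* Hypothetical body for the proposed signature; assumes [s] is a finite,
   non-negative POSIX time in seconds, e.g. from Unix.gettimeofday (). *)
let uint64_ms_of_float_s (s : float) : int64 =
  (* Int64.of_float truncates toward zero, dropping sub-millisecond digits. *)
  Int64.of_float (s *. 1000.)

(* A clock for a generator could then be written, with the unix library
   linked, as:
     let now_ms () = uint64_ms_of_float_s (Unix.gettimeofday ()) *)
```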

Any thoughts ? Also is it a problem for you if Uuidm starts requiring 4.14 ?

@lindig lindig left a comment

I'm not a UUID specialist; I just think this looks solid. Thanks for adding v7 to uuidm. Does this need more test cases, or is the construction so obvious (how bits are placed) that problems would show up?

@psafont

psafont commented Sep 16, 2024

Any thoughts ?

I like your decision to provide a call with all the parameters defined in the RFC, which gives clients the flexibility to build their own libraries with their own monotonicity schemes, alongside a more ergonomic / interoperable one that uses nanoseconds.

About generators, an important property of v7 is to provide ordered timestamps. The RFC talks about the methods in detail in section 6.2. The generators can ignore these if we think users won't generally need the precision, but then the generators must be made ad-hoc.

  • For v7_gen, if the code makes the assumption that the timestamps are given in a monotonic way, the Monotonic Random method could be encoded. By storing the last timestamp and the seed counter, the generator could increment the least-significant position when the timestamp is repeated.
    This should be a safe assumption to make, as it's a recommendation from the RFC and other implementations are likely to behave in the same way:

Monotonic Error Checking:
Implementations SHOULD check if the currently generated UUID is greater than the previously generated UUID. If this is not the case, then any number of things could have occurred, such as clock rollbacks, leap second handling, and counter rollovers. Applications SHOULD embed sufficient logic to catch these scenarios and correct the problem to ensure that the next UUID generated is greater than the previous, or they should at least report an appropriate error. To handle this scenario, the general guidance is that the application MAY reuse the previous timestamp and increment the previous counter method.

I'm unsure about mixing randomness and monotonicity, do you have any opinions on implementing this scheme? Maybe we want another behaviour.

  • For v7_ns_gen, the monotonicity must be retained even with the increased precision, so care is needed in this case as well.
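
To make the quoted "Monotonic Error Checking" guidance concrete, here is a minimal sketch of a wrapper that reports when a generated UUID is not strictly greater than the previous one. It assumes UUIDs are handled as 16-byte binary strings, for which byte-wise comparison matches numeric order; `checked` is a hypothetical name, and returning `Error` is only one of the reactions the RFC allows.

```ocaml
(* Hypothetical wrapper, not part of Uuidm: [gen] is any generator of
   16-byte binary UUID strings. *)
let checked (gen : unit -> string) : unit -> (string, string) result =
  let prev = ref None in
  fun () ->
    let u = gen () in
    match !prev with
    (* Not strictly greater than the last accepted UUID: report it and
       keep the previous value as the reference point. *)
    | Some p when String.compare u p <= 0 -> Error u
    | _ -> prev := Some u; Ok u
```

A caller could retry on `Error`, or fall back to reusing the previous timestamp and incrementing a counter, as the RFC suggests.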

Regarding 2 and 3: I don't have any objections; I'm all for making the API as easy to use as possible, so these both sound good. Adding an example of each would help users as well.

Also is it a problem for you if Uuidm starts requiring 4.14 ?

Not a problem at all, thanks for asking.

dbuenzli added a commit that referenced this pull request Sep 25, 2024
dbuenzli added a commit that referenced this pull request Sep 25, 2024
@dbuenzli
Owner

So I had a read of the spec. There are quite a few things you can do to generate V7 UUIDs and the right one depends quite a bit on your application needs or the strictness of your monotonicity constraints. Since the v7 and v7_ns constructors easily enable you to devise your own scheme, I decided to only provide two simple V7 generators that look rather obvious to me:

  1. v7_non_monotonic_gen. This one simply uses the ms timestamp and random for everything else. It's not monotonic if you generate multiple UUIDs in the same millisecond but you still get database index locality via the timestamp.
  2. v7_monotonic_gen. This one uses the ms timestamp and rand_a as a counter if the timestamp does not move between two generations (method 1 in the spec). This allows generating 4096 monotonic UUIDs per millisecond; if the counter rolls over, it returns None until the millisecond moves.
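
The counter scheme in the second item can be sketched roughly as follows. This is not the code from the commit: `monotonic_gen` is a hypothetical name, it returns the `(t_ms, rand_a, rand_b)` fields rather than a UUID, and starting the counter at 0 and treating a clock that moves backwards like a repeated timestamp are assumptions made for the example.

```ocaml
(* Hypothetical sketch of a 12-bit counter scheme; not Uuidm's code.
   Returns the fields a v7 constructor would take, or None on rollover. *)
let monotonic_gen ~(now_ms : unit -> int64) (rng : Random.State.t)
  : unit -> (int64 * int * int64) option =
  let last_ms = ref Int64.minus_one and counter = ref 0 in
  fun () ->
    let t = now_ms () in
    if Int64.compare t !last_ms > 0 then begin
      (* The clock advanced: record it and restart the counter. *)
      last_ms := t;
      counter := 0;
      Some (t, !counter, Random.State.bits64 rng)
    end else if !counter < 0xFFF then begin
      (* Same (or earlier) millisecond: reuse the stored timestamp and
         bump the counter carried in rand_a. *)
      incr counter;
      Some (!last_ms, !counter, Random.State.bits64 rng)
    end else
      (* All 4096 counter values are used for this millisecond. *)
      None
```

On `None` the caller would typically wait until the clock reaches the next millisecond and try again.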

For now I think this is good enough, more generators can be added in the future if other good schemes emerge.

I added a quick start with various examples, notably how to use v7_monotonic_gen with Unix.gettimeofday and Unix.sleepf to get monotonic (up to gettimeofday(2) not going backwards…) time-based generation.

Thank you all for your input!

@jberdine

I added a quick start with various examples, notably how to use v7_monotonic_gen with Unix.gettimeofday and Unix.sleepf to get monotonic (up to gettimeofday(2) not going backwards…) time-based generation.

At the risk of making the quick start less self-contained, I think it would be good to either use Mtime to obtain monotonic timestamps, or at least mention its existence. This is the sort of API where someone might first discover that it matters that gettimeofday is non-monotonic, and it would help to have a pointer to an existing solution.

@dbuenzli
Owner

dbuenzli commented Sep 26, 2024

I think it would be good to either use Mtime to obtain monotonic timestamps, or at least mention its existence.

The standard mandates a POSIX timestamp and Mtime does not provide that. It would be even worse to use Mtime since it can go over past values again after a reboot.

If you really need monotonicity, you'd be better off trying to combine a POSIX clock, a monotonic clock and stable storage, but all that is way beyond Uuidm's scope, and if you have a distributed system your clocks are going to skew anyway.

In general I think it's better to design your system so that it takes into account that monotonicity (and unicity) may be violated in rare cases and that it should not be a catastrophe for your system…

@rdn32 rdn32 deleted the uuid_v7 branch September 27, 2024 08:21
5 participants