-
-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tighten leap second handling when parsing from RFC 3339 #406
Comments
In |
This shouldn't be too difficult. The main thing to keep note of is that the time is parsed in the local time zone, while leap seconds are always xxxx-12-31 23:59:60Z. With regard to rejecting the leap second, that will not happen, even in a breaking change. If it's possible to do something reasonable (in this case using the a value one nanosecond prior) then we will do so. Leap second support may be implemented in the future, mind you. |
The rollback may be up to a full second, as 23:59:60.999999999 is a valid leap second time. Depending on the application, the meaning of "reasonable" here may feel... stretched.
I see. The desired behaviour depends on the application:
|
Sure, it could be up to a full second. Either way, I'm not comfortable rejecting an otherwise valid moment just because it's a leap second. Every other restriction comes from either the specification or functional limits in the code — namely restricting the UTC offset to ±23:59:59 at most, which is not an issue in practice as no location has ever observed such an extreme offset to my knowledge. I've no problem limiting it to be on June 30 or December 31 at 23:59:60, as the spec requires values to be semantically valid. Anything further restriction is unlikely. If leap seconds are "properly" supported in the future, algorithms that work around a leap second wouldn't be. That would include leap smear, which inherently distorts the objectively correct time. Barring international agreement on changing how leap seconds work (or eliminating them altogether), my ultimate goal is to acknowledge reality within the proleptic Gregorian calendar. |
With computers on the internet, that's usually subject to local clock drift and some approximation of globally synchronized time usually achieved over NTP. Regardless of international agreements, popular protocols already exist that require their time representation to use leap smear and avoid over-59 seconds. Time-rs implementing workarounds or a future IERS-compliant support for UTC, with no alternatives, is actually in disagreement with the protobuf specification. I think Google, AWS, and others are trying to establish a de-facto workaround that would make time accounting friendlier to computing. Furthermore, their NTP servers already provide smeared time, so anyone relying on those is in on the workaround whether or not their time stack expects otherwise. I think the above calls for a separate |
Yes, there is clock drift and synchronization concerns, but that is an unavoidable reality that has nothing to do with this crate. I am taking the clock time on the user's machine as ground truth. NTP is not handled here, nor will it be in the future. Leap seconds do exist, at least for the next few years until IERS meets with regard to its future (I wouldn't be surprised if they just delay again). Leap smear is not universally accepted, and even where it is used the form it takes varies. Once tzdb handling is implemented, I'm absolutely not going to make it so that adding one second doesn't actually add one second of clock time — imagine the bugs that could cause. Proper handling of leap seconds is possible and will eventually be implemented. I am currently using a workaround that uses the closest value within the same calendar day (and hour and minute). The fact that certain protocols use leap smearing doesn't change the goal of this crate: to provide as accurate information as possible. Google's protocol using Google's leap smear isn't surprising. If dedicated protobuf handling is implemented in the future (unlikely), tzdb information can be used to leap smear then. Until that point, I'm providing the best information I can within the confines of reality, RFCs, and constraints of the time crate. I can't be in disagreement with a specification (provided by a company, not a standards body nonetheless) that I never claim to support. |
If you truly want to handle it yourself, I can add a method to the |
Fortunately, there is no need to do any breaking changes to support the protobuf well-known types specification, other than adding a variant of the RFC 3339 format parser that would reject leap seconds. Even that is not even essential to the binary data format in protobuf, but less strict parsing of string inputs may result in unwanted ambiguity in mapping. Formatting and the internal data do not currently produce leap second times. |
How about adding an |
I'll try to submit a PR with proposed changes some time later. |
This would implicitly guarantee the current behavior, which is not desirable. I can investigate delaying the leap second normalization, but this would require some internal information that I'm not sure I want to store.
Be aware that you are likely wasting your time. Adding another format for the sole purpose of rejecting a valid format will not be accepted. 23:59:60Z is a valid time and will always be accepted where the specification requires it. Duplicating all well-known formats would be a significant overhead and result in code and binary bloat, in addition to increased maintenance burden. My goal is not to support every method of handling time, it's to support the correct method. Leap smear is not how time currently works despite what Google may want others to think. |
A change of behavior to expose leap seconds without a rollback would break the current API, as the same input string will map to different time component values. So why not add an escape hatch for 0.3, which can be deprecated or removed when leap seconds are properly supported in the revised API? |
How would it break the current API? The result of the parsing is currently inaccurate and will be so until tzdb handling is implemented. Making something more accurate isn't a breaking change. That's the exact reasoning I used to change the parser from using |
Yes, and the API user is left completely unaware about this, with no recourse. This can make the library unsuitable for some applications, with real world examples I tried to point out above. No matter how we may dislike Google, Amazon, et al., the alternative where every computer system in the world has to independently follow the pronouncements of a committee with 6 months advance notice is no longer universally accepted.
The behavior pre-change was faulty considering the intent, so that arguably was a bug fix. |
The more strict parser would also have the benefit of making the map between |
I proposed a way that would let you handle it but received no direct answer. Please address this if you would like the ability to handle the leap second case manually. I am not willing to take on the maintenance burden of duplicating all well-known formats, especially when the specification requires succeeding on a valid value. The only time I will knowingly fail on an otherwise valid value is when the UTC offset is beyond ±24 hours, as no such time zone has ever existed. But that is not relevant here. Leap seconds are valid, so I parse them. If Google, Amazon, etc. want to use leap smear, that's on them because they are the ones not following reality. They're lying about what time it actually is, whereas my end goal is to not do so. If I have to choose between following some big corporation who thinks they can change things unilaterally and an internationally accepted standard, I'm choosing the latter ten times out of ten. It has nothing to do with "dislike"; it's just that I prefer reality over things that are made up in the name of "simplicity". You can continue yelling into the void about how some large corporations think we should do it this way, but the simple reality is that's not how clocks currently operate. To be perfectly clear, I am not and will not implement leap smear until it becomes an internationally recognized standard. Nor will I accept a PR that does so.
My point exactly. Improving something is considered a bug fix because the previous behavior was not fully correct. I feel like we're going in circles here, so take everything I say in this comment at face value. If you continue to push for something I've already rejected, I'll just ignore that point from here forwards. |
Without more detail, I can't see how that would help. It seems more complicated to me than addressing it in a specific well-known format parser. |
That's not what I'm proposing to do. I want to give the programmer a choice to reject leap second times if their application expressly forbids dealing with them. How they receive their time inputs should be completely outside the scope of the |
It would let you handle it manually. The vast majority of users will not need to do this, which is why going through the
Adding a method to |
Well, the way this crate currently approaches this reality when parsing RFC 3339 is "pretend all these leap second time inputs refer to a single nearby moment, because otherwise we'd end up like |
The current approach is the one that will be used until tzdb handling is implemented as it is the closest to reality within the constraints of the current data structures. I have considered the tradeoffs and came to the conclusion that this is the appropriate behavior. I am not interested in discussing leap smear further as I have made clear. I don't even know why you're bringing it up given that you've said it's not what you want implemented. I will lock this thread if you insist on continuing discussion about leap smear. This issue is currently open solely to track more accurate validity checks. |
The current implementation of
format_description::well_known::Rfc3339
allows second 60 to occur in any time, which is then rolled back to the last nanosecond of 59. As leap seconds are only really allowed at 23:59:60 UTC, this behaviour feels way too loose.Fortunately, there is always the full parsed date for the
Rfc3339
format parser to look at. So it could compute the UTC time and check if the hour is 23 and the minute is 59, otherwise it should reject the parsed time. It might make sense to also check if it's the last day of the month, though I don't know if the language in ITU-R TF.460-6 is considered normative by the IERS.The text was updated successfully, but these errors were encountered: