Request for Comments: Possible Strategies for Public Dependency Version Management #1796

Closed
abonander opened this issue Apr 12, 2022 · 24 comments


@abonander
Collaborator

abonander commented Apr 12, 2022

The Situation

As of writing, we have a lot of PRs piled up for the version 0.6.0 release, many of which are just breaking upgrades of dependencies: #1426, #1455, #1505, #1529, #1733

Since these are exposed in our public API (although rustls is getting fixed so it's no longer public), we have to cut a backwards-incompatible (0.{x+1}.0) release when we upgrade them to backwards-incompatible versions to maintain our SemVer guarantees.

However, we currently don't cut breaking releases very often (the last one was over a year ago) for reasons which I'll get into.

I'm not a huge fan of this situation for a couple main reasons:

  • It forces people to stay on older, possibly insecure or broken versions of these dependencies until we get around to cutting a new release.
  • When we do cut a new release, it forces people to accept possibly quite inconvenient breaking changes to SQLx itself just to upgrade some dependencies.

Options

I've thought of a handful of possible solutions here, but none of them feel like The Correct One to me, so I'd like to get some feedback and do some workshopping here. I came up with the last one while writing this, and it's definitely very interesting, but I'm mentioning all of them for the sake of discussion.

Cut More Granular Breaking Releases More Often

This has the benefit of not forcing people to wait forever to upgrade dependencies, and makes upgrading more palatable as there would often be fewer breaking changes in SQLx itself, or none at all.

However, because we only provide bugfixes to the latest 0.x release, this means that people who choose to remain on old versions (or just forget to upgrade) won't receive those fixes. Maybe a possible solution here would be to start offering LTS releases, but that's a lot of work to do for free.

The main reasons why we're not already doing this are that I personally feel like breaking-change releases should have some meat to them to justify the effort of upgrading, and I don't really like the optics of frequent breaking-change releases. When I look at a crate's version history and see that it puts out a breaking release every month or so, it doesn't exactly inspire a lot of confidence. It shows that it's maintained, sure, but also that it's probably in a pretty experimental state or maybe has an inexperienced maintainer who doesn't put a lot of thought into their API design. That's not the kind of vibe we want to put out with SQLx.

Use Version Ranges for Public Dependencies

For dependency upgrades where there were no breaking changes to the parts of that dependency's API that SQLx uses directly, we could use a version range for that dependency which allows the root crate in the dependency graph to select any compatible version in that range. I have previously used this strategy in my img_hash crate (which I haven't had the time or energy to maintain, sorry) for the integration with the image crate.
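
For illustration, such a range could look something like this in the manifest (the crate and range here are just examples, not a concrete proposal):

# Cargo.toml (sketch)
[dependencies]
# Any release in this range satisfies SQLx's requirement; the version that actually gets
# compiled is whichever one also satisfies the rest of the dependency graph.
time = { version = ">=0.2, <0.4", optional = true }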

As for testing in CI, I think we'd be okay just testing against the latest version in the range for a dependency, assuming we only do this when we can widen the range without having to change the code, and the previous range was covered by past CI runs.

This is ostensibly the option I like the most; however, we're still back to square one for dependency upgrades with breaking changes to APIs that we need in SQLx (e.g. to provide Encode/Decode impls), so I would ideally like to find a more general solution that covers both cases. Although, since the API surface that SQLx touches is generally rather small for any integration, having to increase the minimum version of the range due to breaking changes that matter to us should hopefully be quite rare.

A big issue though, which is something @mehcode brought up a long time ago when I originally suggested doing it for SQLx, is the potential of the root crate failing to constrain the version range, e.g. someone using sqlx::types::OffsetDateTime and never adding time to their own dependencies to constrain the version appropriate to their usage. A cargo update could naively perform a breaking-change upgrade, as SQLx's version requirements aren't always going to match the end user's, whether they remember to specify them or not.

We would probably need to remove those reexports to eliminate that hazard, but the macros rely on them so that they aren't affected if the root crate renames one of these dependencies. We could hide them, but it's still potentially an issue if the end user only uses those types through, e.g., query!(), where they never have to name them themselves.

We would need to force the user to include the packages in their own dependencies and never rename them; then the macros could just assume that those crates exist in the global namespace.
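
To illustrate that last point, the generated code could name the crate by an absolute path and simply fail to compile if the user hasn't added time themselves (a hypothetical expansion, not what the macros emit today):

// Hypothetical macro output: `::time` resolves only if the user's own Cargo.toml
// depends on the `time` crate under its original name (and the relevant sqlx feature is enabled).
fn read_created_at(row: &sqlx::postgres::PgRow) -> Result<::time::OffsetDateTime, sqlx::Error> {
    use sqlx::Row;
    row.try_get("created_at")
}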

Move Type Integrations to External Crates

This one obviously falls afoul of the orphan rule unless we use newtype wrappers, which have a significant impact on ergonomics.
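
For reference, the newtype workaround would look something like this (names are illustrative; the sqlx trait impls are elided), and the wrapping/unwrapping at every call site is exactly the ergonomic cost:

// An external integration crate can't write `impl sqlx::Type<Postgres> for time::OffsetDateTime`,
// since neither the trait nor the type is local to it (orphan rule), but it can implement
// sqlx's traits for a newtype that it defines itself.
pub struct OffsetDateTime(pub time::OffsetDateTime);

impl From<OffsetDateTime> for time::OffsetDateTime {
    fn from(wrapper: OffsetDateTime) -> Self {
        wrapper.0
    }
}

// Downstream code then has to convert at every boundary:
// let wrapped: OffsetDateTime = row.try_get("created_at")?;
// let value: time::OffsetDateTime = wrapped.into();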

However, we still have the same issue of how to reference the types from the macros. If the sqlx facade depends on them with a version range and reexports them, then we have the version range constraint issue. If it doesn't then we still have to force the user to directly depend on the crates and not rename them. There's also the whole new issue of how to assign version numbers to these crates when they have to be semantically versioned against both sqlx-core and the crate they're providing the integration for.

Overall, this feels inferior to the previous option.

Provide a Separate Cargo Feature for Each Version of a Dependency

This was suggested for time 0.2 and time 0.3 but we tried it and it didn't work very well.

For crates that don't have this issue, we could do this, but it doesn't seem like it would scale well, and we'd need to make a breaking-change release anyway when we decide to drop support for old versions of crates.
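
For context, the layout would be roughly this (names are illustrative, not the actual manifest), with every type impl duplicated behind the matching cfg(feature = ...):

# Cargo.toml (sketch)
[dependencies]
# Each supported version is a separately renamed optional dependency,
# which gives it its own implicit cargo feature.
time_02 = { version = "0.2", package = "time", optional = true }
time_03 = { version = "0.3", package = "time", optional = true }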

Implement Our Own Containers for Various Types In-Tree

I've thought about doing this for types like Postgres' MACADDR8, for which an appropriate crate doesn't really exist right now. We already do this for more niche, database-specific types, mainly from Postgres thanks to its rich type system.

This would fix all the issues with external dependencies, but create a whole slew of new ones.

For one, we rely pretty heavily on the external crates for parsing from text formats since hand-rolling that would be redundant. We would also be taking on the burden of designing useful APIs for all of these types. We could provide conversions to their counterparts in popular crates for that, but then we're back to square one with the semver issue again.

However, while writing this I had a bolt-from-the-blue idea:

⚡ Implement Wrapper Types and Provide Conversions via Version Ranges ⚡

I had to run this past @mehcode because I wasn't sure whether it was absolutely brilliant or completely insane. I think the jury is still out on that.

We would create wrapper types using existing crates for their internal functionality, on separate feature flags that aren't tied to our semver, and then provide conversions to the types publicly through a separate package with as wide a version range as possible, e.g.:

[dependencies]
_time_internal = { version = "0.3", package = "time" }
time = { version = ">=0.2, <0.4", optional = true }

pub struct TimestampTz(_time_internal::OffsetDateTime);

// impl Type, Encode, Decode
// also Serialize, Deserialize behind a feature flag

// plus a conservative set of inherent methods copied from `OffsetDateTime`

// also impl Type, Encode, Decode for `time::OffsetDateTime` if enabled
// as well as `From<TimestampTz> for time::OffsetDateTime`
// these impls would touch the smallest subset of the `time` API that they could, e.g.
// using `TimestampTz` for the heavy lifting and then just converting to the final type via unix_timestamp_nanos + to_offset
// which is supported by both time 0.2 and time 0.3
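
To make those last comments concrete, the conversion might look roughly like this (a sketch written against time 0.3's signatures, shown as TryFrom because its constructors are fallible; time 0.2's equivalents differ slightly):

impl TryFrom<TimestampTz> for time::OffsetDateTime {
    type Error = time::error::ComponentRange;

    fn try_from(ts: TimestampTz) -> Result<Self, Self::Error> {
        // Read the instant and offset out of the pinned internal `time`...
        let nanos = ts.0.unix_timestamp_nanos();
        let offset_seconds = ts.0.offset().whole_seconds();

        // ...then rebuild the value with whichever `time` in the version range the user selected.
        let utc = time::OffsetDateTime::from_unix_timestamp_nanos(nanos)?;
        let offset = time::UtcOffset::from_whole_seconds(offset_seconds)?;
        Ok(utc.to_offset(offset))
    }
}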

We can then upgrade _time_internal with impunity, the macros can directly reference TimestampTz, and we don't have the issue of unconstrained version ranges because the user would only interact with the ranged time dependency if they were already using it in their own crate, where it should be properly constrained.

This would also let us "fix" things like time's default human-readable serialization, which is not ISO-8601 compatible even though that's most often the format returned from web APIs.
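
For example, the wrapper's serde integration could always emit RFC 3339 (an ISO 8601 profile) regardless of what time does by default. A minimal sketch, assuming time 0.3 with its formatting feature and a serde feature flag on our side:

#[cfg(feature = "serde")]
impl serde::Serialize for TimestampTz {
    fn serialize<S: serde::Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        // Format via the pinned internal `time`, so the output doesn't depend on which
        // version of the ranged `time` dependency the user selected.
        let formatted = self
            .0
            .format(&_time_internal::format_description::well_known::Rfc3339)
            .map_err(<S::Error as serde::ser::Error>::custom)?;
        serializer.serialize_str(&formatted)
    }
}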

For new versions of crates where we can't feasibly include them in the version range, we could include them as a new feature.

This would be a pretty sweeping breaking change, however I think once we rip the band-aid off it'll be smooth sailing from there and we can save breaking releases for actual breaking changes to SQLx. This might let us someday soon cut a 1.0!

@mehcode was also concerned about allowing query_as!() invocations to continue using time::OffsetDateTime directly, and suggested having it invoke Into internally. I think that would be best combined with the rework of the typechecking code to use const-eval and const-panic that I've been talking about doing for ages.

@abonander
Collaborator Author

cc @jplatte @paolobarbolini @joshtriplett (re. git2)

@genusistimelord
Contributor

The wrapper type does seem like it might work best, but it will also take a while to implement for everything. However, if it really does fix the issues and is easier to maintain later on, I would say it's the better choice. Is there a test branch with something like this implemented for, let's say, just time currently?

@abonander
Collaborator Author

This is all just hypothetical right now.

@genusistimelord
Contributor

Ahh, maybe it would be best to test this further with time or something else, since those upgrades are generally breaking for some people and would make a great test. A small test would quickly show whether it would work well for other external libraries too. It sounds logical though.

@abonander
Collaborator Author

Technically, the wrapper types could live in their own crate since they wouldn't have to deal with coherence except for the direct impls of Type, Encode, Decode on the unconstrained version type.

@jplatte
Contributor

jplatte commented Apr 13, 2022

Provide a Separate Cargo Feature for Each Version of a Dependency

This was suggested for time 0.2 and time 0.3 but we tried it and it didn't work very well.

I would say this is by far the best option. In the case of the time upgrade, wouldn't the solution be to quite simply not use the macros?

For crates that don't have this issue, we could do this, but it doesn't seem like it would scale well,

Why do you think it wouldn't scale well?

and we'd need to make a breaking-change release anyway when we decide to drop support for old versions of crates.

Keeping the features for old crate versions around shouldn't have a high cost. You can simply drop old ones when doing a breaking-change release for other reasons.

@genusistimelord
Contributor

The other issue I do see, though, is how long older libraries should be supported for.

@jplatte
Contributor

jplatte commented Apr 13, 2022

You can just remove all not-known-to-be-used ones in a major release and re-add support in a point release by reverting the respective commit ^^

@djc
Contributor

djc commented Apr 13, 2022

I don't like type wrappers, it feels like these will just make things more opaque and will make compiler errors harder to interpret. I think cargo feature per version is used with some success in other places (for example, tokio-postgres), and it seems like a reasonable model.

Personally I would be partial to just doing more technically semver-breaking releases, like every 3-6 months. Breaking semver compatibility is not such a big deal, in my opinion, in particular if you don't actually break downstream users' code that much. I would see semantic version updates less as something with conceptual meaning of its own and more as a tool to make it easier to provide bug fixes without breaking things for some downstream users.

If you have a lot of users asking for LTS support that seems like an opportunity for monetizing those users.

There's also the combination of cargo feature per version and more semver-breaking releases, which would perhaps alleviate the burden of maintaining many Cargo features/dependency versions as you can reset the features a little more often.

@genusistimelord
Contributor

I don't like type wrappers, it feels like these will just make things more opaque and will make compiler errors harder to interpret. I think cargo feature per version is used with some success in other places (for example, tokio-postgres), and it seems like a reasonable model.

I do agree with this as it can make it harder to debug things.

Personally I would be partial to just doing more technically semver-breaking releases, like every 3-6 months. Breaking semver compatibility is not such a big deal, in my opinion, in particular if you don't actually break downstream user's code that much. I would see semantic version updates less as something with conceptual meaning of its own and more as a tool to make it easier to provide bug fixes without breaking to some downstream users.

I have been thinking about this, and my thought today was that it's no big deal to have breaking changes as long as we version them properly, so it's easy enough for the user to choose the version that works and then work on updating their stuff and helping (or getting) other libraries to also update to the latest, etc.

If you have a lot of users asking for LTS support that seems like an opportunity for monetizing those users.

This would help further support SQLx development if they really do want backwards compatibility with older libraries. The main issue, though, would be security updates, which might only exist in newer versions of a library, so we would be allowing them to avoid those updates long term.

After thinking for a long while about the upsides and downsides of wrappers:

Upsides:

  • Easier to implement multiple versions
  • Could support older versions.
  • Could be mapped to and from on the back end if we enforced only one version allowed at a time.
  • Macros just work

Downsides:

  • Harder to debug
  • Would require maintaining a ton of features and separate crates.
  • Security issues with older libraries that are fixed in newer ones.
  • Allows users who refuse to change to never update to newer versions, which could further hurt switch-overs later on as the libraries' APIs change.
  • Major breaking changes to implement this, making it a lot more work to switch from 0.5 to 0.6.

Now, for forcing a breaking release every 3 to 6 months, or when needed to implement a breaking API change of a library:

Upsides:

  • Only need to focus on maintaining a single version range.
  • Easier to debug
  • Always up to date with the latest security patches.
  • Less code to manage.
  • Macros just work

Downsides:

  • More general version updates.
  • Must keep track of when a good time to make a switch is, as we don't want to just switch right away unless it was heavily adopted right away.
  • Not backwards compatible, so most people might stick with older versions of SQLx, but we can't stop them from doing that anyway.

Now, if we went with version ranges:

Upsides:

  • Can support multiple versions
  • Still easier to debug than a wrapper

Downsides:

  • Security can be compromised since users can choose a version that could be outdated.
  • A ton of features need to be made for each version range.
  • A lot more code with feature locks, making it harder to maintain
  • Macros did not work well with this?

Let me know if I missed or misunderstood anything here in my understanding.

After reviewing these so far, I think a breaking change makes more sense, and that supporting backwards compatibility can cause users to lock themselves into older versions and never update. @djc @jplatte @abonander

@magicmaaaaan

I think far too much effort is being put into being non-breaking, especially for a library versioned < 1.0. This is precisely the time to be breaking and experimental. Add migration guides or make breaking changes obvious in changelogs. There's no reason to be scared to make changes when Cargo is going to handle version compatibilities. Users are never required to upgrade, as they make the choice to adapt or not.

Anyway, I'm raising both hands to vote for "Cut More Granular Breaking Releases More Often".

@abonander
Collaborator Author

@jplatte

In the case of the time upgrade, wouldn't the solution be to quite simply not use the macros?

It is possible, but it's not great for testing because we have the same issues again without being able to use, e.g., datetime!(). For parsing from the text format, it's not great for performance because you'd have to fall back to the fallible runtime parse function for format descriptions. You could cache the result in a OnceCell, I suppose, but meh. It was easier to say "screw it, just upgrade wholesale to time 0.3".
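
(For the curious, the OnceCell workaround would look something like the following; the format string is just an example and this sketch assumes time 0.3 with its parsing feature plus the once_cell crate.)

use once_cell::sync::Lazy;
use time::format_description::FormatItem;

// Parse the format description once instead of on every call.
static TIMESTAMP_FORMAT: Lazy<Vec<FormatItem<'static>>> = Lazy::new(|| {
    time::format_description::parse("[year]-[month]-[day] [hour]:[minute]:[second]")
        .expect("hard-coded format description is valid")
});

fn parse_timestamp(text: &str) -> Result<time::PrimitiveDateTime, time::error::Parse> {
    time::PrimitiveDateTime::parse(text, TIMESTAMP_FORMAT.as_slice())
}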

Why do you think it wouldn't scale well?

Because while the API surface of a given dependency that we touch is typically small, it's not trivial, and how much code we can share between versions varies from crate to crate. time would also have involved significant code duplication because of how much changed.

Keeping the features for old crate versions around shouldn't have a high cost.

Maintenance and increased build and run time in CI. Now that I'm thinking about it, it's also not clear how this approach would interact with the all-types feature, which a lot of people also naively enable. The macros would probably have to choose the old versions by default because enabling the feature for the new version without thinking about it would be a breaking change.

You can just remove all not-known-to-be-used ones in a major release and re-add support in a point release by reverting the respective commit ^^

We assumed barely anyone used async-std anymore and look how wrong we were.


@djc

I don't like type wrappers, it feels like these will just make things more opaque and will make compiler errors harder to interpret.

Can you elaborate? It seems like it should be easier, especially if the types are named the same as their SQL equivalents. We currently have to maintain large tables explaining which types map to which because it may not be obvious.

@paolobarbolini
Contributor

paolobarbolini commented Apr 14, 2022

I think the best solution is supporting multiple dependency versions, as it's the fastest way of getting a sqlx update out. I have no problem with sqlx releasing more frequent breaking releases, but waiting up to 6 months for a dependency upgrade still feels like too much to me. I frequently update dependencies as newer versions come out; having to do it all at once every 6 months doesn't work well.

Here's how we can achieve this:

  • Remove the all-types feature. Personally I find these "enable everything" features kind of useless as a downstream user of a crate, and I don't generally agree with the approach of "just enable everything if you don't know what you're doing". Where they can be useful is during development of the crate itself, and for that I guess we have to figure something out.
    Removing this feature solves the question of what to do with all-types when support for a newer version of a types crate is added.
  • When we encounter issues like the time 0.2 to time 0.3 upgrade, we implement encoding and decoding with the newer version of the crate and then change the implementation for the older version to "just" convert types back and forth through the newer version. In a way we'll treat the older version as less important because it has to go through an additional internal conversion, but it's the only solution other than changing the time macros themselves to take the name of the renamed time dependency (see "How to deal with multiple versions of `time` when using macros", time-rs/time#356). LLVM might even optimize the conversions away, so performance shouldn't be an issue.
  • Rename all the features that enable the various types crates so that they contain the major and minor version of the crate. Without this we can't do the previous step, because the latest version of the time crate has to be imported as time, otherwise we can't use its macros.
    We will then make it a compile error to enable the time feature, so that downstream users are forced to enable it as time03. This way, when time 0.4 comes out, the crate imported as just time will be updated to 0.4, and the time03 feature will import time 0.3 (named, for example, time03_crate) alongside the time04 feature. Because encoding/decoding will be done by time 0.4, we shouldn't need the macros from time 0.3, so renaming it should be fine by that point (see the sketch below).
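
A rough sketch of what I mean for that last step (names and versions are just for illustration):

# Cargo.toml (sketch)
[dependencies]
# The newest supported version has to keep the crate name `time` so its macros keep working.
time = { version = "0.3", optional = true }

[features]
# Users enable the versioned feature; enabling bare `time` directly is rejected below.
time03 = ["time"]

// lib.rs (sketch): reject the un-versioned feature so that adding a `time04` feature later
// doesn't silently change what `time` means for existing users.
#[cfg(all(feature = "time", not(feature = "time03")))]
compile_error!("enable the versioned `time03` feature instead of `time`");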

@djc
Contributor

djc commented Apr 14, 2022

Maintenance and increased build and run time in CI. Now that I'm thinking about it, it's also not clear how this approach would interact with the all-types feature, which a lot of people also naively enable. The macros would probably have to choose the old versions by default because enabling the feature for the new version without thinking about it would be a breaking change.

IMO getting rid of all-types as @paolobarbolini suggests seems very reasonable. It just doesn't fit with how Rust works very well. For use in CI you could just rename it to __all-types and declare it unstable.

We assumed barely anyone used async-std anymore and look how wrong we were.

That's about a whole crate, though, not about an older version. Dropping older versions absolutely seems reasonable.

I would also take issue with the formulation of "look how wrong we were". Despite being fairly distributed, there were three people in that thread who explicitly said they were relying on async-std support (and one of them is the CEO of the company that started async-std). The proposal got 194 thumbs up and 43 thumbs down, or an 82% positive response rate, and arguably folks using async-std would be much more likely to respond at all. In other words, it's a vocal minority, but it's a clear minority nonetheless.

I don't like type wrappers, it feels like these will just make things more opaque and will make compiler errors harder to interpret.

Can you elaborate? It seems like it should be easier, especially if the types are named the same as their SQL equivalents. We currently have to maintain large tables explaining which types map to which because it may not be obvious.

Well, for well-known types like maybe chrono::DateTime<Utc> this would mean doing another conversion in my project's code to convert from your wrapper types to the more-native types I use in my own projects, making everything more complex.

For what it's worth, I run dependabot on all of my projects and for Instant Domains (work) projects we typically upgrade all dependencies every week. IMO adopting ecosystem upgrades faster is just good engineering, as quality usually increases over time and postponing the upgrades makes little sense.

I do think it might make sense to classify dependencies and treat them differently. I think feature-flag pluggability definitely makes sense for type provider dependencies, but not as much for rustls. For rustls on the other hand, I think an integration trait could make sense, where you provide a trait that exchanges some TLS config for an AsyncRead + AsyncWrite for whatever runtime you need, then there's a sqlx-rustls crate (for the near future I'd be happy to maintain that) that can be versioned separately. I don't know what the git2 dependency is used for in sqlx.
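
Roughly the shape I have in mind, with made-up names and the futures-io traits standing in for whatever runtime abstraction sqlx actually uses internally:

use std::future::Future;
use std::io;
use std::pin::Pin;

use futures_io::{AsyncRead, AsyncWrite};

/// A type-erased, runtime-agnostic encrypted stream.
pub trait TlsStream: AsyncRead + AsyncWrite + Unpin + Send {}
impl<T: AsyncRead + AsyncWrite + Unpin + Send> TlsStream for T {}

/// Implemented by an external crate (e.g. a hypothetical `sqlx-rustls`) to wrap an
/// already-connected plain stream in TLS for the given host name.
pub trait TlsConnector<S>: Send + Sync
where
    S: AsyncRead + AsyncWrite + Unpin + Send + 'static,
{
    fn upgrade(
        &self,
        host: String,
        stream: S,
    ) -> Pin<Box<dyn Future<Output = io::Result<Box<dyn TlsStream>>> + Send + 'static>>;
}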

@genusistimelord
Contributor

A friend of mine after I asked him about this basically said: Don't be Debian. So I would go with breaking changes.

@Wopple
Contributor

Wopple commented Apr 14, 2022

Speaking for me personally...

However, because we only provide bugfixes to the latest 0.x release, this means that people who choose to remain on old versions (or just forget to upgrade) won't receive those fixes. Maybe a possible solution here would be to start offering LTS releases, but that's a lot of work to do for free.

I wouldn't mind more frequent upgrades even without LTS releases. Security is more important than API stability. Making secure code possible sooner is more important than making more secure code a non-breaking change.

The main reasons why we're not already doing this are that I personally feel like breaking-change releases should have some meat to them to justify the effort of upgrading, and I don't really like the optics of frequent breaking-change releases.

I understand this, but it's not the highest priority for me. The effort of upgrading is just the cost of doing business. However, I do agree it's a fair consideration for projects which aren't funded (like this one!).

When I look at a crate's version history and see that it puts out a breaking release every month or so, it doesn't exactly inspire a lot of confidence. It shows that it's maintained, sure, but also that it's probably in a pretty experimental state or maybe has an inexperienced maintainer who doesn't put a lot of thought into their API design. That's not the kind of vibe we want to put out with SQLx.

I understand this too, but again for me personally I don't mind much. Security is a never ending battle. The only way to minimize the amount of maintenance spent due to breaking security fixes from libraries is... to minimize the number of libraries you depend on.

I think what it comes down to is weighting of priorities. And if a decision speaks to a higher priority at the expense of a lower priority, then that can be okay. Every decision is a trade-off.

If I were to rank my priorities when choosing sqlx it would be something like this:

  1. core functionality (it has to work...)
  2. security
  3. sensible API
  4. "batteries included" functionality
  5. stable API

While these are all priorities, not all priorities are made equal. 😉 Now, that doesn't mean the first option is the best or even my preferred choice, but the rationales in that section were what I felt I had the most to say about.

@benwis

benwis commented Apr 14, 2022

Everybody's pretty much laid out the arguments, I would vote for more frequent breaking changes. Updating code is a cost of doing business, and users control when they update anyway. As long as it's properly documented, it shouldn't be a problem for us. We already have to deal with security updates pretty regularly.

Ease of debugging for us and ease of development for y'all are far more important.

@djc
Contributor

djc commented Apr 14, 2022

I just want to stress again: while semver-incompatible changes are kind of binary (a change is either semver-compatible or semver-incompatible), the pain of upgrading has a wide range of impact. Many semver-incompatible updates don't actually require many/big code changes, and it's fine to make releases of this kind more often. There is a risk of requiring downstream users to do so much work on each upgrade that they will switch away, but as I understand it the core sqlx API would be pretty stable, so that doesn't seem like a big risk here.

@apiraino

Updating code is a cost of doing business

I came here to write exactly the same thing, thanks @benwis for spelling them out for me. I like sqlx, and in my budget there are always time slots allocated for upgrades, especially when I work with <1.x libraries.

Slightly tangential:

LTS support (...) seems like an opportunity for monetizing those users

I am not sure I would use those words; the sqlx team is already doing amazing work and providing a tool for free that we all use in production, so as far as I am concerned I should already be paying these guys for their work. I'd rather frame the matter this way: where is the donate button? :-) Let them keep on working. I fully trust them on the technical choices.

@abonander
Collaborator Author

abonander commented Apr 15, 2022

I'd rather frame the matter this way: where is the donate button? :-) Let them keep on working. I fully trust them on the technical choices.

Since Launchbadge is a for-profit company, accepting donations doesn't really work. We've been talking at length about our plans to provide some sort of paid service related to SQLx to support its development, but we've been busy with client projects (not that I'm complaining) and haven't had the time to lay the groundwork for that yet.

I don't know what the git2 dependency is used for in sqlx.

It's a type integration I'm trying to phase out, cause honestly it doesn't make much sense to me either: #1733 (comment)

It was easier to just update it for 0.6.0 but I don't plan on keeping it past that.

@tyt2y3

tyt2y3 commented Apr 18, 2022

Hi all, my two cents being the maintainer of SeaORM.

We have a different release philosophy: we release a major version roughly every one and a half months. To me, it's an unavoidable fact of writing Rust: the language and ecosystem are rapidly evolving. That should be the case until everyone settles on a 1.0 release.

As a downstream crate, I would not mind bumping SQLx version every time we release a major version, as long as there are instructions on how to upgrade / resolve compilation failures. Users wanting to keep their codebase up to date can follow this relatively low friction path. Because "having to upgrade things once a month" is definitely better than "having to upgrade everything once a year".

To keep things simple, I suggest the latest version ONLY support the latest major version of upstream crates. And we can backport certain important features to an older major release upon user request. We can probably commit to supporting the latest two or three major releases, so users with slower upgrade schedules can still be taken care of.

@epage

epage commented Apr 26, 2022

In case this is relevant to the discussion, it appears diesel has been taking the approach of version ranges like >=0.7.0, <2.0.0. This does not actually play well with cargo, though users can work around it. See rust-lang/cargo#10599

@genusistimelord
Contributor

genusistimelord commented Apr 26, 2022

Yeah, that is a difficult issue. Though it really is not the responsibility of the library owner to always support older versions. In this case diesel just dictates that every newer version is supported, as the API it uses has not changed since 0.7, and they expect that 2.0.0 might have API changes. So in this case the end user should just update to the latest supported version of uuid. diesel will always support the latest version until uuid updates to 2.0.0, at which point they need to re-evaluate the changes to see whether they are still compatible with their current usage of the API. If they are, then they might change the range to

>=0.7.0, <3.0.0 and so forth.

If something like this occurs, it still forces the end user to update to the latest version, as diesel guarantees each new version won't break it. This also pushes end users to update their uuid to match diesel's, which is great from a security standpoint. It also means diesel needs to make far fewer version bumps to keep up its third-party library support.

@abonander
Collaborator Author

Closing this since the community consensus appears to be "just do breaking releases more often".
