Simplify RID Model #260

Open
wants to merge 13 commits into
base: main
from
249 changes: 249 additions & 0 deletions accepted/2022/simplify-rid-model.md
@@ -0,0 +1,249 @@
# Simplify RID Model

Date: April 2022

[Runtime IDs (RIDs)](https://docs.microsoft.com/dotnet/core/rid-catalog) are one of the most challenging features of the product. They are necessary, impact many product scenarios, and have very obscure and challenging UX. They also equally affect development and runtime and sometimes prevent .NET from running on operating systems that Microsoft doesn't support. This document proposes a plan to improve some aspects of the way RIDs work, particularly for scenarios that we've found to be insufficient for our needs.

## Context

RIDs are (for the most part) modelled as a graph of [target triplet](https://wiki.osdev.org/Target_Triplet) symbols that describe legal combinations of operating system, chip architecture, and C-runtime, including an extensive fallback scheme. This graph is codified in [`runtime.json`](https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.NETCore.Platforms/src/runtime.json), which describes the RID catalog. It is a massive (database in a) file. That's a lot of data to reason about, hard to edit correctly, and a bear to maintain.
Member

> target triplet

AFAIK, target triplets have more cons than pros. It really depends on the target audience, but there are negative connotations with the concept of a triplet in native tooling communities. They consider it a vague and archaic concept. For example, for the modern ISA, the compiler toolchain documentation was written in terms of ABI (-mabi), architecture (-march) and microarchitecture (-mtune). In the case of .NET tooling, --arch and --os are relatable concepts.

Member Author

--arch and --os are indeed nicer to work with. They came out of our effort to make it easier to support emulated environments (like Rosetta 2 on macOS). They are, however, just a convenience wrapper over the RID concept. More importantly, NuGet packages need a currency for describing assets intended for various platforms. The triplets are the simplest scheme we've found for that. We haven't found a better system.

Member

I completely understand and it is probably too late to do anything about it. Since RID is inspired by triplets, I just wanted to point out that modern toolchains prefer explicitly specifying each aspect of platform individually over having a triplet scheme to interpret magic strings, which parser/consumer would always need to remember. They also started their journey with triplets but over the years, found it as an unnecessary overhead.

Member Author

That all makes sense when you are using a compiler to produce artifacts that you will use. I fail to see how that works for a package manager that supports a broad computing ecosystem. Those specific flags also seem too low-level. By definition, they are not intended (per my read) to be used outside of the RISC-V ecosystem.

Member

@am11 am11 Apr 26, 2022

> I fail to see how that works for a package manager that supports a broad computing ecosystem.

Compare `rid: linux-x64` with `arch: x64, os: linux`. The latter is self-explanatory and involves no parsing for a tool to extract the values.

Member

@jkotas jkotas Apr 26, 2022

> AFAIK, target triplets have more cons than pros. It really depends on target audience, but there are negative connotations with the concept of triplet in native tooling communities.

I agree that the classic machine-vendor-operatingsystem triplets have an archaic feel, due to the vendor field and all sorts of cryptic suffixes that tend to be appended to them.

Most RIDs are operatingsystem-machine, which does not have the same feel. Maybe the doc should not lead with explaining RIDs as target triplets, to avoid giving readers the wrong impression.

Python Platform Tag defined in PEP 0425 is much more similar to RIDs than target triplets. The definition of the Python Platform Tag does not make any associations with target triplets.

Member

Target triplets do indeed have problems and the parsing is annoying. But I think the fundamental problem is the same -- in order to provide binary compatibility between components, we need to have an encoding for whatever's necessary for binary compatibility on that platform.

For Mac and Windows, OS version and architecture is probably sufficient. For Linux, we'll also need libc type (and version, although it could be implicitly tied to the TFM).

Member

Yes, we are in agreement about what we want to do. This is just a discussion about wording - whether it is helpful to describe RID as target triplet or not.

Member

Yes, RID is essentially a close-enough hint and we cannot reliably capture the "desired system configuration" with it accurately. As you pointed out in the other thread @agocke, that was not the goal RID was designed to achieve. We perform run time introspections and we have abstraction layers in place to adapt to the system we are running on, in the (few/countable) places where it makes a difference in dotnet/runtime. This is probably the extent to which 3P packages, with OS-specific native artifacts, rely on RID.

Member Author

I am happy to remove the linkage with target triplet if that is what folks want.


Note: `runtimes.json` is a generated file, from [runtimeGroups.props](https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.NETCore.Platforms/src/runtimeGroups.props).

Examples of triplets today are:

- `win-x86`
- `macos-arm64`
- `linux-arm64`
- `linux-musl-x64`

Note: A larger set of triplets is included near the end of the document.

The first three examples may seem like doubles and not triplets. For those examples, the C-runtime is assumed, implicitly completing the triplet. For Windows and macOS, the OS-provided C-runtime is always used. For Linux, the OS-provided C-runtime is also used, but there isn't a single answer across all distros. Most distributions use [glibc](https://www.gnu.org/software/libc/), while others use [musl](https://musl.libc.org/). For Linux doubles, glibc is the implicit C-runtime, which completes the triplet.
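The implicit completion described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how the .NET host or NuGet parses RIDs. The names `expand_rid`, `RidParts`, and `KNOWN_LIBCS`, and the placeholder value `"os-provided"`, are assumptions made for this sketch.

```python
from typing import NamedTuple

# Hypothetical: the only explicit C-runtime name used in the examples above.
KNOWN_LIBCS = {"musl"}


class RidParts(NamedTuple):
    os: str
    c_runtime: str
    arch: str


def expand_rid(rid: str) -> RidParts:
    """Expand a double or triplet RID into explicit (os, c_runtime, arch) parts."""
    parts = rid.split("-")
    if len(parts) == 3 and parts[1] in KNOWN_LIBCS:
        os_name, c_runtime, arch = parts
    elif len(parts) == 2:
        os_name, arch = parts
        # Linux doubles imply glibc; Windows and macOS use the OS-provided C-runtime.
        c_runtime = "glibc" if os_name == "linux" else "os-provided"
    else:
        raise ValueError(f"not a double or triplet RID: {rid!r}")
    return RidParts(os_name, c_runtime, arch)


for example in ["win-x86", "macos-arm64", "linux-arm64", "linux-musl-x64"]:
    print(example, "->", expand_rid(example))
```

The point of the sketch is that a "double" carries the same three pieces of information as a triplet once the default C-runtime is filled in.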

Native code compilers will generate different code for each triplet, and triplets are incompatible with one another. That's not a .NET concept or design-point, but a reality of modern computing.

The `runtimes.json` file is used for asset description, production, and selection. Code can be conditionally compiled in terms of RIDs, and NuGet packages may contain RID-specific code (often native dependencies). At restore-time, assets may be restored in terms of a specific RID. [SkiaSharp](https://nuget.info/packages/SkiaSharp/) is a good example of a package that contains RID-specific assets, in the `runtimes` folder. Each folder within `runtimes` is named as the appropriate RID. At runtime, the host must select the best-match assets from those same `runtimes` folders if present (no longer within a package, but retaining the same layout).
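As a rough illustration of the folder layout described above, the following Python sketch derives the set of RIDs a package offers from its file paths. The function name `offered_rids` and the example paths are made up for this sketch; they are not the contents of SkiaSharp or any other real package.

```python
from pathlib import PurePosixPath


def offered_rids(package_paths):
    """Return the set of RIDs that have a folder under `runtimes/`."""
    rids = set()
    for path in package_paths:
        parts = PurePosixPath(path).parts
        # Expect at least runtimes/<rid>/<something>.
        if len(parts) >= 3 and parts[0] == "runtimes":
            rids.add(parts[1])
    return rids


# Illustrative paths only; not the actual contents of any package.
example_paths = [
    "lib/net6.0/Example.dll",
    "runtimes/win-x64/native/example.dll",
    "runtimes/linux-x64/native/libexample.so",
    "runtimes/linux-musl-x64/native/libexample.so",
]
print(sorted(offered_rids(example_paths)))
```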

RIDs can be thought of as making statements about native code, specifically:

- I offer code of this RID.
- I request code of this RID.

Often, the RIDs being offered and asked for do not match (in terms of the actual string) but are compatible. The role of `runtimes.json` is to determine whether two RIDs are compatible and, if so, which offered RID is the best match. That's done via a graph walk of the nodes in that file.
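The graph walk can be sketched as follows. This is a minimal Python illustration under stated assumptions: `RID_GRAPH` is a small hand-written graph (not an excerpt of the real file), the breadth-first visiting order is a choice made for this sketch, and `fallback_order` and `best_match` are hypothetical names.

```python
from collections import deque

# Hand-written, hypothetical graph: each RID lists the RIDs it falls back to.
RID_GRAPH = {
    "alpine-x64": ["alpine", "linux-musl-x64"],
    "alpine": ["linux-musl"],
    "linux-musl-x64": ["linux-musl", "linux-x64"],
    "linux-musl": ["linux"],
    "linux-x64": ["linux"],
    "linux": ["unix"],
    "unix": ["any"],
    "any": [],
}


def fallback_order(rid, graph=RID_GRAPH):
    """Walk the graph breadth-first, returning RIDs from most to least specific."""
    order, seen, queue = [], {rid}, deque([rid])
    while queue:
        current = queue.popleft()
        order.append(current)
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def best_match(requested, offered, graph=RID_GRAPH):
    """Return the first RID in the fallback order that is offered, or None."""
    for rid in fallback_order(requested, graph):
        if rid in offered:
            return rid
    return None


print(fallback_order("alpine-x64"))
print(best_match("alpine-x64", {"linux-x64", "win-x64"}))
```

In this sketch, a request for `alpine-x64` is compatible with an offer of `linux-x64` even though the strings differ, because `linux-x64` is reachable from `alpine-x64` in the graph.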

The current system is unnecessarily complex, expensive to maintain, makes it difficult to manage community needs, and is costly and fragile at runtime. More simply, it is useful, but significantly overshoots our needs.

Note: RIDs are almost entirely oriented around calling native code. However, the domain is broader, more generally oriented on native interop and calling [Foreign Function Interfaces (FFI)](https://en.wikipedia.org/wiki/Foreign_function_interface), including matching the environment. For example, we'd still likely need to use RIDs for interacting with another managed platform like Java, possibly with the use of native code or not.

Note: RIDs are a close cousin to Target Framework Monikers (TFMs), but they have a less polished UX and don't describe the same concept (although the two overlap).

Note: This topic area applies to other development platforms, demonstrated by the following documents. It is universally a domain of more challenging UX and partial solutions.

- https://peps.python.org/pep-0600/
- https://peps.python.org/pep-0656/
- https://doc.rust-lang.org/rustc/platform-support.html

## Biggest problems

The following are the biggest problems we see:

- We get requests to add new RIDs ([Asianux Server 8](https://github.com/dotnet/runtime/issues/2129), [CBL-Mariner](https://github.com/dotnet/runtime/issues/65566), [Anolis Linux](https://github.com/dotnet/runtime/pull/66132)), with no end in sight.
- RID [authoring is needed for new versions of existing OSes](https://github.com/dotnet/runtime/issues/59803) and [sometimes doesn't work when that doesn't happen](https://github.com/dotnet/runtime/issues/65152).
- Reasoning about [portable vs non-portable Linux RIDs](https://github.com/dotnet/runtime/pull/62942).
- We have to read and process `runtime.json` at application startup, which has a (unmeasured) performance impact.
- The fact that the RID graph is so large (and continuing to grow) demonstrates that we chose a poor design point.

## General approach

- Ensure `runtimes.json` is correct.
- Freeze `runtimes.json`.
- New RIDs will only be added for [interchange across multiple platforms](https://github.com/dotnet/designs/pull/260#discussion_r843009872), which isn't expected.
- Continue to use `runtimes.json` for all NuGet scenarios.
- Disable using `runtimes.json` for all host scenarios, by default (starting with a TBD .NET version).
- Enable a host compatibility mode that uses `runtimes.json`, via an MSBuild property (which writes to `runtimeconfig.json`), a `runtimeconfig.json` setting, and a host CLI argument.
- Implement a new algorithmic host scheme for RID selection, enabled by default.

The new scheme will be based on the following:

- Each host is built with a hard-coded set of RIDs that it probes for, both triplets and singles.
Member

Does this mean that we either

(1) Hard code the (large set) fallbacks into the host
(2) Evaluate the fallbacks at build time and put them in the (small) set of RID specific folders the host is built for?

Member Author

Neither. This is covered later in the doc. Each host will know about 2-3 RIDs and none other. If that doesn't work, there will be a fall-back mechanism (which will just be a MSBuild + runtimeconfig.json property) to use the RID catalog. Make sense?

- Each host is already built for a specific RID triple, so this change is an evolutionary change.
- There are processor-agnostic managed-code scenarios where a RID single is relevant, like Windows-only code (for example, accessing the registry) that works the same on x64, Arm64, and x86. The same is true for macOS and Linux.
- The RID that `dotnet --info` returns will always match the first in this hard-coded set of RIDs.

Let's assume that an app is running on Alpine 3.16. The host will probe for the following RIDs:

- `linux-musl-x64`
- `unix`

A glibc distro like Debian would be similar:

- `linux-arm64`
- `unix`

macOS would be similar:

- `osx-arm64`
- `unix`

Note: The abstract `unix` RID is used for macOS and Linux, instead of a concrete RID for those OSes.

Note: We may want to add `osx` to describe macOS-specific processor-agnostic managed code. If we find some scenarios that need it, we should add it.

Windows would be similar:

- `win-x86`
- `win`

Note: `win7` and `win10` also exist (as processor-specific and -agnostic). Ideally, we don't have to support those in the host-specific scheme, but we need to do some research on that.

Note: A mix of processor-specific RIDs are used above, purely to demonstrate the range of processor-types supported. Each operating system supports a variety of processor-types, and this is codified in their associated RIDs.

More generally, the host will probe for:

- This environment, RID triplet (OS-CRuntime-arch).
- This environment, RID single (OS-only).
Comment on lines +102 to +105
Member

If we build a host for a specific RID, why do we need to do any probing at runtime? Asked differently, why can't we do all the probing at build time?

Member Author

Because of portable apps.


The host will implement "first one wins" probing logic for selecting a RID-specific asset.

This behavior only applies for portable apps. RID-specific apps will not probe for RID-specific assets.
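The "first one wins" probing can be sketched in Python as follows. The probe lists are copied from the examples above. The dictionary keys (`"alpine-x64"`, `"debian-arm64"`, and so on) are just labels invented for this sketch, and `select_asset_rid` is a hypothetical name, not a host API.

```python
# Hypothetical per-host probe lists, taken from the examples above.
# The keys are labels for this sketch only; they are not RIDs used by the host.
HOST_PROBE_RIDS = {
    "alpine-x64": ["linux-musl-x64", "unix"],
    "debian-arm64": ["linux-arm64", "unix"],
    "macos-arm64": ["osx-arm64", "unix"],
    "windows-x86": ["win-x86", "win"],
}


def select_asset_rid(probe_rids, available_rids):
    """Return the first probed RID that has assets, or None if nothing matches."""
    for rid in probe_rids:
        if rid in available_rids:
            return rid
    return None


# A portable app that ships assets in these `runtimes` folders:
app_assets = {"linux-musl-x64", "linux-x64", "unix", "win"}
print(select_asset_rid(HOST_PROBE_RIDS["alpine-x64"], app_assets))
print(select_asset_rid(HOST_PROBE_RIDS["windows-x86"], app_assets))
```

Note that no graph is consulted here: the host only checks its own short, ordered list, which is what makes the scheme cheap at startup.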

There are other RIDs in `runtimes.json` today. This new scheme will not support those. The `runtimes.json` host compat feature will need to be used to support those, which is likely very uncommon. Even if they are quasi-common, we will likely still make this change and require some combination of the ecosystem to adapt and app developers to opt-in to the `runtimes.json` compat mode.

The host and `runtimes.json` must remain compatible. The RIDs known by the host must be a subset of `runtimes.json` RIDs. We may need to update `runtimes.json` to ensure that the RIDs known by the host are present in that file.


## Minimum CRT/libc version

This scheme doesn't explicitly define a CRT/libc version. Instead, we will define a minimum CRT/libc version, per major .NET version. That minimum CRT/libc version will form a contract and should be documented, per .NET version. Practically, it defines the oldest OS version that you can run an app on, for a given .NET version.
Member

Using C runtime version to define the OS version makes sense on Linux. It is the best approximation of the OS version to use for determining binary compatibility (of binaries written in C/C++).

Using C runtime version for this purpose makes less sense on Windows and macOS. Windows and macOS have well-defined OS version. Both Windows (Windows SDK) and macOS (XCode) native toolchains are oriented on using the OS version for this purpose (WINVER on Windows and -mmacos-version-min on macOS). We should use OS version for determining binary compatibility on Windows and macOS.

Member Author

Do you want me to add that text, or adapt its meaning into the current text?

Member

I think adapting the meaning into the current text would look better. I would lead with minimum OS version as a documented contract, per .NET version. libc version is Linux-specific approximation of OS version across distros for the purposes of this contract.


For .NET 7 (for all supported architectures):

- For Windows, .NET will target the Windows 10 CRT.
- For Linux with glibc, .NET will target CentOS 7 (with glibc version `2.17`).
- For Linux with musl, .NET will target the oldest supported Alpine version.
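To make the "minimum version as a contract" idea concrete, here is a small Python sketch of a glibc check. Only the .NET 7 value (`2.17`, from CentOS 7) comes from the text above. The `MIN_GLIBC` table and the function names are assumptions for this sketch, and nothing here reflects how the .NET host actually validates its environment.

```python
import platform

# Assumption: minimums keyed by .NET major version. Only the .NET 7 glibc
# value (2.17, from CentOS 7) comes from the text above.
MIN_GLIBC = {7: (2, 17)}


def parse_version(text):
    """Turn '2.17' into (2, 17) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def meets_glibc_minimum(found_version, dotnet_major):
    """Return True if the found glibc version satisfies the documented minimum."""
    return parse_version(found_version) >= MIN_GLIBC[dotnet_major]


if __name__ == "__main__":
    # platform.libc_ver() reports ('glibc', '<version>') on glibc-based systems
    # and empty strings where it cannot determine the C library.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version} meets .NET 7 minimum:", meets_glibc_minimum(version, 7))
    else:
        print("glibc not detected; the check does not apply on this system")
```

Comparing versions as integer tuples matters here: as strings, `"2.5"` would sort after `"2.17"`, which would give the wrong answer.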

Note: Source-build projects will likely make different choices. For example, the IBM s390x project supports RHEL 8, not CentOS 7. glibc compatibility relationships are discussed later.

For .NET 8, we'll continue with the model. However, we will no longer be able to use [CentOS 7 (due to EOL)](https://wiki.centos.org/About/Product), but will need to adopt another distro that provides us with an old enough glibc version such that we can offer broad compatibility.

For .NET 9, we'll likely drop support for Windows 10 and instead target the Windows 11 CRT.

As part of investigating this topic, we noticed the [libc compatibility model that TensorFlow uses](https://github.com/tensorflow/tensorflow/blob/f3963e82f21c9d094503568699877655f9dac508/tensorflow/tools/ci_build/devtoolset/build_devtoolset.sh#L48-L57). That model enables the use of a modern OS with an artificially old libc. We also saw that Python folks are doing something similar with their [`manylinux` approach but with docker](https://github.com/pypa/manylinux).

This in turn made us realize that our [`build-tools-prereqs` Docker images](https://github.com/dotnet/dotnet-buildtools-prereqs-docker/blob/main/src/centos/7/Dockerfile) are very similar to the Python approach. We don't need a new contract like `manylinux2014` since we can rely on the contract changing with each major .NET version. We need to augment our `build-tools-prereqs` scheme to also include an old glibc version, once CentOS 7 is EOL.

The biggest difference with our `build-tools-prereqs` images is that they are not intended for others to use, while the `manylinux` images are intended as a general community artifact. There are two primary cases where extending the use of the `build-tools-prereqs` images would be useful: enabling others to produce a .NET build with the same compatibility reach, and enabling NuGet authors to build native dependencies with matching compatibility as the underlying runtime. Addressing that topic is outside the scope of this document. This context is included to help frame how the TensorFlow, Python, and .NET solutions compare, and to inspire future conversation.

Note: It appears that some other folks have been [reacting to CentOS 7 not being a viable compilation target](https://gist.github.com/wagenet/35adca1a032cec2999d47b6c40aa45b1) for much longer.

## Legal RIDs

RIDs are defined by legal combinations of operating system, chip architecture, and C-runtime. We support both triplets (primarily native code) and singles (only processor-agnostic managed code).

A given RID is used to describe:

- This system supports this RID, per the .NET host.
- This code will run on / requires a system that supports this RID, per the author.

The following table describes RIDs supported by the .NET host. It is exhaustive, per the hosts provided by Microsoft.

| RID | Description of scope |
|------------|----------------------|
| unix | All Unix-based OSes (macOS and Linux), versions, and architecture builds. |
| win | All Windows versions and architecture builds. |
| linux-arm32 | All Linux glibc-based distros, for Arm32. |
| linux-arm64 | All Linux glibc-based distros, for Arm64. |
| linux-x64 | All Linux glibc-based distros, for x64. |
| linux-x86 | All Linux glibc-based distros, for x86. |
| linux-musl-arm64 | All Linux musl-based distros, for Arm64. |
| linux-musl-x64 | All Linux musl-based distros, for x64. |
| osx-arm64 | All macOS versions, for Arm64. |
| osx-x64 | All macOS versions, for x64. |
| win-arm64 | All Windows versions, for Arm64. |
| win-x64 | All Windows versions, for x64. |
| win-x86 | All Windows versions, for x86. |

Note: Singles are for processor-agnostic managed code.

Note: All RIDs are subject to .NET support policy. For example, .NET 7 doesn't support Ubuntu 16.04. The `linux-x64` RID doesn't include that specificity.

Note: `osx` is used instead of `macOS` within the RID scheme. This design change may be a good opportunity to introduce `macOS`. It probably makes sense only to do that for Arm64.

The following table describes RIDs supported only by `runtimes.json`. It is not exhaustive.

| RID | Description of scope |
|------------|----------------------|
| any | The root RID, which is compatible with all other RIDs. |
| alpine | All Alpine versions and architecture builds. |
| alpine-x64 | All Alpine versions, for x64. Other processor-types are also supported. |
| alpine3.16-arm64 | Alpine 3.16, for Arm64. Other processor-types are also supported.|
| centos | CentOS has a similar scheme as Alpine. |
| debian | Debian has a similar scheme as Alpine. |
| osx | macOS has a similar scheme as Alpine. |
| rhel | Red Hat Enterprise Linux has a similar scheme as Alpine. |
| tizen | Tizen has a similar scheme as Alpine. |
| ubuntu | Ubuntu has a similar scheme as Alpine. |

Note: Many other Linux distros are represented in `runtimes.json`.

Note: `runtimes.json` will be frozen, which means that RID schemes only supported by this file are abandoned. For example, RIDs that include OS versions will not be updated going forward.

## Source-build

A major design tenet of the RID system is maximizing portability of code. This is particularly true for Linux. That makes sense from the perspective of Microsoft wanting to make one build of .NET available across many Linux distros, separately for both glibc and musl. It makes less sense for distros themselves building .NET from source and publishing binaries to a distro package feed.

We sometimes refer to `linux-x64` (and equally to `linux-arm64` and `linux-x86`) as the "portable Linux RID". As mentioned in the libc section, this RID establishes a wide glibc compatibility range. In addition, the RID also establishes broad compatibility across distros (and their associated versions) for .NET dependencies, like OpenSSL and ICU. The way this particular form of compatibility shows up is observable but is nuanced (and isn't explained here).

We'll use Red Hat as an example of an organization that builds .NET from source, to complete this discussion. They have raised concerns on [source-build not playing nicely with `linux-x64`](https://github.com/dotnet/runtime/pull/62942#issuecomment-1056942235).

For Red Hat, it makes sense to accept and support `linux-x64` NuGet assets, but not to produce or offer them. Instead, Red Hat would want to produce `rhel-x64` runtime pack assets. It's easiest to describe this design-point in terms of concrete scenarios.

**NuGet packages** -- NuGet authors will typically value breadth and efficiency. For example, a package like `SkiaSharp` might target `linux-x64`, but not `rhel-x64`. There is no technical reason for that package to include distro-specific assets. Also, if the author produced a `rhel-x64` asset, they would need to consider producing an `ubuntu-x64` and other similar assets for other distros, and that's not scalable. The expectation is that the `rhel-x64` .NET supports `linux-x64` NuGet assets, enabling NuGet authors to target a minimal set of RIDs.
Copy link


I feel like this document generally, and this section specifically, is making some assumptions that I don't think are justified. There are absolutely scenarios where distro-specific assets need to be shipped in a NuGet package.

As an example of this, take a look at https://www.nuget.org/packages/LibGit2Sharp.NativeBinaries/2.0.306

LibGit2Sharp is a managed wrapper around the native libgit2 library and has to accommodate the native dependencies of libgit2. One of the dependencies is OpenSSL, which varies wildly among all the distros that .NET supports.

In order to have a reasonable chance of having a native binary that just works on all platforms, I have had to spend a lot of time understanding the RID graph and shipping enough distro-specific binaries to cover all the supported Linux distros.

I will point out that libgit2 has recently changed how they are binding to OpenSSL which has let me simplify things down to more of an "ideal" situation (see the newer package), but that was largely out of my control since I'm not a libgit2 maintainer.

If the ability to ship distro-specific assets in a NuGet package and have the proper one selected had not been a feature that I could count on, then I would not have been able to support LibGit2Sharp on Linux at all for the past several years. Maybe I'm misunderstanding something, but it seems like this proposal is removing this feature.

Member

Mind you, libgit2/libgit2sharp#1714 changed the model used by libgit2sharp. AFAIK, the change basically makes libgit2sharp try and load all native libgit2 libraries one by one until one of them can be loaded successfully (ie, links to a version of OpenSSL available on the system). The actual name (and RID) is not too relevant in this scenario.

I feel like your use-case would still work if nuget did not "know" about distro-specific RIDs.

Member Author

This is great feedback. We discussed it.

There are three models:

  • Don't support specific distros as a general concept (the current proposal).
  • Fully support specific distros as a general concept (some variation on status-quo).
  • Adopt a model that is a variation on the Python manylinux wheel plan, including support for OpenSSL.

I like what @omajid is proposing. It's the same approach we use for OpenSSL. We don't use RIDs. Also, the current proposal works for the vast majority of cases and is a massive simplification. I'm hoping we can keep it.

Member

libgit2sharp was suffering from the rid scalability issue: it was unable to deal with unknown rids.

Using NativeLibrary.SetDllImportResolver it now tries the different so files. The code is here: https://github.com/libgit2/libgit2sharp/blob/97bee65fd296f1c7dd2d1d64581c170f45b584e1/LibGit2Sharp/Core/NativeMethods.cs#L99-L121

This way, the library can have its own logic for picking so files. It's not limited to/by rids.

Copy link


Yes, this logic was added, but I consider it a fallback solution, and not something desirable to rely on as the primary logic, for a couple of reasons:

  • It only works if the entire runtimes folder is shipped as part of the application, which is only going to be true if you use one of the cross-platform publishing options
  • Having to attempt to load every binary until you find one that works seems like it would have a negative impact on startup perf.

If libraries no longer can assume that the best native asset is selected for them by the framework/NuGet, then that means they have to manually ensure that all binaries are included in the publish output to be able to select from them at runtime. That can pretty massively increase the size of the output. Using LibGit2Sharp.NativeBinaries 2.0.306 as an example, that means going from having a single ~1MB .so file in your output to having 9 different copies, and that's assuming I'd have some way to know to only copy the Linux binaries and not the entire runtimes folder.

Member

For example add a property to csproj that enables the user to choose the supported OpenSSL versions, and include so files based on that.

The problem does not end with OpenSSL. The same problem exists for many other libraries.

I believe that the only truly scalable option for Linux ecosystem is to have building from source as an option (NuGet/Home#9631). Building from source is capable of producing a binary that works for the specific configuration of your system without the package maintainer pre-building it for you.

Copy link


> The problem does not end with OpenSSL. The same problem exists for many other libraries.

Yes, and this was the main point of the feedback I was trying to raise here. The proposal currently assumes that there is never a reason to care about the specific distro you're running on, but that is demonstrably not true.

This all comes down to there being a disconnect between the .NET ecosystem expectations of sharing pre-compiled binaries, and the Linux ecosystem that expects you to share source and compile it on your own system. Shipping a Linux binary in a package is always going to be an uphill battle, so I would like to see the .NET side of things evolve in a way that makes it easier, not harder.

Member

@tmds tmds Apr 8, 2022

Notice that it does not have support for probing for OpenSSL 3 yet so it is going to break once distros switch to OpenSSL 3 only. dotnet/runtime has probing for OpenSSL3 that adds a whole new level of shimming and complexity.

Let's assume OpenSSL 3 won't be supported by libgit2 shimming (immediately) and you want to solve this issue using distro rids.

Fedora 36 should use the OpenSSL 3 build.

Fedora 37 should be using the OpenSSL 3 build also. That doesn't work automagically today, because the rid graph doesn't describe a supports relationship between Fedora 37 and Fedora 36.

For your package to remain maintainable, this needs to be addressed (and preferably not by bloating the graph but by making the rid logic smarter).

I've only mentioned Fedora here, but OpenSSL 3 will be adopted by Ubuntu and Debian too, so you need to manage this problem for them as well.

Your package now looks like this:

linux-x64 -> supports OpenSSL 1, 2, 3 and picks using `NativeLibrary`
fedora.36 -> OpenSSL 3
ubuntu.22.04 -> OpenSSL 3
debian.12 -> OpenSSL 3
...

Note that you are including the distro specific assets solely for the purpose of rid-based trimming. They bloat your package and require you to track in what version a distro adopts a new OpenSSL version.
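To make the Fedora 37 problem concrete, here is a minimal Python sketch of graph-based asset selection. It is an illustration only, not the NuGet or host algorithm: the graph contents, the `-x64` suffixes, and the names `RID_GRAPH`, `fallback_chain`, and `pick_asset` are all assumptions made for this example.

```python
# Hypothetical, heavily trimmed RID graph: each RID lists the RIDs it falls back to.
# Note there is no edge from fedora.37-x64 to fedora.36-x64.
RID_GRAPH = {
    "fedora.36-x64": ["fedora-x64"],
    "fedora.37-x64": ["fedora-x64"],
    "fedora-x64": ["linux-x64"],
    "linux-x64": ["unix"],
    "unix": [],
}


def fallback_chain(rid, graph):
    """Return rid followed by its fallbacks, nearest first (breadth-first walk)."""
    chain, queue = [], [rid]
    while queue:
        current = queue.pop(0)
        if current in chain:
            continue
        chain.append(current)
        queue.extend(graph.get(current, []))
    return chain


def pick_asset(rid, package_rids, graph):
    """Pick the most specific RID in the package that is compatible with rid."""
    for candidate in fallback_chain(rid, graph):
        if candidate in package_rids:
            return candidate
    return None


# A package that ships a portable asset plus a Fedora 36 specific one.
PACKAGE_RIDS = {"linux-x64", "fedora.36-x64"}

print(pick_asset("fedora.36-x64", PACKAGE_RIDS, RID_GRAPH))  # fedora.36-x64
print(pick_asset("fedora.37-x64", PACKAGE_RIDS, RID_GRAPH))  # linux-x64
```

Under these assumptions, Fedora 37 skips the Fedora 36 asset and lands on `linux-x64`, so the package author has to add a `fedora.37` asset (or the graph has to gain a new edge) for every distro release.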

> The proposal currently assumes that there is never a reason to care about the specific distro you're running on

NativeLibrary can be used, and should be used to make linux-x64 work across a range of distros.

What is lost, is the ability to trim against a rid.

Member

@bording I hope this example helped illustrate how unscalable the rid mechanism is?

And, if it should stay, inheritance between successive distro versions is a must-have, which is now missing.

Member

> And, if it should stay, inheritance between successive distro versions is a must-have, which is now missing.

To elaborate: the rids as we have them are not scalable. The alternative to the simple model that is proposed here is to have semantic rids. That means rids follow a `<distro>.<version>-<arch>` naming scheme. Information is derived from the name instead of being hard-coded in the json file (e.g. Fedora 36 -> Fedora 35). The json file should then only contain non-derivable information (like which distros are glibc/musl based).

This is a much more complex solution.
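A rough Python sketch of what "semantic" derivation could look like is below. It is only an illustration under simplifying assumptions: versions are single integers (so `ubuntu.22.04` is not handled), and `RID_PATTERN`, `DISTRO_LIBC`, `derived_fallbacks`, and the `oldest_version` parameter are hypothetical names, not part of any existing tool.

```python
import re

# Assumed naming scheme: <distro>.<version>-<arch>, with an integer version.
RID_PATTERN = re.compile(r"^(?P<distro>[a-z]+)\.(?P<version>\d+)-(?P<arch>[a-z0-9]+)$")

# The only hard-coded data: facts that cannot be derived from the name.
# Hypothetical table for illustration.
DISTRO_LIBC = {"fedora": "glibc", "rhel": "glibc", "alpine": "musl"}


def derived_fallbacks(rid, oldest_version=1):
    """Derive a fallback chain from the RID name instead of a graph file."""
    match = RID_PATTERN.match(rid)
    if match is None:
        # Not a versioned distro RID (for example linux-x64): nothing to derive.
        return [rid]
    distro = match["distro"]
    version = int(match["version"])
    arch = match["arch"]
    # Each distro version is assumed compatible with assets built for older versions.
    chain = [f"{distro}.{v}-{arch}" for v in range(version, oldest_version - 1, -1)]
    chain.append(f"{distro}-{arch}")
    base = "linux-musl" if DISTRO_LIBC.get(distro) == "musl" else "linux"
    chain.append(f"{base}-{arch}")
    return chain


print(derived_fallbacks("fedora.37-x64", oldest_version=36))
# ['fedora.37-x64', 'fedora.36-x64', 'fedora-x64', 'linux-x64']
```

With this kind of rule, Fedora 37 picks up a Fedora 36 asset without anyone editing a graph, at the cost of the parsing and versioning logic that makes this the more complex option.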


**Runtime and host packs** -- The .NET SDK doesn't include all assets that are required for all scenarios, but instead relies on many assets being available on a NuGet feed (nuget.org or otherwise). That's a good design point for a variety of reasons. Runtime packs are a concrete scenario, which are used for building self-contained apps. Red Hat should be able to (A) build RHEL-specific runtime packages, (B) publish them to a feed of their choice, and (C) enable their users to download them by specifying a RHEL-specific RID as part of a typical .NET CLI workflow. RHEL-specific runtime packs would only be compatible with Red Hat Enterprise Linux family OSes. For example, the RHEL-specific runtime for RHEL 8 would only support the default OpenSSL version that is provided in the RHEL 8 package feed.

**Host RID support** -- The Red Hat provided host would need to probe for assets that might be included in the app, much like was described earlier. It would use the following scheme, here specific to x64.

- `rhel.8-x64`
- `linux-x64`
- `unix`
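The probing order above can be sketched in a few lines of Python. This is a simplified model rather than the host's actual implementation; `probe_order`, `find_asset`, and the sample asset paths are hypothetical names chosen for the example.

```python
def probe_order(distro_rid):
    """Return the RIDs to probe, most specific first: distro RID, linux-<arch>, unix."""
    arch = distro_rid.rsplit("-", 1)[1]
    return [distro_rid, f"linux-{arch}", "unix"]


def find_asset(app_assets, distro_rid):
    """Return (rid, path) for the first probed RID that the app carries, else None."""
    for rid in probe_order(distro_rid):
        if rid in app_assets:
            return rid, app_assets[rid]
    return None


# An app that carries only portable assets (paths are made up for illustration).
assets = {
    "linux-x64": "runtimes/linux-x64/native/libfoo.so",
    "unix": "runtimes/unix/lib/foo.dll",
}

print(probe_order("rhel.8-x64"))         # ['rhel.8-x64', 'linux-x64', 'unix']
print(find_asset(assets, "rhel.8-x64"))  # ('linux-x64', 'runtimes/linux-x64/native/libfoo.so')
```

An app that also carried a `rhel.8-x64` asset would have that asset selected first on a RHEL 8 host, while the same app on another distro would continue to use `linux-x64`.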

A source-built RID (here `rhel.8-x64`) will typically be distro-specific and versioned. Red Hat would naturally want to build .NET specifically and separately for RHEL versions, like RHEL 7 and RHEL 8, since the packages available in each of their versioned Red Hat package feeds will be different. There might be other differences to account for as well. An unversioned RID (like `rhel-x64`) would not enable that.

Member

@am11 am11 Apr 17, 2022

> Red Hat would naturally want to build .NET specifically and separately for RHEL versions,

Do we know if other programming platforms (Go, Python, Node.js etc.) also use RH specific triplets? If they do not (which I think is the case), then why do we see it as a natural requirement for .NET, specifically?

Member

Go is well known for its ability to produce native executables for different platforms.
It has no dependency on a C library; instead it has its own Go implementation that makes syscalls directly.
Consequently it doesn't need to describe this dependency as a rid.

I don't know the story for building native executables from Python/JavaScript.


This means that we'll have two forms of RIDs:

- A basic, default, form that is distro-agnostic and unversioned, like `linux-x64`, oriented on broad compatibility.
- A specific, source-build, form, that is distro-specific and versioned, like `rhel.8-x64`, oriented on distro version specialization (and matching distro practices).

Consider these two commands:

```bash
dotnet build --self-contained
dotnet build --self-contained -r rhel.8-x64
```

On a RHEL 8 machine (using a RH-provided .NET), those commands are intended to be equivalent, and to produce a RHEL 8 specific .NET app.

The `rhel.8-x64` RID needs to have a compatibility relationship with `linux-x64` for the purpose of NuGet, since the former will typically not be present in NuGet packages. Source-build users can [add new RIDs as part of the build](https://github.com/dotnet/runtime/pull/50818) to enable this scenario.

This same technique can enable restoring `rhel.8-x64` NuGet assets if they exist. Those assets would not be restorable on other systems. If there are scenarios where RHEL 8 apps need to be composed on other systems, we can consider how to resolve that.

There are a few gaps that need to be resolved before we can deliver on this model, such as:

- The various packs have a pre-defined naming scheme, including the string "Microsoft". That may or may not be OK.
- Packs are assumed to be on NuGet.org and can (currently) only be published by Microsoft.
- Packs published to other feeds may require the use of [Package Source Mapping](https://docs.microsoft.com/nuget/consume-packages/package-source-mapping), which may be challenging and would likely need support in the .NET CLI.
- All of these scenarios are currently NuGet-oriented. There may be cases where it makes sense to deploy packs via a package manager feed, but to still enable the typical associated workflows.

We need to work through these and other topics with Red Hat and other source-build users.

## Related topics

The RID topic is quite broad. This proposal is intended to simplify an important aspect of RID handling in the product. There are others to consider that are outside of the scope of the proposal. They are listed for interest and to inspire further improvements.

- [Switch to building RID-specific apps by default](https://github.com/dotnet/sdk/issues/23540).
- Enable multi-pass builds where multiple RIDs are specified, much like multi-targeting for TFMs.
- Enable building RID-specific packages as a first-class .NET CLI experience.
- Enable building RID-split packages (for example a `SkiaSharp` parent package with RID-specific dependencies).
- Good experience for using RIDs with `dotnet restore`.