MSC3255: Use SRV record for homeservers discovery by clients #3255
# Proposal to leverage SRV records to discover homeservers from clients

Currently, the [specifications on server discovery by clients](https://spec.matrix.org/unstable/client-server-api/#server-discovery) merely mention the use of the `/.well-known/matrix/` HTTP path.

Review comment: It should also be noted that

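For readers unfamiliar with the `.well-known` mechanism referenced above: a client fetches `https://<server_name>/.well-known/matrix/client` and reads `m.homeserver.base_url` from the JSON body. The Python sketch below covers only the parsing step, under simplifying assumptions. The helper name `parse_well_known` is hypothetical, the HTTP request is left out, and the validation steps the spec requires after discovery are not shown.

```python
import json
from typing import Optional


def parse_well_known(body: str) -> Optional[str]:
    """Return the homeserver base URL from a .well-known/matrix/client body.

    Hypothetical helper for illustration only. Returns None when the body
    is not valid JSON or does not carry m.homeserver.base_url.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    homeserver = data.get("m.homeserver")
    if not isinstance(homeserver, dict):
        return None
    base_url = homeserver.get("base_url")
    if not isinstance(base_url, str) or not base_url:
        return None
    # Drop a trailing slash so API paths can be appended uniformly.
    return base_url.rstrip("/")
```

A caller would pass in the raw response body, for example `parse_well_known('{"m.homeserver": {"base_url": "https://matrix.example.com"}}')`.
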
This contradicts the [specifications on server discovery by servers](https://spec.matrix.org/unstable/server-server-api/#server-discovery), which also leverage the existence of an SRV record.

Review comment: My experience of the use of SRV in server-server discovery is that the number of times it is useful there is tiny, and its existence repeatedly confuses people into thinking it is something they want or need. Irrespective of the practicalities of supporting this in web-based clients, I would be very much opposed to adding this complication to the C-S API, particularly given there doesn't seem to be a particularly compelling reason for it.

Review comment: There's no contradiction here. The Server-Server API is exposed at an unrelated set of endpoints and can even be served from a different host than the Client-Server API.

Furthermore, this oddity makes the user-facing side, arguably the most crucial part for technology
adoption, the most complicated one when the Matrix instance operator does not use an HTTP path.
For instance, when an instance operator uses a `_matrix._tcp.example.org` SRV record
pointing to an `example.com` instance on port `8448`, the `example.org` hostname must be used
together with the port of the instance (`example.com`) for a client to find the homeserver.

Review comment: That's pretty fragile and not very flexible. If I want to expose my Client-Server API at matrix.example.com (with Server-Server API still running at example.com:8448), what do I do?

Review comment: This part was merely describing what address is to be used today when you do not leverage the

Review comment: I'm afraid we're not on the same page. If my MXID is

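To make the record in the example above concrete, it could be written in zone-file form as `_matrix._tcp.example.org. 3600 IN SRV 10 5 8448 example.com.`. The TTL, priority and weight values here are illustrative assumptions; only the name, port and target come from the proposal text. The Python sketch below shows what a client would extract from such a record. `SrvRecord`, `parse_srv_rdata` and `srv_name` are hypothetical names, and the actual DNS query is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def srv_name(server_name: str) -> str:
    """DNS name queried for the SRV record named in the proposal."""
    return f"_matrix._tcp.{server_name}"


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse SRV RDATA in presentation format: 'priority weight port target'."""
    fields = rdata.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    priority, weight, port = (int(field) for field in fields[:3])
    # The trailing dot marks a fully qualified name; drop it for display.
    return SrvRecord(priority, weight, port, fields[3].rstrip("."))


record = parse_srv_rdata("10 5 8448 example.com.")
print(srv_name("example.org"), "->", f"{record.target}:{record.port}")
```

The record carries both the target host and the port, which is the property the proposal relies on.
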
This oddity becomes more striking when you consider the technologies at hand: the hierarchical and
distributed design of DNS, together with the usual caching done by recursive (non-authoritative) resolvers
spread amongst different operators, means that extra DNS requests usually have little to no impact
on a target instance. On the other hand, since scaling HTTP endpoints remains in the hands of the end-of-line
operators, extra HTTP requests _do_ have an immediate impact on them.

Review comment: If you are unable to handle scaling for what is essentially a static file, I'd question your ability to handle scaling for your Matrix server.

Review comment: This comment misses the point.

Review comment: The point is that we don't look at things in isolation, but we look at things in the context of the larger system. Yes, DNS scales better than .well-known. But given that clients only check the .well-known at login/registration time, and .well-known is essentially a static file (and is a static file in many setups), scaling the actual Matrix server is a much bigger issue than scaling the .well-known server, and an admin who is able to scale the Matrix server should also be able to scale the .well-known server. So even though DNS does scale better than .well-known, and there is no argument about that, it just isn't a compelling argument for switching to DNS in this situation, especially in balance with other arguments for sticking with .well-known.

## Proposal

The `SRV` record shall be used as specified in the server -> server API for client -> server discovery.

Review comment: This isn't really enough detail for a proposal. Are you proposing to remove the

Review comment: The specs I quote include both: I never remotely implied

Review comment: The thing is, an MSC has to provide enough detail that a spec editor knows exactly how the spec should be changed, because the next step after an MSC is accepted is actually changing the spec to reflect what the MSC says. So big questions such as which one should be tried first can't be left open. (Unless you intend for it to be up to the client developer which one should be tried first, in which case you should explicitly say that. But that's probably a bad idea.)

## Tradeoffs

If the lookup order of the current server -> server API is kept, an extra DNS lookup will only be made if the HTTP one fails.
If it is decided that the `SRV` lookup shall be done first, the cost would still be minimal,
DNS being designed to scale.

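The first ordering mentioned above (HTTP lookup first, `SRV` only on failure) can be sketched as follows. This is an illustration under stated assumptions, not something the proposal specifies: the function name `discover_homeserver` and both injected lookup callables are hypothetical, the `https://host:port` form of the result is an assumption, and questions such as which hostname the TLS certificate must match are left out.

```python
from typing import Callable, Optional, Tuple

# Hypothetical lookup signatures: each returns None on failure.
WellKnownLookup = Callable[[str], Optional[str]]
SrvLookup = Callable[[str], Optional[Tuple[str, int]]]


def discover_homeserver(
    server_name: str,
    well_known: WellKnownLookup,
    srv: SrvLookup,
) -> Optional[str]:
    """Try .well-known first, then fall back to the SRV record."""
    base_url = well_known(server_name)
    if base_url is not None:
        return base_url
    # The DNS query only happens when the HTTP lookup failed.
    target = srv(f"_matrix._tcp.{server_name}")
    if target is not None:
        host, port = target
        return f"https://{host}:{port}"
    return None


# Usage with stub lookups standing in for real HTTP and DNS calls.
print(discover_homeserver(
    "example.org",
    well_known=lambda name: None,
    srv=lambda name: ("example.com", 8448),
))
```

Passing the lookups in as parameters keeps the ordering logic separate from the network code, so swapping the order to `SRV` first only means reordering the two calls.
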
## Additional information

This proposal would help improve clients, like Element, cf. [vector-im/element-web#15054](https://github.com/vector-im/element-web/issues/15054#issuecomment-681969376)

Review comment: Given that the linked issue relates to Element Web, and we've already established that Element Web cannot make use of SRV records, that doesn't seem relevant here. (Also, the linked issue turned out to be a misconfiguration rather than an actual problem.)

Review comment: As already pointed out, I am afraid this Additional information section actually hurts the proposal, because the issue belongs to one implementation which made the implementation choice of a browser framework which kills the

Review comment: Yes, I think the linked issue can be dropped from this proposal.

Review comment: From what I know, this MSC is impossible. The difference exists because browsers are unable to access any DNS-related data, so it is unfortunately impossible for them to get the SRV record. Apart from that, the SRV record is, from what I heard, more or less deprecated, and well-known should be preferred. On the deprecation, though, the spec core team should correct me if I am wrong.

Review comment: Impossible for browser-based clients, aka Web clients, which the reference implementation Element belongs to. In that respect, I admit my additional note might hurt my proposal a tad.

However, that merely questions the soundness of using a browser framework as a foundation for clients. The specifications being broader than any choice made during implementation, I see nothing in this MSC that is impossible or invalid.

Review comment: I think we've been there before. I suggest looking at MSC1708 (it's for federation but the rationale is equally applicable to client-server interaction). Also, the original MSC for `.well-known` lookup in clients says this in the rationale (aside from the point mentioned above about browser-based clients being unable to use SRV records):

Regarding the soundness of using a browser framework - I personally can relate to that but nobody's going to rewrite Element Web outside of the web, and it's the most used Matrix client for now and in the foreseeable future. Not supporting web implementations is a deal-breaker, I'm afraid.

Review comment: I think you have that backwards. We are not saying to use `.well-known` because we are only thinking about browser-based clients. We are saying that browser-based clients are something that we want people to be able to create, and so things that we add to the spec should take browser-based clients into consideration unless there's a really good reason to omit support for browser-based clients.

As a more concrete example, there is an MSC for a low-bandwidth CS API, which is something that browser-based clients cannot use, but is being pursued because it gives an actual practical benefit. On the other hand, browser-based clients aren't really disadvantaged by it because they can still use the normal HTTP-based API. So it has a big benefit + no disadvantage to browser-based clients means that nobody is going to raise the "it doesn't support browser-based clients" flag on that MSC.

However, this proposal has little benefit, and prevents use by browser-based clients, hence the "it doesn't support browser-based clients" flag gets raised here.

Review comment: I guess my looking glass was putting more concern on the users and the operators of instances than on the clients' developers.

Having a service you can delegate to any host, any port, is flexible, and opens possibilities. The `SRV` record is there to make sure resolution for a specific service encompasses all the information required to connect to said service, hence address and port. It is an elegant, long-lasting feature of the DNS standard, serving this exact purpose.

The way I see it, the `.well-known` feature is trying to reimplement/reinvent that name resolution ability on the HTTP layer.

As you pointed out in your own comment, if one wants to support browser-based clients today, configuration of a `well-known` path seems somewhat mandatory. This proposal doesn't remove anything from that, hence such clients can continue being served, provided the instances' management takes special care of them.

The benefit here is to manage service resolution at the proper location, in the name resolution protocol, allowing strong decoupling between actual application layers and the piping leading to them.

This leads to simpler names being used in clients (helps users and adoption), and name resolution can potentially be managed exclusively in DNS if operators wish so (helps operators).

Review comment: The issue with this is mainly that if we add SRV, there will be people adding only that and complaining on the element-web repo or other web-based client repos that it doesn't work, simply because this makes "normal admins" think that this solves it for all clients, while it clearly cannot, whereas well-known achieves a uniform experience for all clients and therefore users.

I don't think this makes much of a difference in this protocol, simply because it is an HTTP-based protocol. If it were a mainly TCP- or UDP-based protocol, DNS would certainly make sense, but for Matrix, HTTP is the common ground that's already a hard requirement anyway, so it can be expected to work with any client and server setup, while DNS will not always work due to browser limitations.

well-known already solved this.

I don't think this is actually true as, like said above (and also by others), this would exclude proper usage from any web or Electron client, which is quite a substantial amount of clients at this time.

While it doesn't remove them, I think this will lead to even more confusion on this process than already exists.

well-known does the same thing according to RFC 8615 (the well-known RFC).

I honestly don't see the benefit for operators or users here, as it just adds confusion and more possible configuration mistakes because people aren't aware that browser applications don't have access to DNS requests.