Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Request & follow redirects for /media/v3/download #16701

Merged
merged 5 commits into from
Nov 29, 2023

Conversation

clokep
Copy link
Member

@clokep clokep commented Nov 28, 2023

This has two mostly unrelated changes to it:

  • Request /media/v3/download over federation (falling back to /media/r0/download)
  • When using the v3 endpoint, add allow_redirects from MSC3860

I tested this by setting up a server locally and doing:

curl --header "Authorization: Bearer $TOK" http://localhost:8080/_matrix/media/v3/download/beeper.com/18850ea089e0ecc16d7db55527925b43ad63295c

Which had the following logs:

synapse.http.matrixfederationclient - 665 - DEBUG - GET-0 - {GET-O-1} [beeper.com] Sending request: GET matrix-federation://beeper.com/_matrix/media/v3/download/beeper.com/18850ea089e0ecc16d7db55527925b43ad63295c?allow_remote=false&timeout_ms=20000&allow_redirect=true; timeout 60.000000s
synapse.http.matrixfederationclient - 665 - DEBUG - GET-0 - {GET-O-2} [beeper.com] Sending request: GET https://thumbnails.eu-central-2.media.beeper.com/8ca/7d9a66bed41b67440ab3807a03e70e22ac70b; timeout 60.000000s
synapse.http.matrixfederationclient - 716 - DEBUG - GET-0 - {GET-O-2} [beeper.com] Got response headers: 200 OK
synapse.http.matrixfederationclient - 1491 - INFO - GET-0 - {GET-O-1} [beeper.com] Completed: 200 OK [140970 bytes] GET matrix-federation://beeper.com/_matrix/media/v3/download/beeper.com/18850ea089e0ecc16d7db55527925b43ad63295c?allow_remote=false&timeout_ms=20000&allow_redirect=true

You can see it attempts to fetch the media from Beeper's homeserver, gets a redirect to a CDN and then fetches it from there instead.

In order to accomplish this I reworked the fetching of downloaded media to be more similar to other federation requests. You can compare the solid to the dotted line. The main goal of this was to add the fallback from v3 -> r0.

flowchart
    A[MediaRepository._download_remote_file]
    B[FederationClient.download_media]
    C1[TransportLayerClient.download_media_r0]
    C2[TransportLayerClient.download_media_v3]
    D[MatrixFederationHttpClient.get_file]
    A --> D

    A -.-> B -.-> C2 -.-> D
    B -.-> C1 -.-> D
Loading

Fixes #15196, part of #15661.

@@ -714,6 +720,26 @@ async def _send_request(
response.code,
response_phrase,
)
elif (
response.code in (307, 308)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about wrapping the agent in a RedirectAgent, but that handles additional codes sadly.

Comment on lines +164 to 168
_generate_uri: bool = True
"""True to automatically generate the uri field based on the above information.

Set to False if manually configuring the URI.
"""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out a better way to force attrs to do what I want. Using a default/factory works if you're either just creating the instance or evolving it and updating the URI, but not when evolving it and wanting to generate the URI again. I figured being explicit was best.

@clokep clokep marked this pull request as ready for review November 29, 2023 14:59
@clokep clokep requested a review from a team as a code owner November 29, 2023 14:59
Comment on lines +261 to +264
def write_err(f: Failure) -> Failure:
f.trap(HttpResponseException)
output_stream.write(f.value.response)
return f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was new to me: https://docs.twisted.org/en/stable/api/twisted.python.failure.Failure.html#trap

TL;DR the trap call is a no-op if f contains an HTTPResponseException; otherwise the trap raises immediately so that the next errback can handle this Failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd missed that this was in test code though. Why do we need to add this as an errback all of a sudden?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add it because we know call errback sometimes on the list of Deferreds so that we can resolve a request with an error instead of a response.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, I think I see: we never called errback on the mock until now!

new_uri = urllib.parse.urljoin(request.uri, location)

return await self._send_request(
attr.evolve(request, uri=new_uri, generate_uri=False),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's spelled _generate_uri in the struct definition. Is this some attrs way of declaring a private attribute that you can initialise via the constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this some attrs way of declaring a private attribute that you can initialise via the constructor?

Yes, pretty much. attrs strips the preceding underscores to let you have 'private' attributes that can be initialized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clokep and others added 2 commits November 29, 2023 13:37
Co-authored-by: David Robertson <davidr@element.io>
Copy link
Contributor

@DMRobertson DMRobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, thanks!

@clokep clokep enabled auto-merge (squash) November 29, 2023 18:59
@clokep
Copy link
Member Author

clokep commented Nov 29, 2023

Something like matrix-org/synapse-s3-storage-provider#107 might be useful if you're interested in supporting this for the Client-Server API. Implementing it in Synapse proper seems out of scope?

@clokep clokep merged commit d6c3b75 into develop Nov 29, 2023
41 checks passed
@clokep clokep deleted the clokep/media-redirects branch November 29, 2023 19:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Synapse calls /r0 for media over federation
2 participants