This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Proposal: mitigate extremities accumulation using lazy-transmitted dummy events #5319

Closed
ara4n opened this issue Jun 3, 2019 · 4 comments
Labels
A-Performance Performance, both client-facing and admin-facing

Comments

@ara4n
Member

ara4n commented Jun 3, 2019

This is a proposal for a pragmatic way to mitigate extremity-build-up without being blocked on the longer term goals of speeding up state res via delta state res (#3122) or chunks (#3226, #3240 etc).

  • We expose an experimental setting in Synapse to let it automatically heal extremities.
  • When Synapse sees more than N forward extremities in a room, it can synthesise an m.dummy timeline event for the room, which has parents to the current extremity set.
  • This event would not be proactively transmitted over federation, but instead stored locally and used to speed up your local server's state resolution for rooms which are heavily fragmented but have lurking local users (i.e. the typical source of extremity buildup; see #1760, "Forward extremities accumulate and lead to poor performance").
  • The next time a user sends a real message in the room, any queued dummy event(s) would be sent along with the message. Servers could potentially filter these out from being sent to clients, given they're not meaningful to clients, but this wouldn't be a hard requirement.
  • You would end up typically having at least 2 extremities for the room (the dummy event, plus whatever traffic is coming in over federation), but this is better than extremities growing unbounded.
  • The sender of the dummy event could be the server itself (perhaps using the @:foo.com convention proposed in MSC1777). However, this would necessitate a new room version, so it might be easier to just send on behalf of an arbitrary local user instead (picking one who has permission to speak; if none have permission to speak, then the dummy events can just accumulate and won't be sent until a local user gets permission to do so).
  • N needs to be picked to be less than or equal to the maximum parents for a given event (currently 10), and less than the threshold which starts to pose performance problems for state res in a typical room. However, it should also be as high as possible, to avoid too many dummy events getting created, and risking a traffic storm developing. N=5 might be a good bet?
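As a rough illustration of the steps above, here is a minimal Python sketch. It is not Synapse code: the `Event` and `Room` classes, the `maybe_heal` and `send_message` helpers, and the event ID scheme are all hypothetical, and the sketch assumes the values mentioned in the proposal (N=5, at most 10 parents per event). It only shows the bookkeeping: a dummy event is created locally once the extremity count exceeds N, it is queued rather than sent, and it is flushed alongside the next real message.

```python
from dataclasses import dataclass, field
from itertools import count

MAX_PREV_EVENTS = 10  # maximum parents per event, as stated in the proposal
N = 5                 # threshold suggested in the proposal

_ids = count(1)       # hypothetical local event ID generator


@dataclass
class Event:
    event_id: str
    type: str
    prev_events: list
    content: dict = field(default_factory=dict)


@dataclass
class Room:
    extremities: set = field(default_factory=set)      # forward extremity event IDs
    queued_dummies: list = field(default_factory=list)  # dummies not yet federated


def maybe_heal(room, threshold=N):
    """Create a local-only dummy event if there are more than `threshold` extremities."""
    if len(room.extremities) <= threshold:
        return None
    parents = sorted(room.extremities)[:MAX_PREV_EVENTS]
    dummy = Event(f"$dummy{next(_ids)}", "m.dummy", parents)
    # The dummy replaces the extremities it references.
    room.extremities -= set(parents)
    room.extremities.add(dummy.event_id)
    # Stored locally; not proactively sent over federation.
    room.queued_dummies.append(dummy)
    return dummy


def send_message(room, body):
    """A local user speaks: return queued dummies followed by the real message."""
    parents = sorted(room.extremities)[:MAX_PREV_EVENTS]
    msg = Event(f"$msg{next(_ids)}", "m.room.message", parents, {"body": body})
    outgoing = room.queued_dummies + [msg]
    room.queued_dummies = []
    room.extremities -= set(parents)
    room.extremities.add(msg.event_id)
    return outgoing
```

In this sketch a room with seven extremities collapses to one after `maybe_heal`, and a later `send_message` returns the dummy followed by the real message, which is the "lazy transmission" the proposal describes.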

In the past, I think this approach was dismissed because:

  • Of the risk of a traffic storm as servers synthesise dummy events in response to fragmentation, which could cause an amplification attack of sorts, especially if the dummy events themselves created new fragmentation.
    • However, this proposal hopefully solves this by not proactively sending the dummy events, as well as reducing the risk of a storm by only creating dummy events every N extremities. (It's worth noting that the global rate of creating dummy events could still exceed the rate at which extremities are locally generated however, given each server in the room might end up synthesising them).
  • Of the risk that the additional DAG connectedness might increase the risk of problematic state resolution (e.g. state resets, or pathologically slow state resolution).
    • Hopefully v2 state res mitigates this, though.
@ara4n ara4n changed the title Proposal: mitigate extremities using lazy-transmitted dummy events Proposal: mitigate extremities accumulation using lazy-transmitted dummy events Jun 3, 2019
@turt2live
Member

One concern with sending the events to clients would be them flagging rooms as unread (the "why is this room bold?" bug). Making it a hard requirement to not do that would be appreciated.

@neilisfragile neilisfragile added A-Performance Performance, both client-facing and admin-facing p1 labels Jun 7, 2019
@ara4n
Member Author

ara4n commented Jun 12, 2019

well, clients shouldn't consider rooms unread for event types they don't know about (and we need to fix that in general for Matrix). Edit: plus the server could just ignore dummy events when calculating unread counts.
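The server-side option could look something like the sketch below. This is only an illustration under stated assumptions: `unread_count` and its arguments are hypothetical names, events are modelled as plain dicts, and `m.dummy` is the event type used in the proposal above.

```python
DUMMY_EVENT_TYPE = "m.dummy"  # event type named in the proposal


def unread_count(events, last_read_index):
    """Count events after the read marker, skipping dummy events."""
    return sum(
        1
        for event in events[last_read_index + 1:]
        if event["type"] != DUMMY_EVENT_TYPE
    )
```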

On discussing this with @erikjohnston another potential issue cropped up: he's worried that when the dummy event(s) eventually do get sent from the fragmented room by the lurker speaking there, they are going to have loads of parents which could spiderweb over the whole history of the room. This will cause other servers in the room to go madly fishing around for the events in question, even if they predate that server's participation in the room, and will increase the computational complexity of state resolution due to needing to model all the paths. Therefore it might be better to intelligently prune extremities where possible rather than using dummy events. I'm writing the new proposal up as another bug.

@ara4n
Member Author

ara4n commented Sep 3, 2019

this was implemented in #5480

@richvdh
Member

richvdh commented Oct 2, 2019

> this was implemented in #5480

yes it was

@richvdh richvdh closed this as completed Oct 2, 2019