Sliding Sync: Pre-populate room data for quick filtering/sorting #17512

MadLittleMods · 2024-07-31T23:42:00Z

Pre-populate room data for quick filtering/sorting in the Sliding Sync API

This PR is acting as the Synapse version N+1 step in the gradual migration being tracked by #17623

Adding two new database tables:

sliding_sync_joined_rooms: A table for storing metadata for rooms that the local server is still participating in. The info here can be shared across all Membership.JOIN. Keyed on (room_id) and updated when the relevant room current state changes or a new event is sent in the room.
sliding_sync_membership_snapshots: A table for storing a snapshot of room metadata at the time of the local user's membership. Keyed on (room_id, user_id) and only updated when a user's membership in a room changes.
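
To make the keying concrete, here is a rough sketch of the two tables using Python's built-in `sqlite3`. This is a simplified illustration and not the actual schema delta: only columns mentioned somewhere in this PR are included, the `room_name` column name and all column types are assumptions, and the helper functions are hypothetical.

```python
import sqlite3
from typing import List

# Simplified, hypothetical schema: column types and `room_name` are assumptions.
SCHEMA = """
CREATE TABLE sliding_sync_joined_rooms (
    room_id TEXT NOT NULL PRIMARY KEY,
    event_stream_ordering BIGINT NOT NULL,
    bump_stamp BIGINT,
    room_name TEXT
);
CREATE TABLE sliding_sync_membership_snapshots (
    room_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    membership_event_id TEXT NOT NULL,
    membership TEXT NOT NULL,
    event_stream_ordering BIGINT NOT NULL,
    instance_name TEXT,
    forgotten INTEGER NOT NULL DEFAULT 0,
    room_name TEXT,
    PRIMARY KEY (room_id, user_id)
);
"""


def create_tables(conn: sqlite3.Connection) -> None:
    """Create both (simplified) sliding sync tables."""
    conn.executescript(SCHEMA)


def primary_key_columns(conn: sqlite3.Connection, table: str) -> List[str]:
    """Return the primary key columns of `table`, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); `pk` is the
    # 1-based position within the primary key, or 0 if not part of it.
    key_rows = sorted((row for row in rows if row[5] > 0), key=lambda row: row[5])
    return [row[1] for row in key_rows]
```

One row per room in the first table and one row per (room, local user) pair in the second is what lets the joined-room data be shared across everyone joined to the room.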

Also adds background updates to populate these tables with all of the existing data.

We want to have the guarantee that if a row exists in the sliding sync tables, we are able to rely on it (accurate data). And if a row doesn't exist, we use a fallback to get the same info until the background updates fill in the rows or a new event comes in triggering it to be fully inserted. This means we need a couple of extra things in place until we bump SCHEMA_COMPAT_VERSION and run the foreground update in the N+2 part of the gradual migration. For context on why we can't rely on the tables without these things, see [1].
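
The "trust the row if it exists, otherwise fall back" rule can be sketched roughly as follows. This is only an illustration under assumptions: `fallback_room_state` is a hypothetical stand-in for wherever the slower lookup reads from, `room_name` is an assumed column name, and neither function is part of Synapse.

```python
import sqlite3
from typing import Optional

# Hypothetical, simplified tables for illustration only.
SCHEMA = """
CREATE TABLE sliding_sync_joined_rooms (room_id TEXT PRIMARY KEY, room_name TEXT);
CREATE TABLE fallback_room_state (room_id TEXT PRIMARY KEY, name TEXT);
"""


def get_room_name_fallback(conn: sqlite3.Connection, room_id: str) -> Optional[str]:
    """Slow path: look the name up from the (hypothetical) fallback source."""
    row = conn.execute(
        "SELECT name FROM fallback_room_state WHERE room_id = ?", (room_id,)
    ).fetchone()
    return row[0] if row is not None else None


def get_room_name(conn: sqlite3.Connection, room_id: str) -> Optional[str]:
    """Fast path first; only fall back when no sliding sync row exists."""
    row = conn.execute(
        "SELECT room_name FROM sliding_sync_joined_rooms WHERE room_id = ?",
        (room_id,),
    ).fetchone()
    if row is not None:
        # The row exists, so it was fully inserted and can be trusted,
        # even when the stored name itself is NULL.
        return row[0]
    # No row yet: the background update has not reached this room.
    return get_room_name_fallback(conn, room_id)
```

Note that the existence of the row, not the value inside it, decides which path is taken. That only works if rows are never partially inserted.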

On start-up, block until we clear out any rows for the rooms that have had events since the max-stream_ordering of the sliding_sync_joined_rooms table (compare to max-stream_ordering of the events table). For sliding_sync_membership_snapshots, we can compare to the max-stream_ordering of local_current_membership.
- This accounts for when someone downgrades their Synapse version and then upgrades it again. This will ensure that we don't have any stale/out-of-date data in the sliding_sync_joined_rooms/sliding_sync_membership_snapshots tables since any new events sent in rooms would have also needed to be written to the sliding sync tables. For example, a new event needs to bump event_stream_ordering in the sliding_sync_joined_rooms table, and a state change in the room (like the room name) needs to be reflected there too. Similarly, someone's membership changing in a room affects sliding_sync_membership_snapshots.
Add another background update that will catch up with any rows that were just deleted from the sliding sync tables (based on the activity in the events/local_current_membership tables). The rooms that need recalculating are added to the sliding_sync_joined_rooms_to_recalculate table.
Make sure rows are fully inserted. Instead of partially inserting, we need to check if the row already exists and fully insert all data if not.
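
As a rough sketch of the start-up clear-out for sliding_sync_joined_rooms (not the actual implementation): compare the highest stream ordering the sliding sync table knows about with the events table, delete the rows for any room with newer events, and queue those rooms for recalculation. The schema is heavily simplified, the function name is hypothetical, and `INSERT OR IGNORE` is SQLite-specific syntax used here for brevity.

```python
import sqlite3
from typing import List

# Simplified stand-ins for the real tables; only the columns needed here.
SCHEMA = """
CREATE TABLE events (
    event_id TEXT PRIMARY KEY,
    room_id TEXT NOT NULL,
    stream_ordering INTEGER NOT NULL
);
CREATE TABLE sliding_sync_joined_rooms (
    room_id TEXT PRIMARY KEY,
    event_stream_ordering INTEGER NOT NULL
);
CREATE TABLE sliding_sync_joined_rooms_to_recalculate (room_id TEXT PRIMARY KEY);
"""


def clear_stale_joined_rooms(conn: sqlite3.Connection) -> List[str]:
    """Delete rows for rooms with events newer than anything the sliding
    sync table has seen, and queue those rooms for recalculation."""
    (max_known,) = conn.execute(
        "SELECT MAX(event_stream_ordering) FROM sliding_sync_joined_rooms"
    ).fetchone()
    if max_known is None:
        # Nothing stored yet, so there is nothing that could be stale.
        return []

    # Materialize the list before issuing further statements.
    stale_room_ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT room_id FROM events"
            " WHERE stream_ordering > ? ORDER BY room_id",
            (max_known,),
        )
    ]
    for room_id in stale_room_ids:
        conn.execute(
            "DELETE FROM sliding_sync_joined_rooms WHERE room_id = ?", (room_id,)
        )
        # SQLite-specific; keeps the queue free of duplicates.
        conn.execute(
            "INSERT OR IGNORE INTO sliding_sync_joined_rooms_to_recalculate"
            " (room_id) VALUES (?)",
            (room_id,),
        )
    conn.commit()
    return stale_room_ids
```

After this runs, a missing row simply means "use the fallback" until the catch-up background update recalculates the queued rooms.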

All of this extra functionality can be removed once the SCHEMA_COMPAT_VERSION is bumped with support for the new sliding sync tables so people can no longer downgrade (the N+2 part of the gradual migration).

^[1]

For sliding_sync_joined_rooms, since we partially insert rows as state comes in, we can't rely on the existence of the row for a given room_id. We can't even rely on looking at whether the background update has finished. There could still be partial rows from when someone reverted their Synapse version after the background update finished, had some state changes (or new rooms), then upgraded again and more state changes happen leaving a partial row.

For sliding_sync_membership_snapshots, we insert items as a whole except for the forgotten column, so we can rely on rows existing and just need to always use a fallback for the forgotten data. We can't use the forgotten column in the table for the same reasons outlined above for sliding_sync_joined_rooms: we could have an out-of-date membership from when someone reverted their Synapse version.

Discussed in an internal meeting

TODO

Dev notes

SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase

Reference:

Development docs on background updates and worked examples of gradual migrations
A real example of a gradual migration: N + 3: Read from column full_user_id rather than user_id of tables profiles and user_filters matrix-org/synapse#15649 (comment)
Adding rooms.creator field that needed a background update to backfill data, Populate rooms.creator field for easy lookup matrix-org/synapse#10697
Adding rooms.room_version that needed a background update to backfill data, Add rooms.room_version column matrix-org/synapse#6729
Adding room_stats_state.room_type that needed a background update to backfill data, Implement MSC3827: Filtering of /publicRooms by room type matrix-org/synapse#13031
Tables from MSC2716: insertion_events, insertion_event_edges, insertion_event_extremities, batch_events
current_state_events updated in synapse/storage/databases/main/events.py

persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
    _store_event_txn
    _update_metadata_tables_txn
        _store_room_members_txn
    _update_current_state_txn

Concatenated Indexes [...] (also known as multi-column, composite or combined index)

[...] key consists of multiple columns.

We can take advantage of the fact that the first index column is always usable for searching

-- https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys
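
A small way to see the quoted point in practice, using SQLite's `EXPLAIN QUERY PLAN` from Python. This is only a sketch: the `snapshots` table and index name are hypothetical, the exact plan text varies between SQLite versions, and Postgres makes its own planning decisions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return the human-readable detail of SQLite's query plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE snapshots (room_id TEXT, user_id TEXT, membership TEXT);
    CREATE INDEX snapshots_room_user ON snapshots (room_id, user_id);
    """
)

# Filtering on the first index column can use the concatenated index...
by_room = query_plan(
    conn, "SELECT membership FROM snapshots WHERE room_id = ?", ("!a",)
)
# ...while filtering only on the second column cannot search it.
by_user = query_plan(
    conn, "SELECT membership FROM snapshots WHERE user_id = ?", ("@u",)
)

print(by_room)
print(by_user)
```

This is why a table keyed on (room_id, user_id) can serve lookups by room_id alone, whereas lookups by user_id alone would need their own index.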

Dealing with portdb (synapse/_scripts/synapse_port_db.py), #17512 (comment)

SQL queries:

Both of these are equivalent and work in SQLite and Postgres

Option 1:

WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}

Option 2:

INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE column4 != ?  -- filter on the VALUES column; Postgres cannot reference a SELECT alias in WHERE
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}

If we don't need the membership condition, we could use:

INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
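
For reference, the first option above can be exercised end to end against an in-memory SQLite database roughly like this. It is a sketch under assumptions: the schema is cut down to the columns the query touches, `room_name` is a made-up extra column standing in for `insert_keys`, the helper name is hypothetical, at least one extra key is required (as in the templates above), and SQLite 3.24+ is needed for upsert support.

```python
import sqlite3
from typing import Dict

# Cut-down, hypothetical schema; `room_name` stands in for the extra keys.
SCHEMA = """
CREATE TABLE room_memberships (event_id TEXT PRIMARY KEY, membership TEXT NOT NULL);
CREATE TABLE events (event_id TEXT PRIMARY KEY, stream_ordering INTEGER NOT NULL);
CREATE TABLE sliding_sync_non_join_memberships (
    room_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    membership_event_id TEXT NOT NULL,
    membership TEXT NOT NULL,
    event_stream_ordering INTEGER NOT NULL,
    room_name TEXT,
    PRIMARY KEY (room_id, user_id)
);
"""


def upsert_non_join_membership(
    conn: sqlite3.Connection,
    room_id: str,
    user_id: str,
    event_id: str,
    extra: Dict[str, object],
) -> None:
    """Upsert a row unless the membership event turns out to be a join."""
    insert_keys = list(extra.keys())
    insert_values = list(extra.values())
    columns = ", ".join(
        ["room_id", "user_id", "membership_event_id", "membership",
         "event_stream_ordering", *insert_keys]
    )
    placeholders = ", ".join("?" for _ in insert_values)
    updates = ", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)
    sql = f"""
        WITH data_table ({columns}) AS (
            VALUES (
                ?, ?, ?,
                (SELECT membership FROM room_memberships WHERE event_id = ?),
                (SELECT stream_ordering FROM events WHERE event_id = ?),
                {placeholders}
            )
        )
        INSERT INTO sliding_sync_non_join_memberships ({columns})
        SELECT * FROM data_table
        WHERE membership != ?
        ON CONFLICT (room_id, user_id)
        DO UPDATE SET
            membership_event_id = EXCLUDED.membership_event_id,
            membership = EXCLUDED.membership,
            event_stream_ordering = EXCLUDED.event_stream_ordering,
            {updates}
    """
    # event_id is bound three times: once as a value, twice for the sub-selects.
    args = (room_id, user_id, event_id, event_id, event_id, *insert_values, "join")
    conn.execute(sql, args)
```

The `WHERE` clause also sidesteps SQLite's parsing ambiguity when `ON CONFLICT` follows an `INSERT ... SELECT`.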

Pull Request Checklist

Pull request is based on the develop branch
Pull request includes a changelog file. The entry should:
- Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
- Use markdown where necessary, mostly for code blocks.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
Code style is correct (run the linters)

erikjohnston · 2024-08-29T15:09:48Z

Woo 🎉 🚀 🎉 🥳 🎈

MadLittleMods · 2024-08-29T15:40:44Z

@erikjohnston Super grateful that you were able to try this PR out multiple times on your own server to find all of the weird old corners in the data that we still need to deal with today 🙇

Thank you @erikjohnston and @reivilibre for all of the discussion so we can ensure that the data is reliable to use before and after the gradual migration is fully complete!

… sync tables (#17635) Fix outlier re-persisting causing problems with sliding sync tables. Follow-up to #17512. When running on `matrix.org`, we discovered that a remote invite is first persisted as an `outlier` and then re-persisted when it is de-outliered. The first time, the `outlier` is persisted with one `stream_ordering`, but when it is persisted again and de-outliered, it is assigned a different `stream_ordering` that won't end up being used. `_update_outliers_txn()` fixes this discrepancy (it always uses the `stream_ordering` from the first time the event was persisted), but since we call `_calculate_sliding_sync_table_changes()` before it, we're working with an unreliable `stream_ordering` value that will possibly be unused and not make it into the `events` table.

Based on #17629. Utilizing the new sliding sync tables added in #17512 for fast acquisition of rooms for the user and filtering/sorting. Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>

…t's faster (#17658) Get `bump_stamp` from [new sliding sync tables](#17512) which should be faster (performance) than flipping through the latest events in the room.

Start thinking about schemas

d26ac74

MadLittleMods added the A-Sync label Jul 31, 2024

MadLittleMods added 17 commits July 31, 2024 18:43

Add changelog

e7e9cb2

Use foreign keys

8392d6a

Merge branch 'develop' into madlittlemods/sliding-sync-pre-populate-r…

ad1c887

…oom-meta-data

Start of updating sliding_sync_joined_rooms

2b5f07d

Fill in sliding_sync_non_join_memberships when current state changes

1a251d5

Special treatment for boolean columns

f96d0c3

See https://github.com/element-hq/synapse/blob/1dfa59b238cee0dc62163588cc9481896c288979/docs/development/database_schema.md#boolean-columns

Test is running

2f3bd27

Server left room test

cb33580

Change to updating the latest membership in the room

87d9561

Closer to right

61cea4e

Fix comparison and insert

68a3daf

Better test assertions

5b1053f

Test non-joins

c590474

Add more tests

a1aaa47

Handle to_delete

bf78692

Handle server left room

5cf3ad3

Fix some lints

bc3796d

MadLittleMods commented Aug 8, 2024

synapse/storage/schema/main/delta/87/01_sliding_sync_memberships.sql

MadLittleMods added 10 commits August 8, 2024 15:41

Fill in stream_ordering/bump_stamp when we add current state to t…

cc2d2b6

…he joined rooms table

Fill in stream_ordering/bump_stamp for any event being persisted

ca90901

Need to fix upsert

3367422

Fix bumping when events are persisted out of order

ed47a7e

Refactor to sliding_sync_membership_snapshots

0af3b48

Update descriptions

552f8f4

Fix lints

f069659

Fill in for remote invites (out of band, outlier membership)

53232e6

Fix events from rooms we're not joined to affecting the joined room s…

ab074f5

…tream ordering

User ID is not unique because user is joined to many rooms

3e1f24e

MadLittleMods added 5 commits August 28, 2024 17:21

Fix join condition not working in Postgres

2f6ee08

``` common column name "room_id" appears more than once in left table ```

Add instance_name to sliding_sync_membership_snapshots

6622a1c

So we can craft `PersistedEventPosition(...)`

Fully-insert sliding_sync_joined_rooms rows

bcc3e50

This way if the row exists, we can rely on the information in it. And only use a fallback for rows that don't exist.

Merge branch 'develop' into madlittlemods/sliding-sync-pre-populate-r…

95d5471

…oom-meta-data Conflicts: synapse/handlers/sliding_sync/__init__.py

Explain more in schema

b63188c

erikjohnston approved these changes Aug 29, 2024

erikjohnston merged commit 1a6b718 into develop Aug 29, 2024
39 checks passed

erikjohnston deleted the madlittlemods/sliding-sync-pre-populate-room-meta-data branch August 29, 2024 15:09

MadLittleMods mentioned this pull request Aug 29, 2024

Fix background update for sliding sync (find previous membership) (v2) #17632

Merged

erikjohnston added a commit that referenced this pull request Aug 29, 2024

Fix background update for sliding sync (find previous membership) (#1…

d844afd

…7632) This reverts commit ab414f2. Introduced in #17512

This was referenced Aug 29, 2024

Sliding Sync: Fix outlier re-persisting causing problems with sliding sync tables #17635

Merged

Sliding sync: use new DB tables #17630

Merged

erikjohnston mentioned this pull request Aug 30, 2024

Sliding sync: various fixes to background update #17636

Merged

erikjohnston added a commit that referenced this pull request Sep 1, 2024

Sliding sync: various fixes to background update (#17636)

d52c17c

Follows on from #17512, other fixes include: #17633, #17634, #17635

This was referenced Sep 1, 2024

Sliding sync: Fix bg update again (v3) #17634

Merged

Fix background update to handle invalid events #17641

Merged

erikjohnston mentioned this pull request Sep 3, 2024

Sliding sync: various fixups to the background update #17652

Merged

This was referenced Sep 3, 2024

Sliding Sync: Update filters to be robust against remote invite rooms #17450

Merged

Sliding Sync: Get bump_stamp from new sliding sync tables because it's faster #17658

Merged

erikjohnston added a commit that referenced this pull request Sep 5, 2024

Fix background update to handle invalid events (#17641)

b09bcf1

Follow-up to #17634, #17631 and #17632 to fix-up #17512

erikjohnston mentioned this pull request Sep 5, 2024

Sliding sync: various fixups to the sliding sync joined room background job #17673

Merged

MadLittleMods added the A-Database label Sep 6, 2024

MadLittleMods mentioned this pull request Sep 6, 2024

Sliding Sync: Speed up background updates to populate Sliding Sync tables #17676

Open

3 tasks

erikjohnston added a commit that referenced this pull request Sep 10, 2024

Sliding sync: various fixups to the sliding sync joined room backgrou…

b3047f3

…nd job (#17673) Follow-up to #17652, #17641, #17634, #17631 and #17632 to fix-up #17512

MadLittleMods mentioned this pull request Sep 11, 2024

Sliding Sync: Support filtering by 'tags' / 'not_tags' in SSS #17662

Merged

3 tasks
