Skip to content

Commit

Permalink
Extract post context join out into temporary table
Browse files Browse the repository at this point in the history
Upgrading to MySQL 8 has enabled better performance analysis, which I've
immediately applied to the longest-running query.

The big JOIN used to gather context for posts (their user, their parent,
and their thread) forms the start of the notifiable users subquery. As
it's in the subquery, it is executed each time the subquery is run.
While it's not the whole query, it is a significant part of it.

Extracting this JOIN out to a temporary table which is only created once
will make the query more efficient overall.

The temporary table will be destroyed at the end of the connection
session, so I don't need to worry about that, or about the table
becoming out of date when new posts are downloaded as it won't exist
until after that.

The query results in a table scan on this temporary table, which may not
be the most efficient approach. That being said, I didn't find any
indexes that make it any faster - all I tried slowed it down. The query
could be hugely sped up by reducing the rows in this table i.e. the
number of posts - e.g. by limiting it by date - but that is out of scope
for now.
  • Loading branch information
rossjrw committed Jun 10, 2023
1 parent de6ef4e commit f067ac2
Show file tree
Hide file tree
Showing 3 changed files with 34 additions and 23 deletions.
1 change: 1 addition & 0 deletions notifier/database/drivers/mysql.py
Original file line number Diff line number Diff line change
Expand Up @@ -320,6 +320,7 @@ def get_user_configs(self, frequency: str) -> List[CachedUserConfig]:
return user_configs

def get_notifiable_users(self, frequency: str) -> List[str]:
self.execute_named("cache_post_context")
user_ids = [
cast(str, row["user_id"])
for row in self.execute_named(
Expand Down
24 changes: 24 additions & 0 deletions notifier/database/queries/cache_post_context.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
CREATE TEMPORARY TABLE IF NOT EXISTS post_with_context
WITH cte AS (
SELECT
post.user_id AS post_user_id,
post.posted_timestamp AS post_posted_timestamp,
parent_post.id AS parent_post_id,
parent_post.user_id AS parent_post_user_id,
thread.id AS thread_id,
first_post_in_thread.user_id AS first_post_in_thread_user_id
FROM
post
INNER JOIN
thread ON thread.id = post.thread_id
INNER JOIN
thread_first_post ON thread_first_post.thread_id = thread.id
INNER JOIN
post AS first_post_in_thread ON first_post_in_thread.id = thread_first_post.post_id
LEFT JOIN
post AS parent_post ON parent_post.id = post.parent_post_id
WHERE
-- Remove deleted threads/posts
thread.is_deleted = 0 AND post.is_deleted = 0
)
SELECT * FROM cte
Original file line number Diff line number Diff line change
Expand Up @@ -11,46 +11,32 @@ WHERE
-- Only users with a notification waiting for them
AND EXISTS (
SELECT NULL FROM
post
INNER JOIN
thread ON thread.id = post.thread_id
LEFT JOIN
post AS parent_post ON parent_post.id = post.parent_post_id
INNER JOIN
thread_first_post ON thread_first_post.thread_id = thread.id
INNER JOIN
post AS first_post_in_thread ON first_post_in_thread.id = thread_first_post.post_id
post_with_context
WHERE
-- Remove deleted posts
post.is_deleted = 0

-- Remove posts made by the user
AND post.user_id <> user_config.user_id
post_with_context.post_user_id <> user_config.user_id

-- Only posts posted since the user was last notified
AND post.posted_timestamp > user_last_notified.notified_timestamp

-- Remove deleted threads
AND thread.is_deleted = 0
AND post_with_context.post_posted_timestamp > user_last_notified.notified_timestamp

-- Only posts matching thread or post subscription criteria
AND (
-- Posts in threads started by the user
first_post_in_thread.user_id = user_config.user_id
post_with_context.first_post_in_thread_user_id = user_config.user_id

-- Replies to posts made by the user
OR parent_post.user_id = user_config.user_id
OR post_with_context.parent_post_user_id = user_config.user_id

-- Posts in threads subscribed to and replies to posts subscribed to
OR EXISTS (
SELECT NULL FROM
manual_sub
WHERE
manual_sub.user_id = user_config.user_id
AND manual_sub.thread_id = thread.id
AND manual_sub.thread_id = post_with_context.thread_id
AND (
manual_sub.post_id IS NULL -- Threads
OR manual_sub.post_id = parent_post.id -- Post replies
OR manual_sub.post_id = post_with_context.parent_post_id -- Post replies
)
AND manual_sub.sub = 1
)
Expand All @@ -62,10 +48,10 @@ WHERE
manual_sub
WHERE
manual_sub.user_id = user_config.user_id
AND manual_sub.thread_id = thread.id
AND manual_sub.thread_id = post_with_context.thread_id
AND (
manual_sub.post_id IS NULL -- Threads
OR manual_sub.post_id = parent_post.id -- Post replies
OR manual_sub.post_id = post_with_context.parent_post_id -- Post replies
)
AND manual_sub.sub = -1
)
Expand Down

0 comments on commit f067ac2

Please sign in to comment.