Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Refactor Filter to handle data differently #11194

Merged
merged 12 commits into from
Oct 27, 2021
26 changes: 13 additions & 13 deletions synapse/api/filtering.py
Original file line number Diff line number Diff line change
Expand Up @@ -321,7 +321,7 @@ def check(self, event: FilterEvent) -> bool:
"senders": lambda v: user_id == v,
"types": lambda v: "m.presence" == v,
}
contains_url = False
return self._check_fields(literal_keys)
else:
sender = event.get("sender", None)
content = event.get("content") or {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I think we generally write this this as event.get("content", {})?

If event["content"] was an empty string, False or an empty list then we'd interpret the content as an empty dictionary. That feels like it might be unintentionally suppressing a type check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done on purpose to handle those cases actually! If it is of the wrong type we really just want to pretend it is a dictionary.

Regardless this is mostly just moving code which could be improved. I'll add some more error checking though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, I guess this is an event we have already persisted and so it needs to remain as it is? (As in, we can't reject it and say "this is of the wrong format"?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right they should always have content, but event is a misnomer -- this is sometimes account data or potentially other things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make this better with 14a805c.

Expand All @@ -338,7 +338,6 @@ def check(self, event: FilterEvent) -> bool:
ev_type = event.get("type", None)

# check if there is a string url field in the content for filtering purposes
contains_url = isinstance(content.get("url"), str)
labels = content.get(EventContentFields.LABELS, [])

literal_keys = {
Expand All @@ -348,13 +347,19 @@ def check(self, event: FilterEvent) -> bool:
"labels": lambda v: v in labels,
}

return self._check_fields(literal_keys, contains_url)
result = self._check_fields(literal_keys)
if not result:
return result

def _check_fields(
self,
literal_keys: Dict[str, Callable[[str], bool]],
contains_url: bool,
) -> bool:
contains_url_filter = self.contains_url
if contains_url_filter is not None:
contains_url = isinstance(content.get("url"), str)
if contains_url_filter != contains_url:
return False

return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully follow what this code trying to do, but the refactor looks safe to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contains_url was always set to False for the user-presence case, so I inlined it in the non-user-presence case instead of keeping it in the generic function.

What the code is trying to do is match if there's a URL in the content and the filter is set to only allow content with URLs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense. And I can see that corresponds to what filters are required to do in the spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, pretty much! This flag doesn't apply to presence updates so no reason to run it (and I want to add more differentiation between these in the future...)


def _check_fields(self, literal_keys: Dict[str, Callable[[str], bool]]) -> bool:
"""Checks whether the filter matches the given event fields.
clokep marked this conversation as resolved.
Show resolved Hide resolved

Returns:
Expand All @@ -372,11 +377,6 @@ def _check_fields(
if not any(map(match_func, allowed_values)):
return False

contains_url_filter = self.contains_url
if contains_url_filter is not None:
if contains_url_filter != contains_url:
return False

return True

def filter_rooms(self, room_ids: Iterable[str]) -> Set[str]:
Expand Down