Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Export data admin command is broken. #10908

Closed
erikjohnston opened this issue Sep 24, 2021 · 6 comments · Fixed by #11078
Closed

Export data admin command is broken. #10908

erikjohnston opened this issue Sep 24, 2021 · 6 comments · Fixed by #11078
Assignees
Labels
S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@erikjohnston
Copy link
Member

Running python -m synapse.app.admin_cmd -c <config_file> export-data <user_id> results in:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/site-packages/synapse/app/admin_cmd.py", line 236, in <module>
    start(sys.argv[1:])
  File "/usr/local/lib/python3.8/site-packages/synapse/app/admin_cmd.py", line 190, in start
    and not config.worker_log_file
  File "/usr/local/lib/python3.8/site-packages/synapse/config/_base.py", line 340, in __getattr__
    return self._get_unclassed_config(None, item)
  File "/usr/local/lib/python3.8/site-packages/synapse/config/_base.py", line 363, in _get_unclassed_config
    raise AttributeError(item, "not found in %s" % (list(self._configs.keys()),))
AttributeError: ('worker_log_file', "not found in ['modules', 'server', 'tls', 'federation', 'caches', 'database', 'logging', 'ratelimiting', 'media', 'oembed', 'captcha', 'voip', 'registration', 'account_validity', 'metrics', 'api', 'appservice', 'key', 'saml2', 'oidc', 'cas', 'sso', 'jwt', 'auth', 'email', 'authproviders', 'push', 'spamchecker', 'room', 'groups', 'userdirectory', 'consent', 'stats', 'servernotices', 'roomdirectory', 'thirdpartyrules', 'tracing', 'worker', 'redis', 'experimental']")

The effect of this is that its currently not possible to export a users data, e.g due to a GDPR request.

c.f. #10900

@erikjohnston erikjohnston added S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Sep 24, 2021
@dklimpel
Copy link
Contributor

dklimpel commented Sep 28, 2021

When I am ty to fix it by removing this line:

and not config.worker.worker_log_file

I get this new error:

2021-09-28 14:59:19,138 - synapse.handlers.admin - 107 - INFO - command - [@user:server] Handling room !QdAJnGzpNsxiimllTe:server, 1/37
2021-09-28 14:59:20,909 - twisted - 271 - CRITICAL - sentinel - main function encountered error
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/twisted/internet/defer.py", line 662, in callback
    self._startRunCallbacks(result)
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/twisted/internet/defer.py", line 764, in _startRunCallbacks
    self._runCallbacks()
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/twisted/internet/defer.py", line 858, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/twisted/internet/defer.py", line 1751, in gotResult
    current_context.run(_inlineCallbacks, r, gen, status)
--- <exception caught here> ---
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/twisted/internet/defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/synapse/app/admin_cmd.py", line 224, in run
    await args.func(ss, args)
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/synapse/app/admin_cmd.py", line 80, in export_data_command
    res = await hs.get_admin_handler().export_user_data(
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/synapse/handlers/admin.py", line 169, in export_user_data
    events = await filter_events_for_client(self.storage, user_id, events)
  File "/opt/venvs/matrix-synapse/lib/python3.8/site-packages/synapse/visibility.py", line 108, in filter_events_for_client
    ] = await storage.main.get_retention_policy_for_room(room_id)
builtins.AttributeError: 'AdminCmdSlavedStore' object has no attribute 'get_retention_policy_for_room'

2021-09-28 14:59:20,914 - twisted - 271 - INFO - sentinel - Main loop terminated.

@erikjohnston Do you have a hint to fix it? Is it a missing import?

@H-Shay H-Shay self-assigned this Sep 30, 2021
@H-Shay
Copy link
Contributor

H-Shay commented Oct 1, 2021

It looks like the issue is that we recently changed config.worker_log_file to config.worker.worker_log_file in synapse/synapse/app/admin_cmd.py however, the config.worker object has no attribute worker_log_file, nor is there any references to a worker_log_file anywhere else in the code. A little digging did uncover that the logging configuration file for a worker will contain a filename/path to a logfile for that worker.

My question is, since we are already checking that the config.worker.worker_log_config exists, do we really also need to check that the log file that it presumably references exists as well? I.e. can we drop that line, or is it necessary to figure out how to extract the reference to the worker log file from the worker_log_config and check for it when this is the only place that reference will be used? I spent some time trying to do just that and couldn't figure it out, which led me to wonder if it isn't necessary in the first place.

@DMRobertson
Copy link
Contributor

nor is there any references to a worker_log_file anywhere else in the code.

Indeed, but I think this goes back further than the recent changes you mention (#10897?).

Running

git log -G worker_log_file

the first two hits are bb7fdd8 and 6a85cb5. The former is part of Patrick's PR; the latter is part of (the pleasingly-numberered) #5678 from 2019.

Has this script been broken since 2019? Surely we'd have noticed sooner...

I.e. can we drop that line

I think so. 6a85cb5 seems to be removing log_file configurations. And in e484101 we explicitly said "this option is gone".

So yeah, my vote is to drop that line.

@DMRobertson
Copy link
Contributor

(Though as @dklimpel notes, there might be more fixing required afterwards.)

@richvdh
Copy link
Member

richvdh commented Oct 5, 2021

Has this script been broken since 2019? Surely we'd have noticed sooner...

seems like it has, and nobody at element uses it. If we're going to have it, we should probably have some CI to check it keeps working (like we do for the port_db script).

@H-Shay
Copy link
Contributor

H-Shay commented Oct 5, 2021

Okay excellent. I will write a test case to keep an eye on this functionality going forwards, fix this issue, and hopefully fix anything else that might have broken in the interval.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants