[CT-51] [Bug] dbt deps does not handle bad git tarball download #4579

Closed
1 task done
barberscott opened this issue Jan 17, 2022 · 2 comments · Fixed by #4609
Assignees
Labels
bug Something isn't working deps dbt's package manager
Milestone

Comments

@barberscott

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When dbt deps downloads a package tar file from GitHub, the file can arrive malformed or incomplete. The most likely case is that the file on the local filesystem contains an error message instead of a real tar archive. When this happens, deps throws the following error:

Traceback (most recent call last):
  File "/usr/src/app/sinter/clients/dbt.py", line 1200, in call
    dbt_main.handle(command + extra_args)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 159, in handle
    res, success = handle_and_check(args)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 205, in handle_and_check
    task, res = run_from_args(parsed)
  File "/usr/local/lib/python3.8/dist-packages/dbt/main.py", line 258, in run_from_args
    results = task.run()
  File "/usr/local/lib/python3.8/dist-packages/dbt/task/deps.py", line 66, in run
    package.install(self.config, renderer)
  File "/usr/local/lib/python3.8/dist-packages/dbt/deps/registry.py", line 74, in install
    system.untar_package(tar_path, deps_path, package_name)
  File "/usr/local/lib/python3.8/dist-packages/dbt/clients/system.py", line 489, in untar_package
    with tarfile.open(tar_path, 'r') as tarball:
  File "/usr/lib/python3.8/tarfile.py", line 1608, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully

This typically indicates a transient problem with GitHub itself, but in the context of dbt Cloud it causes a run to fail when a retry could possibly succeed.
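
The failure mode above can be reproduced locally. This is a minimal sketch, assuming the downloaded file holds a JSON error body rather than a tar archive; the payload and the helper name try_open_tarball are made up for illustration and are not part of dbt.

```python
import os
import tarfile
import tempfile


def try_open_tarball(tar_path):
    """Return None if the file opens as a tar archive, else the error text."""
    try:
        with tarfile.open(tar_path, "r") as tarball:
            tarball.getmembers()
    except tarfile.ReadError as exc:
        return str(exc)
    return None


# Simulate a download that saved an error response under a .tar.gz name.
# The JSON body here is an assumption, not a captured GitHub response.
with tempfile.TemporaryDirectory() as tmp:
    bad_path = os.path.join(tmp, "package.tar.gz")
    with open(bad_path, "wb") as fh:
        fh.write(b'{"message": "Server Error"}')
    error = try_open_tarball(bad_path)

print(error)
```

The same tarfile.ReadError shown in the traceback is raised here, which suggests that catching it around the untar step is enough to detect this case.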

Expected Behavior

  • Identify the bad tar file and retry the download, perhaps by checking for a successful listing (tar -tzf foo.tar.gz >/dev/null or something similar). Alternatively, rather than spending 2x the effort (check, then untar), simply handle the failure to untar.

  • Optionally, log a certain number of trailing bytes/characters when the tar file is bad. That way, when this happens, we can tell whether it's just a truncated tar (the download is being cut off) or something like an error message in a JSON response being written into the tar file. I think the download simply iterates over whatever GitHub gives us and drops it into a .tar.gz file.
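
Both suggestions above could be combined roughly as follows. This is only a sketch under stated assumptions, not the fix that was merged: download_and_untar, tail_bytes, and the download callable are hypothetical names, and the retry count and delay are arbitrary.

```python
import tarfile
import time


def tail_bytes(path, count=200):
    """Return up to the last `count` bytes of a file, for diagnostics."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # move to the end to learn the file size
        size = fh.tell()
        fh.seek(max(size - count, 0))
        return fh.read()


def download_and_untar(download, tar_path, dest_dir,
                       attempts=3, delay=1.0, log=print):
    """Download, then extract; re-download when the tarball is unreadable.

    `download` is a hypothetical callable that writes the archive to
    `tar_path`. Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        download(tar_path)
        try:
            with tarfile.open(tar_path, "r") as tarball:
                tarball.extractall(dest_dir)
            return attempt
        except (tarfile.ReadError, EOFError) as exc:
            # Log the tail of the file so a truncated archive can be told
            # apart from an error message saved under a .tar.gz name.
            log(f"attempt {attempt}/{attempts}: bad tarball ({exc}); "
                f"last bytes: {tail_bytes(tar_path)!r}")
            if attempt == attempts:
                raise
            time.sleep(delay)
```

This takes the "simply handle the failure to untar" route rather than listing the archive first. EOFError is caught as well because a truncated gzip stream can surface that way instead of as ReadError. A failure partway through extraction may leave partial files in dest_dir, which a real implementation would probably want to clean up before retrying.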

Steps To Reproduce

No response

Relevant log output

No response

Environment

No response

What database are you using dbt with?

No response

Additional Context

No response

@barberscott barberscott added bug Something isn't working triage labels Jan 17, 2022
@github-actions github-actions bot changed the title [Bug] dbt deps does not handle bad git tarball download [CT-51] [Bug] dbt deps does not handle bad git tarball download Jan 17, 2022
@jtcohen6 jtcohen6 added Team:Language packages Functionality for interacting with installed packages labels Jan 17, 2022
@jtcohen6 jtcohen6 added this to the v1.0.2 milestone Jan 20, 2022
@emmyoop emmyoop self-assigned this Jan 20, 2022
@emmyoop
Member

emmyoop commented Jan 21, 2022

Thanks so much for the detailed issue @barberscott!

This is definitely something we want to get resolved. We've added it as a task for the 1.0.2 release and I'm actively working on it. I'm going to add some retry logic at the least. I'm also going to think about whether, and how, logging part of the tarfile is the right solution.

@leahwicz
Contributor

leahwicz commented Feb 1, 2022

Backport: #4649

@jtcohen6 jtcohen6 added deps dbt's package manager and removed packages Functionality for interacting with installed packages labels Mar 30, 2022
4 participants