Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wheel relocation issues #2777

Merged
merged 30 commits into from
Oct 14, 2020
Merged

Fix wheel relocation issues #2777

merged 30 commits into from
Oct 14, 2020

Conversation

andfoy
Copy link
Contributor

@andfoy andfoy commented Oct 8, 2020

No description provided.

@codecov
Copy link

codecov bot commented Oct 8, 2020

Codecov Report

Merging #2777 into master will decrease coverage by 0.08%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2777      +/-   ##
==========================================
- Coverage   73.31%   73.22%   -0.09%     
==========================================
  Files          99       96       -3     
  Lines        8724     8446     -278     
  Branches     1373     1320      -53     
==========================================
- Hits         6396     6185     -211     
+ Misses       1909     1859      -50     
+ Partials      419      402      -17     
Impacted Files Coverage Δ
torchvision/ops/feature_pyramid_network.py 75.82% <0.00%> (-15.39%) ⬇️
torchvision/models/detection/rpn.py 93.16% <0.00%> (-0.71%) ⬇️
torchvision/models/detection/faster_rcnn.py 81.17% <0.00%> (-0.22%) ⬇️
torchvision/io/__init__.py 89.65% <0.00%> (ø)
torchvision/ops/__init__.py 100.00% <0.00%> (ø)
torchvision/models/detection/__init__.py 100.00% <0.00%> (ø)
torchvision/ops/focal_loss.py
torchvision/models/detection/retinanet.py
torchvision/models/detection/anchor_utils.py
torchvision/models/detection/backbone_utils.py 100.00% <0.00%> (+5.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2831f11...f5917e2. Read the comment docs.

@andfoy
Copy link
Contributor Author

andfoy commented Oct 9, 2020

I have bad news here, we need to compile FFmpeg from scratch on the manylinux image in order to have manylinux wheel compatibility

@fmassa
Copy link
Member

fmassa commented Oct 9, 2020

Ok. I propose that we first fix the issues with libjpeg / libpng, merge this, and then fix ffmpeg in a follow-up PR

@andfoy
Copy link
Contributor Author

andfoy commented Oct 9, 2020

We need to compile libPNG ourselves as well

@andfoy andfoy changed the title [DEBUG/DO NOT MERGE] Check wheel relocation issues Fix wheel relocation issues Oct 13, 2020
@andfoy
Copy link
Contributor Author

andfoy commented Oct 14, 2020

Regarding the Windows testing errors, it seems that we're picking the global conda ffmpeg binaries instead of the wheel ones

imagen

@andfoy
Copy link
Contributor Author

andfoy commented Oct 14, 2020

My solution on this case would be the following:

  1. Compile the pyd for video_reader
  2. Copy the FFmpeg libraries and rename them
  3. Update the pyd to point to the new binaries
  4. Include those as part of both the conda recipe and the wheels

This seems relevant: https://github.com/njsmith/machomachomangler

@andfoy
Copy link
Contributor Author

andfoy commented Oct 14, 2020

I tried to rename avcodec-58.dll to avcodec-asdasd-50.dll on video_reader.pyd, however, it seems that PyAV dlls are picking avcodec-58.dll from conda, that's because C:\Users\circleci\project\env\Library\bin is first on PATH, and PyAV wheels are not renaming the packaged DLLs inside the wheels. That means that PyAV wheels are the culprit here, since those should be loading their own binaries to prevent clashes

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Oct 14, 2020

PyAV is not a hard dependency, right ? For windows, we can put "experimentatl" mention in the docs, put try/catch while it is imported and include a detailed instruction on what to do instead.

@fmassa
Copy link
Member

fmassa commented Oct 14, 2020

My take on this: let's separate this into two PRs: one with only libjpeg / libpng, and one with FFmpeg. This way we can start merging the libjpeg / libpng bits and testing it, while we iterate on FFmpeg.

EDIT: I'm instead disabling FFmpeg on Windows in this PR to move forward, and we fix the rest in a follow-up PR

@fmassa
Copy link
Member

fmassa commented Oct 14, 2020

Test failures are unrelated, merging this to unblock. We should still fix FFmpeg on Windows and Linux

@fmassa fmassa merged commit 4db3dc6 into pytorch:master Oct 14, 2020
@vfdev-5
Copy link
Collaborator

vfdev-5 commented Oct 14, 2020

Manual tests:

pip_install_req.sh

#!/bin/bash

python -m pip uninstall -y torchvision
python -m pip uninstall -y torch
conda uninstall -y torchvision
conda uninstall -y pytorch

python -m pip install --pre --upgrade torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

python -m pip list | grep torch

pip install opencv-python

conda_install_req.sh

#!/bin/bash

pip uninstall -y torchvision
pip uninstall -y torch
conda uninstall -y torchvision
conda uninstall -y pytorch

conda install -y pytorch torchvision -c pytorch-nightly

conda list | grep torch

pip install opencv-python

test_io_image.py

import cv2
from PIL import Image

import torch, torchvision
from torchvision.io.image import read_image

print("torch.ops.load_library(torchvision.io.image.ext_specs.origin):",
      torch.ops.load_library(torchvision.io.image.ext_specs.origin))

im1 = cv2.imread("test-image.jpg")
print("Opencv:", im1.shape, im1.dtype)

im2 = Image.open("test-image.jpg")
print("PIL:", im2.size, im2.mode)

tensor_image = read_image("test-image.jpg")
print("tensor image info: ", tensor_image.shape, tensor_image.dtype)

@fmassa
Copy link
Member

fmassa commented Oct 14, 2020

Awesome, thanks a lot for checking this @vfdev-5 !

@andfoy andfoy deleted the debug_wheels branch October 14, 2020 21:40
fmassa added a commit that referenced this pull request Oct 16, 2020
* [DEBUG] Check wheel relocation issues

* Call delocate on Mac

* Use yum instead of conda

* Do not copy ffmpeg dylibs

* Linux dry run

* Do not download FFmpeg on Mac

* Install bzip2

* Restore FFmpeg on Windows

* Remove ffmpeg temporarily

* Do not remove dependencies

* Disable FFmpeg temporarily on Linux wheels

* Test relocation in Linux

* Add docstring

* Call relocation script

* Minor error correction

* Import auditwheel only on Linux

* Restore ffmpeg again on Windows

* Return *device

* Try fix Windows

* Fix clang-format

* Start windows patchwork

* Copy all DLLs

* Disable FFmpeg on Windows for now

* Bugfix

Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* [DEBUG] Check wheel relocation issues

* Call delocate on Mac

* Use yum instead of conda

* Do not copy ffmpeg dylibs

* Linux dry run

* Do not download FFmpeg on Mac

* Install bzip2

* Restore FFmpeg on Windows

* Remove ffmpeg temporarily

* Do not remove dependencies

* Disable FFmpeg temporarily on Linux wheels

* Test relocation in Linux

* Add docstring

* Call relocation script

* Minor error correction

* Import auditwheel only on Linux

* Restore ffmpeg again on Windows

* Return *device

* Try fix Windows

* Fix clang-format

* Start windows patchwork

* Copy all DLLs

* Disable FFmpeg on Windows for now

* Bugfix

Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020
* [DEBUG] Check wheel relocation issues

* Call delocate on Mac

* Use yum instead of conda

* Do not copy ffmpeg dylibs

* Linux dry run

* Do not download FFmpeg on Mac

* Install bzip2

* Restore FFmpeg on Windows

* Remove ffmpeg temporarily

* Do not remove dependencies

* Disable FFmpeg temporarily on Linux wheels

* Test relocation in Linux

* Add docstring

* Call relocation script

* Minor error correction

* Import auditwheel only on Linux

* Restore ffmpeg again on Windows

* Return *device

* Try fix Windows

* Fix clang-format

* Start windows patchwork

* Copy all DLLs

* Disable FFmpeg on Windows for now

* Bugfix

Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants