Performance on Apple Silicon #190

Open
RobBW opened this issue Sep 14, 2024 · 16 comments

Comments

RobBW commented Sep 14, 2024

I have just installed IQA-PyTorch on my M2 Mac mini, and subjectively it seems to run quite slowly.
On examining your code I note that there is no device specification for "mps".
Before I launch any experimentation, I am asking whether there is likely to be any performance benefit in adding .device("mps") to these files:

  • pyiqa/archs/arch_util.py#L165
  • pyiqa/archs/fid_arch.py#L205
  • docs/examples.rst#L14
  • README.md?plain=1#L77
chaofengc (Owner) commented:

The code uses PyTorch for GPU acceleration, so you can set the compute device in the same way as for any PyTorch module. For instance, you can use metric.to(torch.device("mps")) to specify the device.

Please note, I haven't tested the code on Mac yet. Let me know if it works for you!
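
For example, a minimal sketch (assuming pyiqa's create_metric interface; the metric name and image path are placeholders, and this is untested on mps):

import torch
import pyiqa

# Prefer mps when available; otherwise fall back to the CPU.
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

# Metrics are regular nn.Modules, so .to(device) works as usual.
metric = pyiqa.create_metric("hyperiqa").to(device)

# Metrics accept image file paths (or tensors) and return a score.
score = metric("path/to/image.jpg")
print(score)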

RobBW commented Sep 16, 2024

pyiqa runs on my Apple Silicon M2 (Sonoma).
I've only run the hyperiqa metric on 2 images, and the results were more or less in line with my own judgement.
If you post some test images that you have already measured and advise which metrics you would like me to run, I can verify that pyiqa is working correctly on macOS.
I will test and confirm the different device modes soon. It's pretty fast anyway.

chaofengc (Owner) commented:

Thank you for providing information about macOS. I greatly appreciate your willingness to help verify the functionality of pyiqa on macOS.

To run the tests, please follow these steps:

  1. Clone the repository using git.
  2. Modify the device setting by editing the line at conftest.py#L13 (a hypothetical sketch of that edit follows below).
  3. Run make test_cal. This will verify the results against official values and save them in ResultsCalibra/calibration_summary.csv.
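
For reference, the edited line might look something like this (purely hypothetical; the actual variable name and structure in conftest.py may differ):

import torch

# Hypothetical version of the test-device line: use mps when available.
device = "mps" if torch.backends.mps.is_available() else "cpu"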

RobBW commented Sep 17, 2024 via email

chaofengc commented Sep 17, 2024

Thank you so much for testing on macOS. I will update the requirements accordingly.

At the moment, it's not feasible for me to resolve the interpolation issue on the mps backend. Additionally, downgrading to fp32 isn't a suitable option, as it could lead to incorrect results for certain metrics. As a result, we will need to wait for PyTorch to fully support the upsample operation on the mps device.
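
One possible interim workaround, which I haven't verified for these metrics, is PyTorch's experimental CPU fallback for operations the mps backend does not support yet. It has to be set before torch is first imported:

import os

# Experimental PyTorch escape hatch: ops with no mps implementation
# (such as some interpolation modes) silently run on the CPU instead.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # must be imported after the environment variable is set

Whether that fallback preserves the calibrated results for every metric would still need checking.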

In the meantime, we can fall back to the M-series CPU. Since the M2 chip is significantly more powerful, it would be fantastic if you could help test the efficiency on it using the following commands:

git pull
python tests/test_efficiency.py -d cpu -c cpu_m2

Results will be saved in tests/Efficiency_benchmark.csv. Thanks again for your support!

RobBW commented Sep 17, 2024 via email

chaofengc commented Sep 18, 2024

Thank you for your help! Could you please attach the results file? It should be located at tests/Efficiency_benchmark.csv.

Regarding the new work you mentioned, I'll review it when I have the time.

RobBW commented Sep 19, 2024 via email

chaofengc (Owner) commented:

Thank you for your help! However, I’m unable to see the attached file. Could you please upload the result directly here?

RobBW commented Sep 20, 2024

This time?

calibration_summary.csv

chaofengc (Owner) commented:

Apologies for providing the incorrect file path earlier; this is the results calibration file. The time benchmark results can be found in tests/Efficiency_benchmark.csv.

RobBW commented Sep 20, 2024

No problem:
Efficiency_benchmark.csv

chaofengc (Owner) commented:

That's it! Thank you so much!

RobBW commented Sep 20, 2024

I've been looking at: https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html
Are there any bottleneck points you can pinpoint that would benefit from using torch.compile? If so, I'd be interested in testing them for you.

chaofengc (Owner) commented:

Thank you for your suggestions. It is indeed possible to further accelerate certain pure deep neural network metrics using torch.compile. However, this process may take some time, as different metrics could require distinct modifications. Since our goal is not to provide production-level code, the current GPU performance (all metrics <1s for a 1080×800 image on a V100) is largely sufficient for research purposes.
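
For illustration, compiling an entire metric might look roughly like this (a sketch assuming pyiqa's create_metric API; not benchmarked):

import torch
import pyiqa

metric = pyiqa.create_metric("hyperiqa", device=torch.device("cpu"))

# torch.compile JIT-compiles the module on first call: the first
# invocation pays the compilation cost, later calls may run faster.
compiled_metric = torch.compile(metric)

score = compiled_metric("path/to/image.jpg")

Whether this yields a net speedup for a given metric and backend is exactly what would need measuring.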

RobBW commented Sep 20, 2024 via email
