Torch MPS (gpu) acceleration not working M1 Mac. #8102
Comments
👋 Hello @jerjer1223, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.
Requirements
Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.
* experimental.py Apple MPS fix
  May resolve #8102
* Update experimental.py
* Update experimental.py
@jerjer1223 good news 😃! Your original issue may now be fixed ✅ in PR #8121. This PR removes MPS from the torch.device(). To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀! |
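The traceback in the original report fails inside `torch.load(..., map_location=device)` when `device` is `mps`. A minimal sketch of one common workaround for this class of error follows: deserialize the checkpoint onto the CPU first, then move the model to the target device. Whether PR #8121 does exactly this is an assumption, since the diff is not shown in this thread. The function name `load_on_cpu_then_move`, the `load_fn` parameter, and the `"model"` checkpoint key are hypothetical choices made for illustration.

```python
def load_on_cpu_then_move(path, device, load_fn=None):
    """Load a checkpoint on the CPU, then move the model to `device`.

    `load_fn` is a hypothetical hook so the logic can be exercised without
    PyTorch installed; by default it is `torch.load`.
    """
    if load_fn is None:
        import torch  # imported lazily so the module loads without PyTorch

        # Note: newer PyTorch releases may need extra torch.load arguments
        # to unpickle full model objects.
        load_fn = torch.load

    # Deserialize to CPU storage regardless of the requested device.
    ckpt = load_fn(path, map_location="cpu")

    # Assumption: the checkpoint is either the model itself or a dict
    # holding it under the key "model".
    model = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

    # Only now move the weights to the target device (for example "mps").
    return model.float().to(device)
```

Usage would look like `model = load_on_cpu_then_move("yolov5s.pt", "mps")`. The point of the design is that the unpickling step never has to know about the `mps` storage tag.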
@glenn-jocher the above solved the same issue for me
@GerardWalsh great, I'm glad we resolved the original issue. The buffer size issue is known to the pytorch team and I believe they are working on solutions for it. See pytorch/pytorch#77886 |
It also seems to have problems on the CPU with PyTorch 1.13. When I ran it on the CPU, it gave me this error.
PyTorch version 1.13.0.dev20220607
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
@jerjer1223 try torchvision 0.14.0.dev20220603 with the torch version you're using (1.13.0.dev20220607).
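The pairing suggested above can be checked mechanically before installing. The sketch below is a rough heuristic, not the official compatibility matrix: it assumes that for torch 1.x releases, torchvision 0.(N+1) pairs with torch 1.N (as with 1.13 and 0.14 here), and that two nightly builds should be dated within a few days of each other. The helper names and the `max_days_apart` threshold are hypothetical.

```python
import re
from datetime import datetime

# Matches "1.13.0" or "1.13.0.dev20220607" (any suffix after that is ignored).
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.dev(\d{8}))?")


def parse_version(version):
    """Return (major, minor, patch, nightly_date_or_None)."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, dev = match.groups()
    nightly = datetime.strptime(dev, "%Y%m%d").date() if dev else None
    return int(major), int(minor), int(patch), nightly


def likely_compatible(torch_version, torchvision_version, max_days_apart=7):
    """Heuristic check that a torch/torchvision pair is plausibly matched."""
    t_major, t_minor, _, t_nightly = parse_version(torch_version)
    v_major, v_minor, _, v_nightly = parse_version(torchvision_version)

    # Assumed rule for torch 1.x: torchvision 0.(N+1) goes with torch 1.N.
    if t_major != 1 or v_major != 0 or v_minor != t_minor + 1:
        return False

    # Two nightlies: require build dates close together (hypothetical threshold).
    if t_nightly and v_nightly:
        return abs((t_nightly - v_nightly).days) <= max_days_apart

    # Otherwise accept only release-with-release, not a release/nightly mix.
    return t_nightly is None and v_nightly is None


print(likely_compatible("1.13.0.dev20220607", "0.14.0.dev20220603"))
```

For authoritative pairings, the compatibility matrix linked in the error message remains the reference.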
Hi @glenn-jocher, and good day to you. I am trying to use my M1 GPU for training via python3 train.py etc. I have been trying to implement this for some time now by googling, but it is more challenging than I expected. I discovered the MPS component, but based on my research that is used for inference with detect.py. I may be wrong given my limited experience. Can you guide me on how to use the M1 GPU on the new MacBook Pro for YOLOv5 training, please? Training is extremely painful on my old Mac, so I bought a newer model to handle the processing and I'm not sure how this works. Thanx loads for the YOLOv5 approach and your efforts. This is working on my old Mac, but it has been training since last Friday morning at 3am, today is Tuesday, 16th Aug 2022, and it has only reached 8 epochs out of 30! Please help me run this faster with the M1 GPU or MPS. Thanx loads to anyone responding, I am grateful just to learn.
@Symbadian MPS support is in place currently for YOLOv5, but PyTorch has not completed sufficient support for MPS training. If you have an M1/M2 machine you'll already see faster inference and training vs Intel chips simply by installing Python with Universal2 installers for python>=3.9. The speedup is about 200ms Intel vs 70ms M1 with universal2. MPS support would theoretically be faster still when available from pytorch. |
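Since MPS may be requested on a machine or PyTorch build where it is not usable, a script can check availability and fall back to the CPU. The sketch below assumes `torch.backends.mps.is_available()` exists in the installed PyTorch (it is guarded with `getattr` for older builds); the function names `detect_backends` and `pick_device` are hypothetical and not part of YOLOv5.

```python
def detect_backends():
    """Report which accelerators the installed PyTorch can use, if any."""
    try:
        import torch
    except ImportError:
        return {"cuda": False, "mps": False}

    # Older PyTorch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    return {
        "cuda": torch.cuda.is_available(),
        "mps": bool(mps is not None and mps.is_available()),
    }


def pick_device(requested, backends):
    """Return `requested` if that backend is available, otherwise "cpu"."""
    if backends.get(requested, False):
        return requested
    return "cpu"


if __name__ == "__main__":
    print(pick_device("mps", detect_backends()))
```

Keeping the decision in a pure function makes the fallback easy to test without Apple hardware.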
Ok @glenn-jocher, so this would not work for MPS just yet, wow! DISAPPOINTED.....
This is taking 3-5 days to train the (yolov5m, l and x ) model: 30 epochs, 32 batch size, I tried implementing the --hyp low, med, high and testing these (FROM SCRATCH and PRE-TRAINED WEIGHTS) to see which is superior in performance for my solution. Every time I try implementing a larger model than the (m), I get the prompt below..???!!
THANX LOADS @glenn-jocher FOR YOUR Works, really appreciate this, I'm just trying to get this to work and understand what I am doing!! |
@Symbadian you can track (and vote on) ongoing aten operator development in pytorch/pytorch#77764 that's needed for full MPS training to work correctly. |
Hi @Symbadian , can you please file an issue in PyTorch with "MPS" label, we will take a look. |
Hi @kulinseth, how do I do so? I’ve never filed an issue before and would like to have the most productive impact to help others as well. I am still struggling with this challenge; no matter what I do, all of the resources are being drained, and Googling is currently not providing a solution. Please guide me.
Hi @Symbadian - to file a PyTorch issue, you can go to https://github.com/pytorch/pytorch/issues and click on the green button New Issue (near the search bar). From there select Bug Report and please add the necessary info to reproduce it (e.g. command line used, machine config info, pytorch version). In the labels tab, please add module: mps - we'll take a look from there.
Thanks!
Hi Denis,
Thank you for acknowledging my digital presence, the reports have been logged!
From: Denis Vieriu
Date: Tuesday, 23 August 2022 at 21:05
Subject: Re: [ultralytics/yolov5] Torch MPS (gpu) acceleration not working M1 Mac. (Issue #8102)
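The bug report requested above asks for machine configuration and the PyTorch version. A small sketch for collecting that information follows; the function name `environment_report` is hypothetical. PyTorch also ships `python -m torch.utils.collect_env`, which produces a fuller report.

```python
import platform


def environment_report():
    """Build a short plain-text summary suitable for pasting into an issue."""
    lines = [
        f"Python: {platform.python_version()}",
        f"OS: {platform.platform()}",
        f"Machine: {platform.machine()}",
    ]
    try:
        import torch

        lines.append(f"PyTorch: {torch.__version__}")
        # torch.backends.mps is absent in older builds, so guard the lookup.
        mps = getattr(torch.backends, "mps", None)
        available = bool(mps is not None and mps.is_available())
        lines.append(f"MPS available: {available}")
    except ImportError:
        lines.append("PyTorch: not installed")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Pasting this output together with the exact command line used gives maintainers most of what they need to reproduce the problem.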
Search before asking
YOLOv5 Component
Detection
Bug
When I change the device to mps with --device mps, it gives me "RuntimeError: don't know how to restore data location of torch.storage._UntypedStorage (tagged with mps)."
Torch 1.13 has GPU acceleration, as stated on their website and this article (https://towardsdatascience.com/gpu-acceleration-comes-to-pytorch-on-m1-macs-195c399efcc1)
Environment
YOLOv5 🚀 2022-6-3 Python-3.9.13 torch-1.13.0.dev20220604 MPS
Minimal Reproducible Example
python detect.py --device mps
Additional
Full log here
detect: weights=yolov5s.pt, source=data/images, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=mps, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 2022-6-3 Python-3.9.13 torch-1.13.0.dev20220604 MPS
Traceback (most recent call last):
File "/Users/jerry/Documents/yolov5-master/detect.py", line 252, in <module>
main(opt)
File "/Users/jerry/Documents/yolov5-master/detect.py", line 247, in main
run(**vars(opt))
File "/opt/homebrew/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Users/jerry/Documents/yolov5-master/detect.py", line 92, in run
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
File "/Users/jerry/Documents/yolov5-master/models/common.py", line 334, in __init__
model = attempt_load(weights if isinstance(weights, list) else w, device=device)
File "/Users/jerry/Documents/yolov5-master/models/experimental.py", line 80, in attempt_load
ckpt = torch.load(attempt_download(w), map_location=device)
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 1049, in _load
result = unpickler.load()
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 1001, in load_tensor
wrap_storage=restore_location(storage, location),
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 973, in restore_location
return default_restore_location(storage, str(map_location))
File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 178, in default_restore_location
raise RuntimeError("don't know how to restore data location of "
RuntimeError: don't know how to restore data location of torch.storage._UntypedStorage (tagged with mps)
Are you willing to submit a PR?