Issue crashing when converting model, and more #5

Closed
enzyme69 opened this issue Dec 2, 2022 · 9 comments

Comments

@enzyme69

enzyme69 commented Dec 2, 2022

Running the conversion on an 8 GB M1 iMac, I got this error:

INFO:__main__:Converted vae_decoder
INFO:__main__:Converting unet
INFO:__main__:Attention implementation in effect: AttentionImplementations.SPLIT_EINSUM
INFO:__main__:Sample inputs spec: {'sample': (torch.Size([2, 4, 64, 64]), torch.float32), 'timestep': (torch.Size([2]), torch.float32), 'encoder_hidden_states': (torch.Size([2, 768, 1, 77]), torch.float32)}
INFO:__main__:JIT tracing..
/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/layer_norm.py:61: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert inputs.size(1) == self.num_channels
INFO:__main__:Done.
INFO:__main__:Converting unet to CoreML..
WARNING:coremltools:Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:   0%|                                                                                                      | 0/7876 [00:00<?, ? ops/s]WARNING:coremltools:Saving value type of int64 into a builtin type of int32, might lose precision!
Converting PyTorch Frontend ==> MIL Ops: 100%|████████████████████████████████████████████████████████████████████████████████████████▉| 7874/7876 [00:01<00:00, 4933.86 ops/s]
Running MIL Common passes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:23<00:00,  1.63 passes/s]
Running MIL FP16ComputePrecision pass: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:40<00:00, 40.70s/ passes]
Running MIL Clean up passes:  18%|███████████████████                                                                                      | 2/11 [00:15<01:10,  7.85s/ passes]zsh: killed     python -m python_coreml_stable_diffusion.torch2coreml --model-version      -o
(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % /Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % 
@enzyme69
Author

enzyme69 commented Dec 2, 2022

I tried running the prompt command anyway and got an error:
FileNotFoundError: text_encoder CoreML model doesn't exist at /Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/mlmodel/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder.mlpackage

Screenshot 2022-12-02 at 12 16 34 pm

Should there be 2 files under the mlmodel folder that I specified during the "conversion" (which also failed with an error message)?

OK, my suspicion is that the previous error is causing this one. What should I do?
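
One way to see which packages the failed conversion left behind is to list the expected file names and check which are missing. This is a minimal sketch: the naming pattern and component names are read off the error message and logs in this thread, the function names are illustrative, and which components you actually need depends on what you asked the converter to produce.

```python
from pathlib import Path

# Component names taken from the log lines in this thread.
COMPONENTS = ("text_encoder", "unet", "vae_decoder", "safety_checker")


def expected_packages(out_dir, model_version):
    """Return {component: path} using the naming pattern seen in the error message."""
    slug = model_version.replace("/", "_")
    return {
        name: Path(out_dir) / f"Stable_Diffusion_version_{slug}_{name}.mlpackage"
        for name in COMPONENTS
    }


def missing_packages(out_dir, model_version):
    """List the components whose .mlpackage is not present in out_dir."""
    return [
        name
        for name, path in expected_packages(out_dir, model_version).items()
        if not path.exists()
    ]


if __name__ == "__main__":
    # "mlmodel" is the output folder name from the error above; adjust as needed.
    print(missing_packages("mlmodel", "runwayml/stable-diffusion-v1-5"))
```

If `text_encoder` shows up in the result, the `FileNotFoundError` above is simply a consequence of the conversion having been killed before that package was written.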

@enzyme69
Author

enzyme69 commented Dec 2, 2022

My disk has only 6 GB left and I keep getting "running out of memory" messages. Could this be what is causing the crash?

I am using Stable Diffusion 1.5 from runwayml, in case that is useful for tracking down this semaphore leak bug.

Also, I get this warning:
Torch version 1.13.0 has not been tested with coremltools. You may run into unexpected errors. Torch 1.12.1 is the most recent version that has been tested.

I installed Torch with pip myself; maybe I need to downgrade?
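
The three suspects here (free disk space, installed RAM, Torch version) can be checked before starting a conversion. The sketch below uses only the Python standard library. The helper names are illustrative, the tested version `1.12.1` is copied from the coremltools warning, and no thresholds are applied because the thread does not say how much disk or memory the conversion needs.

```python
import os
import re
import shutil

TESTED_TORCH = (1, 12, 1)  # from the coremltools warning quoted above


def free_disk_gb(path="."):
    """Free space on the volume holding `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def physical_ram_gb():
    """Installed RAM in GB via POSIX sysconf (macOS and Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9


def parse_version(text):
    """'1.13.0+cpu' -> (1, 13, 0); anything after major.minor.patch is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_newer_than_tested(torch_version):
    """True when the installed Torch is newer than the last tested release."""
    return parse_version(torch_version) > TESTED_TORCH


if __name__ == "__main__":
    print(f"free disk: {free_disk_gb():.1f} GB, RAM: {physical_ram_gb():.1f} GB")
    try:
        import torch
        print("torch newer than tested:", is_newer_than_tested(torch.__version__))
    except ImportError:
        print("torch is not installed in this environment")
```

A newer-than-tested Torch only triggers a warning, so it is not necessarily the cause of the crash; `zsh: killed` in the first log points more toward the process being terminated by the system.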

@enzyme69 enzyme69 changed the title issue crashing when converting issue crashing when converting and more Dec 2, 2022
@enzyme69
Author

enzyme69 commented Dec 2, 2022

So that none of my effort is lost: since I cannot convert the model myself, I used the "snapshots" method from pcuenq (Pedro Cuenca), but it also fails to generate anything:
https://huggingface.co/blog/diffusers-coreml

(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % python -m python_coreml_stable_diffusion.pipeline --model-version runwayml/stable-diffusion-v1-5 --prompt "a photo of an astronaut riding a horse on mars" -i ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages -o ./output --compute-unit ALL --seed 1234


WARNING:coremltools:Torch version 1.13.0 has not been tested with coremltools. You may run into unexpected errors. Torch 1.12.1 is the most recent version that has been tested.
INFO:__main__:Setting random seed to 1234
INFO:__main__:Initializing PyTorch pipe for reference configuration
Fetching 15 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 8746.64it/s]
INFO:__main__:Removed PyTorch pipe to reduce peak memory consumption
INFO:__main__:Loading Core ML models in memory from ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages
INFO:python_coreml_stable_diffusion.coreml_model:Loading text_encoder mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder.mlpackage
/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py:145: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: {
    NSLocalizedDescription = "at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/model.mil:10:244: Error parsing MIL model: at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/model.mil:10:244: Could not open /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/weights/weight.bin";
    NSUnderlyingError = "Error Domain=com.apple.CoreML Code=110 \"(null)\"";
}
  _warnings.warn(
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 0.6 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading unet mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_unet.mlpackage
/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py:145: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: {
    NSLocalizedDescription = "at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_unet_83DE3B82-1717-4486-8701-26D36C2442BE.mlmodelc/model.mil:8:133: Error parsing MIL model: at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_unet_83DE3B82-1717-4486-8701-26D36C2442BE.mlmodelc/model.mil:8:133: Could not open /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_unet_83DE3B82-1717-4486-8701-26D36C2442BE.mlmodelc/weights/weight.bin";
    NSUnderlyingError = "Error Domain=com.apple.CoreML Code=110 \"(null)\"";
}
  _warnings.warn(
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 2.8 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading vae_decoder mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder.mlpackage
/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py:145: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: {
    NSLocalizedDescription = "at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder_12203C0C-D09D-4FE3-A319-B37CCB0889A0.mlmodelc/model.mil:10:174: Error parsing MIL model: at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder_12203C0C-D09D-4FE3-A319-B37CCB0889A0.mlmodelc/model.mil:10:174: Could not open /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder_12203C0C-D09D-4FE3-A319-B37CCB0889A0.mlmodelc/weights/weight.bin";
    NSUnderlyingError = "Error Domain=com.apple.CoreML Code=110 \"(null)\"";
}
  _warnings.warn(
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 0.7 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading safety_checker mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading ./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker.mlpackage
/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py:145: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: {
    NSLocalizedDescription = "at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker_F6E660E9-99D4-4589-AE75-91F2290BEB50.mlmodelc/model.mil:11:258: Error parsing MIL model: at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker_F6E660E9-99D4-4589-AE75-91F2290BEB50.mlmodelc/model.mil:11:258: Could not open /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker_F6E660E9-99D4-4589-AE75-91F2290BEB50.mlmodelc/weights/weight.bin";
    NSUnderlyingError = "Error Domain=com.apple.CoreML Code=110 \"(null)\"";
}
  _warnings.warn(
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 1.0 seconds.
INFO:__main__:Done.
INFO:__main__:Initializing Core ML pipe for image generation
INFO:__main__:Stable Diffusion configured to generate 512x512 images
INFO:__main__:Done.
INFO:__main__:Beginning image generation.
Traceback (most recent call last):
  File "/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/pipeline.py", line 534, in <module>
    main(args)
  File "/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/pipeline.py", line 478, in main
    image = coreml_pipe(
  File "/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/pipeline.py", line 297, in __call__
    text_embeddings = self._encode_prompt(
  File "/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/pipeline.py", line 127, in _encode_prompt
    text_embeddings = self.text_encoder(
  File "/Users/blendersushi/Documents/CoreMLDiffusion/ml-stable-diffusion-main/python_coreml_stable_diffusion/coreml_model.py", line 79, in __call__
    return self.model.predict(kwargs)
  File "/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py", line 545, in predict
    raise self._framework_error
  File "/Users/blendersushi/miniconda3/envs/coreml_stable_diffusion/lib/python3.8/site-packages/coremltools/models/model.py", line 143, in _get_proxy_and_spec
    return (_MLModelProxy(filename, compute_units.name), specification, None)
RuntimeError: {
    NSLocalizedDescription = "at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/model.mil:10:244: Error parsing MIL model: at /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/model.mil:10:244: Could not open /private/var/folders/sc/60t303_n5ysgp00p2d7gg6p80000gq/T/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder_BD5227E8-06E1-4259-B20A-CC6741905890.mlmodelc/weights/weight.bin";
    NSUnderlyingError = "Error Domain=com.apple.CoreML Code=110 \"(null)\"";
}
(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % 
(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % 
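
The repeated `Could not open .../weights/weight.bin` messages suggest checking whether each downloaded package really contains a readable weights file. The thread does not confirm the cause. One possibility, assumed here, is that the files under the Hugging Face cache `snapshots` folder are symlinks that did not resolve when Core ML compiled the package. The sketch below only inspects files, and its function names are illustrative.

```python
import os
from pathlib import Path


def unreadable_weights(packages_dir):
    """Return weight.bin paths under packages_dir that are dangling links or empty."""
    bad = []
    for root, _dirs, files in os.walk(packages_dir):
        for name in files:
            if name != "weight.bin":
                continue
            path = Path(root) / name
            # Path.exists() follows symlinks, so a dangling link reports False.
            if not path.exists() or path.stat().st_size == 0:
                bad.append(path)
    return sorted(bad)


def packages_without_weights(packages_dir):
    """Return .mlpackage folders that contain no weight.bin entry at all."""
    missing = []
    for package in sorted(Path(packages_dir).glob("*.mlpackage")):
        found = any("weight.bin" in files for _r, _d, files in os.walk(package))
        if not found:
            missing.append(package)
    return missing


if __name__ == "__main__":
    # Path copied from the -i argument of the command above.
    folder = ("./models/models--apple--coreml-stable-diffusion-v1-5/snapshots/"
              "ddab1155adfd564d1d8ef7db3ac345a8ed2bad65/original/packages")
    print(unreadable_weights(folder))
    print(packages_without_weights(folder))
```

If both lists come back empty, the weights are present on disk and the problem lies elsewhere.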

@enzyme69 enzyme69 changed the title issue crashing when converting and more Issue crashing when converting model, and more Dec 2, 2022
@enzyme69
Author

enzyme69 commented Dec 2, 2022

pcuenq (Pedro Cuenca) updated the instructions, so the path problem finally seems to be resolved. I managed to run the Python version, but it is too slow: total time is almost 4-5 minutes per image.

I have yet to try the Swift version.

This is the log from running the precompiled version from Pedro.

(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % python -m python_coreml_stable_diffusion.pipeline --model-version runwayml/stable-diffusion-v1-5 --prompt "a photo of an astronaut riding an octopus on jupiter" -i models/coreml-stable-diffusion-v1-5_original_packages -o ./output --compute-unit ALL --seed 93

WARNING:coremltools:Torch version 1.13.0 has not been tested with coremltools. You may run into unexpected errors. Torch 1.12.1 is the most recent version that has been tested.
INFO:__main__:Setting random seed to 93
INFO:__main__:Initializing PyTorch pipe for reference configuration
Fetching 15 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 11335.96it/s]
INFO:__main__:Removed PyTorch pipe to reduce peak memory consumption
INFO:__main__:Loading Core ML models in memory from models/coreml-stable-diffusion-v1-5_original_packages
INFO:python_coreml_stable_diffusion.coreml_model:Loading text_encoder mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading models/coreml-stable-diffusion-v1-5_original_packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_text_encoder.mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 8.2 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading unet mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading models/coreml-stable-diffusion-v1-5_original_packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_unet.mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 144.8 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading a CoreML model through coremltools triggers compilation every time. The Swift package we provide uses precompiled Core ML models (.mlmodelc) to avoid compile-on-load.
INFO:python_coreml_stable_diffusion.coreml_model:Loading vae_decoder mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading models/coreml-stable-diffusion-v1-5_original_packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_vae_decoder.mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 6.9 seconds.
INFO:python_coreml_stable_diffusion.coreml_model:Loading safety_checker mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Loading models/coreml-stable-diffusion-v1-5_original_packages/Stable_Diffusion_version_runwayml_stable-diffusion-v1-5_safety_checker.mlpackage
INFO:python_coreml_stable_diffusion.coreml_model:Done. Took 1.8 seconds.
INFO:__main__:Done.
INFO:__main__:Initializing Core ML pipe for image generation
INFO:__main__:Stable Diffusion configured to generate 512x512 images
INFO:__main__:Done.
INFO:__main__:Beginning image generation.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 51/51 [01:48<00:00,  2.12s/it]
INFO:__main__:Generated image has nsfw concept=False
INFO:__main__:Saving generated image to ./output/a_photo_of_an_astronaut_riding_an_octopus_on_jupiter/randomSeed_93_computeUnit_ALL_modelVersion_runwayml_stable-diffusion-v1-5.png
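
The numbers in this log account for most of the 4-5 minutes: the four models take about 162 seconds to load (the unet alone takes 144.8 seconds) and generation takes 108 seconds. The short sketch below works through that arithmetic and shows how the load cost would be spread out if several images were generated in one session. It assumes generation time stays constant per image and ignores any other overhead; `seconds_per_image` is an illustrative helper, not part of the repository.

```python
# Load times reported in the log above, in seconds.
LOAD_SECONDS = {
    "text_encoder": 8.2,
    "unet": 144.8,
    "vae_decoder": 6.9,
    "safety_checker": 1.8,
}
GENERATION_SECONDS = 108  # the progress bar shows 01:48 for 51 iterations


def seconds_per_image(images_per_session):
    """Average wall time per image when the models are loaded once per session."""
    total_load = sum(LOAD_SECONDS.values())
    return total_load / images_per_session + GENERATION_SECONDS


if __name__ == "__main__":
    for count in (1, 5, 20):
        print(count, round(seconds_per_image(count), 1))
```

For a single image this comes to roughly 270 seconds, which matches the observed 4-5 minutes; most of the saving from a long-lived session comes from not reloading the unet.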

@ocordeiro

@enzyme69 where are the updated instructions?
I'm getting the same issue on an M1 Air 8GB

@enzyme69
Author

enzyme69 commented Dec 2, 2022

@ocordeiro
The same blog post has been updated. You need to delete the model folder and re-run the download of the packaged model into it.

Make sure you download the correct model and specify its path.

The Python version works for me, but it is still too slow when loading. Image generation itself seems fast at 1 minute, but if you want to make multiple images you need to run them in one session, just like when using a Jupyter Notebook.

The Swift version is still failing:

(coreml_stable_diffusion) blendersushi@192-168-1-102 ml-stable-diffusion-main % swift run StableDiffusionSample --resource-path ./models/coreml-stable-diffusion-v1-5_original_compiled --compute-units all "a photo of an astronaut riding a horse on mars"

Building for debugging...
Build complete! (0.08s)
Loading resources and creating pipeline
(Note: This can take a while the first time using these resources)
Sampling ...
2022-12-03 07:58:08.244 StableDiffusionSample[28445:446174] Error calling plan_submit in batch processing.
zsh: trace trap  swift run StableDiffusionSample --resource-path  --compute-units all 

@ParityError

I had the same error with --compute-units all, and a different error with --compute-units cpuAndNeuralEngine:

Error: Failed to obtain prediction for sample 0

With --compute-units cpuAndGPU, however, the generation was successful.

@enzyme69
Author

enzyme69 commented Dec 5, 2022

It seems my inability to build the model was caused by the 8 GB memory limit on my iMac. Running it with only the terminal open could solve the issue, but it might still crash a few times during compile.

I got a compiled model from Yasuhito; this is a great one to try as an app:
https://github.com/ynagatomo/ImgGenSD2

@enzyme69 enzyme69 closed this as completed Dec 5, 2022
@enzyme69
Author

enzyme69 commented Dec 6, 2022

For other users trying to convert to Core ML on an 8 GB iMac: low-memory failures will happen, but keep re-running the conversion many times and it will work eventually. Close other apps first.

The output "Resources" folder is the one you need if you want to test it in the app.
https://github.com/ynagatomo/ImgGenSD2
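
The "keep re-running it" advice can be automated with a small retry wrapper. This is a rough sketch, not part of the repository: the conversion command is left to the caller because its arguments are truncated in the log above, and `max_attempts` and `pause_seconds` are made-up defaults.

```python
import subprocess
import sys
import time


def run_until_success(command, max_attempts=5, pause_seconds=10, run=subprocess.run):
    """Re-run command until it exits with 0; return the attempt number, or None."""
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        # A process killed by the OS shows up as a negative return code on POSIX.
        print(f"attempt {attempt} exited with {result.returncode}", file=sys.stderr)
        if attempt < max_attempts:
            time.sleep(pause_seconds)
    return None
```

Pass the same argument list you would type in the terminal, for example `run_until_success(["python", "-m", "python_coreml_stable_diffusion.torch2coreml", ...])` with your own options in place of the ellipsis.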
