However, when I run inference with the float16 ONNX model, it fails:
session = InferenceSession(model_file, providers=['CUDAExecutionProvider'])
File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /home/model_compression/BERT-of-Theseus/onnx/bert_fp16.onnx failed:This is an invalid model. Error: Duplicate definition of name (last_hidden_state).
Have you tried converting the PyTorch model to float16 ONNX? Why does the half-precision conversion cause the `Duplicate definition of name (last_hidden_state)` error? By the way, the float32 ONNX model loads and runs without errors.
After training like that:

I converted the resulting model to the Hugging Face format with
convert_to_hf_ckpt.py
, and then tried to export it to ONNX:

The export produces a float32 ONNX model. I then tried to convert it to float16: