Duplicate definition of name (last_hidden_state) for float 16 onnx #16

Open

MrRace opened this issue May 7, 2022 · 0 comments
MrRace commented May 7, 2022

After training as follows:

# For compression with a replacement scheduler
export GLUE_DIR=glue_script/glue_data
export TASK_NAME=MRPC

python ./run_glue.py \
  --model_name_or_path /home/bert-base \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir "$GLUE_DIR/$TASK_NAME" \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 32 \
  --per_gpu_eval_batch_size 32 \
  --learning_rate 2e-5 \
  --save_steps 50 \
  --num_train_epochs 15 \
  --output_dir result/ \
  --evaluate_during_training \
  --replacing_rate 0.3 \
  --scheduler_type linear \
  --scheduler_linear_k 0.0006

I then convert the resulting model to a Hugging Face checkpoint with convert_to_hf_ckpt.py and try to export it to ONNX:

output = torch.onnx.export(model,
                               org_dummy_input,
                               MODEL_ONNX_PATH,
                               verbose=True,
                               operator_export_type=OPERATOR_EXPORT_TYPE,
                               opset_version=12,
                               input_names=['input_ids', 'attention_mask', 'token_type_ids'], 
                               output_names=['last_hidden_state', 'pooler_output'],  
                               do_constant_folding=True,
                               dynamic_axes={"input_ids": {0: "batch_size", 1: 'seq_length'},
                                             "token_type_ids": {0: "batch_size", 1: 'seq_length'},
                                             "attention_mask": {0: "batch_size", 1: 'seq_length'},
                                             "pooler_output": {0: "batch_size"},
                                             "last_hidden_state": {0: "batch_size", 1: 'seq_length'}}
                               )
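
For reference, model and org_dummy_input in the snippet above are built roughly like this (a minimal sketch; the checkpoint path, dummy sentence, and OPERATOR_EXPORT_TYPE value here are illustrative placeholders, not my exact code):

import torch
from transformers import BertModel, BertTokenizer

# Load the checkpoint produced by convert_to_hf_ckpt.py (path is an example)
model = BertModel.from_pretrained("result/hf_ckpt")
model.eval()

# Build dummy inputs; the tuple order matches the input_names passed to export
tokenizer = BertTokenizer.from_pretrained("/home/bert-base")
enc = tokenizer("a dummy sentence", return_tensors="pt",
                padding="max_length", max_length=128)
org_dummy_input = (enc["input_ids"], enc["attention_mask"], enc["token_type_ids"])

MODEL_ONNX_PATH = "onnx/bert_fp32.onnx"
OPERATOR_EXPORT_TYPE = torch.onnx.OperatorExportTypes.ONNX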

The export produces a float32 ONNX model. I then try to convert it to float16 ONNX:

python3 -m onnxruntime_tools.transformers.optimizer --input  onnx/bert_fp32.onnx --output onnx/bert_fp16.onnx --float16

However, when I run inference with the float16 ONNX model, it fails with the following error:

  session = InferenceSession(model_file, providers=['CUDAExecutionProvider'])
  File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /home/model_compression/BERT-of-Theseus/onnx/bert_fp16.onnx failed:This is an invalid model. Error: Duplicate definition of name (last_hidden_state).

Have you tried converting the PyTorch model to float16 ONNX? Why does the half-precision conversion cause the "Duplicate definition of name (last_hidden_state)" error? By the way, the float32 ONNX model loads and runs successfully without errors.
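
In case it helps with debugging, here is a quick sketch (using the onnx Python package) that lists which names are defined more than once in the fp16 graph; the file path is the one from above:

import onnx
from collections import Counter

m = onnx.load("onnx/bert_fp16.onnx")

# Collect every name that defines a value in the graph: graph inputs,
# initializers, and node outputs. A name defined more than once is what
# triggers "Duplicate definition of name (...)".
names = [i.name for i in m.graph.input]
names += [init.name for init in m.graph.initializer]
for node in m.graph.node:
    names += [o for o in node.output if o]

dupes = [n for n, c in Counter(names).items() if c > 1]
print("duplicated names:", dupes)

# onnx.checker may flag the same SSA violation
onnx.checker.check_model(m)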
