run_audiovisual.py #27

Open
TanYuChen1 opened this issue Sep 17, 2023 · 1 comment

Comments

@TanYuChen1

Hi, thank you for open-sourcing your code. I tried your model on my own dataset but ran into a problem.
extract_faces.py and write_records_tcd.py both ran without any issues; the error occurs when I run run_audiovisual.py.
The error message is as follows:

Traceback (most recent call last):
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Nan in summary histogram for: Decoder/decoder/my_dense/bias_0-grad
         [[{{node Decoder/decoder/my_dense/bias_0-grad}}]]
  (1) Invalid argument: Nan in summary histogram for: Decoder/decoder/my_dense/bias_0-grad
         [[{{node Decoder/decoder/my_dense/bias_0-grad}}]]
         [[gradients/Decoder/decoder/while/BasicDecoderStep/decoder/attention_wrapper/Select_grad/Select/StackPopV2/_198]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_audiovisual.py", line 64, in <module>
    main()
  File "run_audiovisual.py", line 59, in main
    logfile=logfile,
  File "/home/exp/test/avsr-tf1-yjq/avsr/experiment.py", line 111, in run_experiment
    try_restore_latest_checkpoint=True
  File "/home/exp/test/avsr-tf1-yjq/avsr/avsr.py", line 274, in train
    ], **self.sess_opts)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Nan in summary histogram for: Decoder/decoder/my_dense/bias_0-grad
         [[node Decoder/decoder/my_dense/bias_0-grad (defined at /home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Invalid argument: Nan in summary histogram for: Decoder/decoder/my_dense/bias_0-grad
         [[node Decoder/decoder/my_dense/bias_0-grad (defined at /home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
         [[gradients/Decoder/decoder/while/BasicDecoderStep/decoder/attention_wrapper/Select_grad/Select/StackPopV2/_198]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'Decoder/decoder/my_dense/bias_0-grad':
  File "run_audiovisual.py", line 64, in <module>
    main()
  File "run_audiovisual.py", line 59, in main
    logfile=logfile,
  File "/home/exp/test/avsr-tf1-yjq/avsr/experiment.py", line 106, in run_experiment
    **kwargs
  File "/home/exp/test/avsr-tf1-yjq/avsr/avsr.py", line 216, in __init__
    self._create_models()
  File "/home/exp/test/avsr-tf1-yjq/avsr/avsr.py", line 531, in _create_models
    batch_size=self._hparams.batch_size[0])
  File "/home/exp/test/avsr-tf1-yjq/avsr/avsr.py", line 574, in _make_model
    hparams=self._hparams
  File "/home/exp/test/avsr-tf1-yjq/avsr/seq2seq.py", line 26, in __init__
    self._init_optimiser()
  File "/home/exp/test/avsr-tf1-yjq/avsr/seq2seq.py", line 231, in _init_optimiser
    summary = tf.summary.histogram("%s-grad" % variable.name, value)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/summary/summary.py", line 179, in histogram
    tag=tag, values=values, name=scope)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_logging_ops.py", line 329, in histogram_summary
    "HistogramSummary", tag=tag, values=values, name=name)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/exp/anaconda3/envs/avsr-tf1/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Is this possibly caused by different data dimensions?
Thanks a lot.

@georgesterpu
Owner

Hi @TanYuChen1,
Thanks for creating this issue.
Your error message suggests that the gradient of Decoder/decoder/my_dense/bias_0 contained NaNs when its histogram summary was computed.
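
One way to narrow this down is to rule out bad inputs first. The snippet below is only a minimal sketch and is not part of this repository: `find_bad_examples` is a hypothetical helper, and it assumes each example can be loaded as an array shaped (time, features). It flags examples that are empty or contain NaN/Inf values, which are common sources of NaN gradients with a custom dataset.

```python
import numpy as np


def find_bad_examples(examples):
    """Return (index, reason) pairs for examples that may lead to NaN gradients.

    Assumption: `examples` is an iterable of array-likes shaped (time, features).
    This helper is hypothetical and not part of the avsr-tf1 code base.
    """
    bad = []
    for i, example in enumerate(examples):
        arr = np.asarray(example, dtype=np.float64)
        if arr.size == 0:
            # Zero-length sequences can break length-based masking and averaging.
            bad.append((i, "empty sequence"))
        elif not np.isfinite(arr).all():
            # Any NaN or Inf in the input propagates into the loss and gradients.
            bad.append((i, "contains NaN or Inf"))
    return bad


if __name__ == "__main__":
    # Synthetic data for illustration only.
    good = np.zeros((5, 3))
    with_nan = np.array([[0.0, np.nan], [1.0, 2.0]])
    empty = np.zeros((0, 3))
    print(find_bad_examples([good, with_nan, empty]))
```

If the inputs pass this check, the NaNs more likely appear during training itself (for instance from a learning rate that is too high for the new data), which this check cannot detect.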

Please note that I am no longer maintaining this repository. If I had the chance to work on AVSR again, I would probably start by porting everything here to PyTorch (+ e.g. Lightning) or Keras, in order to leverage the latest advancements in ML frameworks.
