regression: az ml job stream and download commands do not work in 2.31.0 (do work in 2.30.0) #20712

Closed
gro1m opened this issue Dec 13, 2021 · 12 comments
Assignees
Labels
customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning az ml needs-author-feedback More information is needed from author to address the issue. Service Attention This issue is responsible by Azure service team.

Comments

@gro1m

gro1m commented Dec 13, 2021

az feedback auto-generates most of the information requested below, as of CLI version 2.0.62

Describe the bug
Cannot stream Azure ML job logs with Azure CLI version 2.31.0 and Azure ML CLI 2.0.3.

To Reproduce
Answer: az ml job stream --name '$JOB_NAME'
ERROR: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)

az ml job download --name '$JOB_NAME'
ERROR: This job is in state Preparing. Download is allowed only in states ['Completed', 'Failed', 'Canceled', 'NotResponding', 'Paused']

Expected behavior
No error messages: logs should stream and files should download successfully. The second error occurs when the streaming step is removed; the workflow should actually wait for the job to finish before downloading.
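
The state list in the second error suggests a possible workaround while streaming is broken: poll the job state and only download once it is in one of the listed states. A minimal Python sketch of that idea follows. `get_state` is a hypothetical callable (it would wrap whatever reports the job status in your setup), and the polling interval and timeout are arbitrary assumptions, not az ml defaults.

```python
import time

# States copied from the error message printed by `az ml job download`.
DOWNLOADABLE_STATES = {"Completed", "Failed", "Canceled", "NotResponding", "Paused"}


def wait_until_downloadable(get_state, poll_seconds=30.0, timeout_seconds=3600.0,
                            sleep=time.sleep):
    """Poll get_state() until it returns a state in which download is allowed.

    get_state is a hypothetical zero-argument callable returning the job's
    current state as a string; it is not part of the az ml CLI or SDK.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state in DOWNLOADABLE_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still in state {state!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated state sequence, standing in for a real status lookup.
    states = iter(["Preparing", "Running", "Completed"])
    print(wait_until_downloadable(lambda: next(states), sleep=lambda _: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.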

Environment summary
Docker:
Ubuntu 18.04
CLI version: 2.31.0
Shell Type: Bash
ML CLI version: 2.0.3

Additional context
This works with Azure CLI 2.30.0, but we cannot use that version because of an issue that was only resolved in 2.31.0: #20628.

I also opened a Microsoft Support Request; the ML CLI team confirmed that this has to be an issue with the CLI itself and that I should raise it here.

@ghost ghost added needs-triage This is a new issue that needs to be triaged to the appropriate team. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Dec 13, 2021
@jiasli jiasli added Machine Learning az ml Service Attention This issue is responsible by Azure service team. labels Dec 14, 2021
@ghost ghost removed the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Dec 14, 2021
@ghost

ghost commented Dec 14, 2021

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.

@jiasli
Member

jiasli commented Dec 14, 2021

Routing to service team.

@needuv
Member

needuv commented Dec 14, 2021

Hi @gro1m, could you include the output when you run your commands with the --debug flag? For example: az ml job stream --name $JOB_NAME --debug. This should help us locate where in the code the error is being thrown.

@gro1m
Author

gro1m commented Dec 15, 2021

Hi @needuv
Here you are:

Traceback (most recent call last):
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/job.py", line 192, in ml_job_stream
    ml_client.jobs.stream(name=name)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_operations.py", line 304, in stream
    self._runs, job_object, self._all_operations.all_operations[AzureMLResourceType.DATASTORE]
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 248, in stream_logs_until_completion
    _incremental_print(content, processed_logs, current_log, file_handle)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 99, in _incremental_print
    fileout.write(line + "\n")
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 41, in write
    self.__convertor.write(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 190, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
ERROR: cli: None
DEBUG: cli: Traceback (most recent call last):
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/job.py", line 192, in ml_job_stream
    ml_client.jobs.stream(name=name)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_operations.py", line 304, in stream
    self._runs, job_object, self._all_operations.all_operations[AzureMLResourceType.DATASTORE]
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 248, in stream_logs_until_completion
    _incremental_print(content, processed_logs, current_log, file_handle)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 99, in _incremental_print
    fileout.write(line + "\n")
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 41, in write
    self.__convertor.write(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 190, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
DEBUG: cli.azure.cli.core.util: azure.cli.core.util.handle_exception is called with an exception:
DEBUG: cli.azure.cli.core.util: Traceback (most recent call last):
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/job.py", line 192, in ml_job_stream
    ml_client.jobs.stream(name=name)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_operations.py", line 304, in stream
    self._runs, job_object, self._all_operations.all_operations[AzureMLResourceType.DATASTORE]
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 248, in stream_logs_until_completion
    _incremental_print(content, processed_logs, current_log, file_handle)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ml/_operations/job_ops_helper.py", line 99, in _incremental_print
    fileout.write(line + "\n")
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 41, in write
    self.__convertor.write(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 190, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "/opt/az/lib/python3.6/site-packages/colorama/ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/az/lib/python3.6/site-packages/knack/cli.py", line 231, in invoke
    cmd_result = self.invocation.execute(args)
  File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 658, in execute
    raise ex
  File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 721, in _run_jobs_serially
    results.append(self._run_job(expanded_arg, cmd_copy))
  File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 692, in _run_job
    result = cmd_copy(params)
  File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 328, in __call__
    return self.handler(*args, **kwargs)
  File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
    return op(**command_args)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/job.py", line 194, in ml_job_stream
    log_and_raise_error(err, debug)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/raise_error.py", line 89, in log_and_raise_error
    _raise_cli_error(error)
  File "/home/dockeruser/azure-cli-extensions/ml/azext_mlv2/manual/custom/raise_error.py", line 68, in _raise_cli_error
    raise CLIError(message)
knack.util.CLIError: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
ERROR: cli.azure.cli.core.azclierror: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
ERROR: az_command_data_logger: 'ascii' codec can't encode character '\u2018' in position 11: ordinal not in range(128)
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7fb572ea68c8>]
INFO: az_command_data_logger: exit code: 1
INFO: cli.__main__: Command ran in 440.559 seconds (init: 0.125, invoke: 440.434)
INFO: telemetry.save: Save telemetry record of length 3149 in cache
INFO: telemetry.check: Returns Positive.
INFO: telemetry.main: Begin creating telemetry upload process.
INFO: telemetry.process: Creating upload process: "/usr/bin/../../opt/az/bin/python3 /opt/az/lib/python3.6/site-packages/azure/cli/telemetry/__init__.py /home/dockeruser/.azure"
INFO: telemetry.process: Return from creating process
INFO: telemetry.main: Finish creating telemetry upload process.
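
The innermost frames show a plain text write to a stream whose encoding is ASCII rejecting U+2018 (a left single quotation mark). The failure can be reproduced outside az ml with an in-memory stream. The log line below is invented, chosen so that the quote lands at index 11 as in the report, and `try_write` is a name made up for this sketch.

```python
import io


def try_write(line, encoding):
    """Write line to an in-memory text stream; return the error, or None on success."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding, newline="\n")
    try:
        stream.write(line + "\n")  # same call shape as the failing frame
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return None


if __name__ == "__main__":
    # Invented log line; U+2018 sits at index 11 of this string.
    line = "Downloaded \u2018model.pkl\u2019"
    for encoding in ("ascii", "utf-8"):
        print(encoding, "->", try_write(line, encoding) or "ok")
```

The same line writes cleanly once the stream uses UTF-8, which points at the output encoding of the environment rather than at the log content.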

@gro1m
Author

gro1m commented Jan 5, 2022

Hi @needuv
I saw that Azure CLI v2.32.0 has been released - were you able to do anything with the information I gave you above or do you need more?

@gro1m
Author

gro1m commented Feb 16, 2022

@jiasli @needuv
Is there any update? Unfortunately, this is a serious issue for us.
It would also be great if az ml (v2) did not stay in preview for much longer...

@singankit

@gro1m Can you please try updating the az ml extension to the latest version (2.0.7)?

@brandonwatts

brandonwatts commented Feb 18, 2022

This works in 2.0.7 but breaks again in the newest version, 2.1.1 (I am using Azure CLI version 2.33.1). It failed with the following error when attempting to download job logs:

Saving blob with prefix ExperimentRun/<RUNIDHERE>/ was unsuccessful. exception=empty separator

@gro1m
Author

gro1m commented Feb 18, 2022

This works in 2.0.7 but breaks again in the newest version, 2.1.1 (I am using Azure CLI version 2.33.1). It failed with the following error when attempting to download job logs:

Saving blob with prefix ExperimentRun/<RUNIDHERE>/ was unsuccessful. exception=empty separator

@bweben Could you verify again? As far as I know, it did not work with Azure CLI version 2.33.1 and Azure ML CLI 2.0.7.

@hnky

hnky commented Mar 20, 2022

I have the same error:

{
  "azure-cli": "2.34.1",
  "azure-cli-core": "2.34.1",
  "azure-cli-telemetry": "1.0.6",
  "extensions": {
    "ml": "2.1.2"
  }
}

globalai@SandboxHost-637833806236582481:~$ az ml job download -n 91f47b76-993a-4c19-9910-7c75e5a92a60
Command group 'ml job' is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
Downloading the job logs ExperimentRun/dcid.91f47b76-993a-4c19-9910-7c75e5a92a60/ at /home/ubuntu/91f47b76-993a-4c19-9910-7c75e5a92a60

Saving blob with prefix ExperimentRun/dcid.91f47b76-993a-4c19-9910-7c75e5a92a60/ was unsuccessful. exception=empty separator
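
For what it is worth, "empty separator" is the message Python itself raises (as a ValueError) when str.split, str.rsplit, or str.partition is called with an empty string as the separator. Where that happens inside the download code is not visible from this output; the sketch below only shows where the message comes from, using a made-up blob name and a helper name (`split_message`) introduced for illustration.

```python
def split_message(text, separator):
    """Return the pieces of text, or the ValueError message if split fails."""
    try:
        return text.split(separator)
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    path = "ExperimentRun/dcid.example/logs.txt"  # made-up blob name
    print(split_message(path, "/"))  # normal split on a non-empty separator
    print(split_message(path, ""))   # empty separator triggers the ValueError
```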

@needuv
Member

needuv commented Mar 21, 2022

Hi @gro1m, apologies for the delay. Could you please check the encoding your terminal is using for stdout? You can check it in Python by evaluating sys.stdout.encoding (after import sys) in an interactive Python shell in your terminal. To get you unblocked, could you switch your terminal's stdout encoding to UTF-8 and see if that works?
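
The encoding check, plus one way to force UTF-8 output from Python code you control, can be sketched as follows. `utf8_text_stream` is a helper name introduced here for illustration. For a program you cannot modify, Python's documented PYTHONIOENCODING environment variable (for example PYTHONIOENCODING=utf-8) overrides the stdio encoding at interpreter start; whether that resolves the az ml case was not confirmed in this thread.

```python
import io
import sys


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so text written to it is always encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n",
                            line_buffering=True)


if __name__ == "__main__":
    # An ASCII value here (e.g. 'ascii' or 'ANSI_X3.4-1968') would be
    # consistent with the UnicodeEncodeError in the report.
    print("stdout encoding:", sys.stdout.encoding)
    sys.stdout.flush()

    out = utf8_text_stream(sys.stdout.buffer)
    out.write("left quote: \u2018\n")
    out.flush()
    out.detach()  # leave the underlying sys.stdout.buffer open
```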

@hnky, @brandonwatts could you please share your outputs when you run the az ml job download command with the --debug flag (please make sure you scrub any secrets from the log output)?

@yonzhan yonzhan removed the question The issue doesn't require a change to the product in order to be resolved. Most issues start as that label Mar 21, 2022
@RakeshMohanMSFT RakeshMohanMSFT added the needs-author-feedback More information is needed from author to address the issue. label Apr 22, 2022
@ghost ghost added the no-recent-activity There has been no recent activity on this issue. label Apr 29, 2022
@ghost

ghost commented Apr 29, 2022

Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

@ghost ghost closed this as completed May 14, 2022
@RakeshMohanMSFT RakeshMohanMSFT self-assigned this Jul 13, 2022
@ghost ghost removed the no-recent-activity There has been no recent activity on this issue. label Jul 13, 2022