Azureml groups optimization #1732

Merged
merged 41 commits into from
Jun 20, 2022

Conversation

miguelgfierro
Collaborator

@miguelgfierro miguelgfierro commented Jun 7, 2022

Description

Optimizing the AzureML groups to maximize compute usage.

Could you please review, @pradnyeshjoshi?

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@miguelgfierro
Collaborator Author

@pradnyeshjoshi there is a failure:

"error": ***
        "code": "UserError",
        "message": "User program failed with KeyError: 'group_notebooks_cpu_002'",
        "messageParameters": ***,
        "detailsUri": "https://aka.ms/azureml-run-troubleshooting",
        "details": []
    ***,

It seems the system is trying to find a group that doesn't exist. Could you please take a look? If we are looping through a hardcoded list of groups, can we change that to a dynamic list, so that any new group we add is picked up automatically?

@miguelgfierro miguelgfierro mentioned this pull request Jun 13, 2022
4 tasks
@pradnyeshjoshi pradnyeshjoshi self-assigned this Jun 16, 2022
@pradnyeshjoshi
Collaborator

@miguelgfierro the workflows now dynamically retrieve the group names from test_groups.py. Could you please review?
I will optimize the groups further in another PR.
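
For readers following along, here is a minimal sketch of the idea, not the code from this PR. It assumes the group definitions are a dictionary mapping group names to lists of tests; `TEST_GROUPS`, `list_group_names`, `get_tests`, and the test paths are hypothetical names used only for illustration. Deriving the names from the dictionary means a newly added group is picked up without touching the workflow, and a stale name fails with a message that lists the valid groups.

```python
# Hypothetical stand-in for the group definitions in test_groups.py.
# The group names and test paths are illustrative, not the real ones.
TEST_GROUPS = {
    "group_cpu_001": ["tests/example_a.py::test_one"],
    "group_notebooks_cpu_001": ["tests/example_b.py::test_two"],
    "group_gpu_001": ["tests/example_c.py::test_three"],
}


def list_group_names(groups, prefix="group_"):
    """Return the group names defined in `groups`, sorted for stable output."""
    return sorted(name for name in groups if name.startswith(prefix))


def get_tests(groups, name):
    """Look up one group, failing with a message that lists the valid names."""
    try:
        return groups[name]
    except KeyError:
        raise KeyError(
            f"Unknown test group {name!r}; available: {list_group_names(groups)}"
        ) from None


# A hardcoded list can drift from the definitions and trigger a KeyError.
hardcoded = ["group_notebooks_cpu_001", "group_notebooks_cpu_002"]
missing = [name for name in hardcoded if name not in TEST_GROUPS]

# The dynamic list always matches what is actually defined.
dynamic = list_group_names(TEST_GROUPS)
```

Here `missing` contains `group_notebooks_cpu_002`, the same kind of stale name that caused the failure reported above, while every name in `dynamic` resolves through `get_tests`.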

@miguelgfierro
Collaborator Author

Looks good to me. I can't approve because I opened the PR.

@pradnyeshjoshi pradnyeshjoshi merged commit db22152 into staging Jun 20, 2022
@miguelgfierro miguelgfierro deleted the azureml_groups_opt branch June 21, 2022 07:25
2 participants