docker_container: DeviceRequests.Capabilities are improperly validated #42

Closed
skokec opened this issue Dec 7, 2020 · 7 comments · Fixed by #43

Comments

@skokec

skokec commented Dec 7, 2020

SUMMARY

When device_requests is used with the capabilities option for an NVIDIA GPU in docker_container, Ansible returns an error indicating that the capabilities are not formatted properly. This happens even with the example provided in the documentation.

The issue seems to be around line 1441 in docker_container.py, where the capabilities list is validated. This validation expects capabilities to be a 'list of list of list of string' instead of a 'list of list of string'. If I provide capabilities: [[[gpu]]], validation passes, but I then get a Docker marshal error indicating that this is not what the Docker API expects.
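
The mismatch described above can be reproduced outside Ansible. The following is a minimal sketch, not the module's actual code: `normalize_too_deep` is a hypothetical validator that descends one nesting level too far, so with the documented 'list of list of string' shape it ends up iterating over the characters of a string and trying to assign back into it.

```python
def normalize_too_deep(capabilities):
    """Hypothetical validator that descends one nesting level too far."""
    for or_list in capabilities:
        for and_list in or_list:
            # With [["gpu", "compute"]], and_list is already the string "gpu".
            for index, term in enumerate(and_list):
                # Iterating a string yields characters; assigning back into it fails.
                and_list[index] = str(term)
    return capabilities


def try_normalize(capabilities):
    """Return the TypeError message raised by the validator, or None on success."""
    try:
        normalize_too_deep(capabilities)
    except TypeError as exc:
        return str(exc)
    return None


documented = try_normalize([["gpu", "compute"]])  # shape from the documentation
over_nested = try_normalize([[["gpu"]]])          # one extra level of nesting
print(documented)
print(over_nested)
```

Under these assumptions, the documented shape raises a TypeError about item assignment on a str, while the over-nested shape passes validation, which matches the behaviour reported here.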

A simple patch fixes this by removing line 1441 in docker_container.py and updating the variable names. With this patch I do not get any errors, and the container is correctly created with the requested devices.
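
As an illustration of what validation at the correct depth could look like, here is a hedged sketch. It is not the patch itself: `normalize_capabilities` and its error messages are hypothetical, and it only assumes that capabilities should be a list of lists of strings, as stated above.

```python
def normalize_capabilities(capabilities):
    """Check that capabilities is a list of lists of strings and return a copy."""
    normalized = []
    for or_index, and_list in enumerate(capabilities):
        if not isinstance(and_list, list):
            raise ValueError(
                f"capabilities[{or_index}] must be a list of strings"
            )
        for and_index, term in enumerate(and_list):
            if not isinstance(term, str):
                raise ValueError(
                    f"capabilities[{or_index}][{and_index}] must be a string"
                )
        # Copy the inner list so the caller's data is left untouched.
        normalized.append(list(and_list))
    return normalized


accepted = normalize_capabilities([["gpu", "compute"]])

try:
    normalize_capabilities([[["gpu"]]])
    rejected = False
except ValueError:
    rejected = True

print(accepted, rejected)
```

With this shape check, the documented value is accepted and the over-nested value is rejected before it reaches the Docker API.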

ISSUE TYPE
  • Bug Report
COMPONENT NAME

docker_container

ANSIBLE VERSION
ansible 2.10.4
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/domen/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.17 (default, Sep 30 2020, 13:38:04) [GCC 7.5.0]

CONFIGURATION

OS / ENVIRONMENT

Linux ubuntu 4.15.0-118-generic #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

STEPS TO REPRODUCE
- name: Start container with GPUs
  community.docker.docker_container:
    name: test
    image: ubuntu:18.04
    state: started
    device_requests:
      - # Add nVidia GPUs to this container
        driver: nvidia
        count: -1  # this means we want all
        capabilities:
          - - gpu
            - compute
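
For orientation, the task above should correspond to one entry in the container's HostConfig.DeviceRequests. The sketch below builds such an entry as a plain dictionary. The helper `to_api_device_request` is hypothetical, and the key names (Driver, Count, DeviceIDs, Capabilities, Options) are my understanding of the Docker Engine API rather than something stated in this issue, so treat them as assumptions.

```python
def to_api_device_request(option):
    """Map one device_requests item to a DeviceRequests-style dict (assumed keys)."""
    return {
        "Driver": option.get("driver") or "",
        "Count": option.get("count", 0),
        "DeviceIDs": list(option.get("device_ids") or []),
        # Capabilities stays a list of lists of strings, with no extra nesting.
        "Capabilities": [list(and_list) for and_list in option.get("capabilities") or []],
        "Options": dict(option.get("options") or {}),
    }


request = to_api_device_request(
    {
        "driver": "nvidia",
        "count": -1,  # -1 requests all available devices
        "capabilities": [["gpu", "compute"]],
    }
)
print(request)
```
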
EXPECTED RESULTS

Deploying container only on specific devices without errors

ACTUAL RESULTS

The following error:

Traceback (most recent call last):
  File \"/home/ubuntu/.ansible/tmp/ansible-tmp-1607364771.4-34439-220056238326582/AnsiballZ_docker_container.py\", line 102, in <module>
    _ansiballz_main()
  File \"/home/ubuntu/.ansible/tmp/ansible-tmp-1607364771.4-34439-220056238326582/AnsiballZ_docker_container.py\", line 94, in _ansiballz_main
    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
  File \"/home/ubuntu/.ansible/tmp/ansible-tmp-1607364771.4-34439-220056238326582/AnsiballZ_docker_container.py\", line 40, in invoke_module
    runpy.run_module(mod_name='ansible_collections.community.docker.plugins.modules.docker_container', init_globals=None, run_name='__main__', alter_sys=True)
  File \"/usr/lib/python3.6/runpy.py\", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File \"/usr/lib/python3.6/runpy.py\", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File \"/usr/lib/python3.6/runpy.py\", line 85, in _run_code
    exec(code, run_globals)
  File \"/tmp/ansible_community.docker.docker_container_payload_vj3muuqt/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py\", line 3538, in <module>
  File \"/tmp/ansible_community.docker.docker_container_payload_vj3muuqt/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py\", line 3529, in main
  File \"/tmp/ansible_community.docker.docker_container_payload_vj3muuqt/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py\", line 2682, in __init__
  File \"/tmp/ansible_community.docker.docker_container_payload_vj3muuqt/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/modules/docker_container.py\", line 1447, in __init__
TypeError: 'str' object does not support item assignment

@felixfontein
Collaborator

Thanks for reporting! I'll take a look at this later today or tomorrow. I would be glad if you could test a PR. (I don't have the means to properly test the option.)

@skokec
Author

skokec commented Dec 8, 2020

Sure, I can test the PR. Let me know when to do so.

@felixfontein
Collaborator

@skokec please test it as soon as you can :)

@skokec
Author

skokec commented Dec 8, 2020

It works OK now.

I've tested your branch fix-docker_container-device_requests with the following:

- hosts: <hostname>
  become: yes
  
  tasks:
    - name: Start container with GPUs
      community.docker.docker_container:
        name: test
        image: nvidia/cuda:10.1-runtime-ubuntu18.04
        state: started
        command: 'nvidia-smi -L'
        detach: false
        device_requests:
          - # Add nVidia GPUs to this container
            driver: nvidia
            device_ids: 
              - '0'
              - '1'
            capabilities:
              - ['gpu','nvidia']
      register: docker_container_output      
              
    - name: Show test output
      debug:
        msg: "{{ docker_container_output.container.Output  }}"

which correctly allocates the first two GPUs:

TASK [Show test output] *******************************************************************************************
ok: [hostname] => {}

MSG:

GPU 0: GeForce RTX 2080 Ti (UUID: GPU-5185bf4e-d65e-9559-66aa-255ce2afeeb4)
GPU 1: GeForce RTX 2080 Ti (UUID: GPU-00ef98af-750e-2db7-ccbe-59054f6c8054)


PLAY RECAP ********************************************************************************************************
hostname                : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

@felixfontein
Collaborator

@skokec thanks! :)

@felixfontein
Collaborator

I merged the PR and plan to release a new version of this collection later this week.

@felixfontein
Collaborator

FYI, community.docker 1.0.1 has been released.
