Release Network-Operator v0.4.0 (Mellanox#137)
Signed-off-by: Ivan Kolodiazhny <ikolodiazhny@nvidia.com>
e0ne committed Mar 31, 2021
1 parent 61a9584 commit ebe5e20
Showing 4 changed files with 148 additions and 24 deletions.
61 changes: 41 additions & 20 deletions README.md
@@ -1,10 +1,10 @@
# Nvidia Mellanox Network Operator Helm Chart
# Nvidia Network Operator Helm Chart

Nvidia Mellanox Network Operator Helm Chart provides an easy way to install, configure and manage
Nvidia Network Operator Helm Chart provides an easy way to install, configure and manage
the lifecycle of Nvidia Mellanox network operator.

## Nvidia Mellanox Network Operator
Nvidia Mellanox Network Operator leverages [Kubernetes CRDs](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
## Nvidia Network Operator
Nvidia Network Operator leverages [Kubernetes CRDs](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
and [Operator SDK](https://github.com/operator-framework/operator-sdk) to manage networking-related components in order to enable fast networking,
RDMA and GPUDirect for workloads in a Kubernetes cluster.
Network Operator works in conjunction with [GPU-Operator](https://github.com/NVIDIA/gpu-operator) to enable GPU-Direct RDMA
@@ -102,9 +102,9 @@ chart parameters to deploy it together with the operator.

## Helm Tests

Network Operator has Helm tests to verify deployment. To run tests it is required to set the following chart parameters on helm install/upgrade: `deployCR`, `devicePlugin`, `secondaryNetwork` as the test depends on `NicClusterPolicy` instance being deployed by Helm.
Network Operator has Helm tests to verify deployment. To run tests it is required to set the following chart parameters on helm install/upgrade: `deployCR`, `rdmaSharedDevicePlugin`, `secondaryNetwork` as the test depends on `NicClusterPolicy` instance being deployed by Helm.
Supported Tests:
- Device Plugin Resource: This test creates a pod that requests the first resource in `devicePlugin.resources`
- Device Plugin Resource: This test creates a pod that requests the first resource in `rdmaSharedDevicePlugin.resources`
- RDMA Traffic: This test creates a pod that tests loopback RDMA traffic with `rping`
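
As a rough illustration of the requirement above, a values file that enables the Helm tests might look like the sketch below. The top-level keys `deployCR`, `rdmaSharedDevicePlugin` and `secondaryNetwork` come from the text; the `secondaryNetwork.deploy` key and the resource name and interface are assumptions for illustration and should be checked against the chart's `values.yaml`.

```yaml
# Hypothetical values for running the Helm tests (a sketch, not the chart defaults).
deployCR: true                       # deploy a NicClusterPolicy instance via Helm
rdmaSharedDevicePlugin:
  deploy: true
  resources:
    # The Device Plugin Resource test requests the first entry in this list.
    - name: rdma_shared_device_a     # illustrative resource name
      ifNames: [ib0]                 # illustrative interface name
secondaryNetwork:
  deploy: true                       # assumed key; verify against the chart
```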

Run the Helm test with the following command after deploying the Network Operator with Helm:
@@ -152,7 +152,7 @@ Production cluster environment can deny direct access to the Internet and instea
| `ofedDriver.deploy` | bool | `false` | deploy Mellanox OFED driver container |
| `ofedDriver.repository` | string | `mellanox` | Mellanox OFED driver image repository |
| `ofedDriver.image` | string | `mofed` | Mellanox OFED driver image name |
| `ofedDriver.version` | string | `5.2-1.0.4.0` | Mellanox OFED driver version |
| `ofedDriver.version` | string | `5.3-1.0.0.1` | Mellanox OFED driver version |

#### NVIDIA Peer memory driver

@@ -168,11 +168,11 @@ Production cluster environment can deny direct access to the Internet and instea

| Name | Type | Default | description |
| ---- | ---- | ------- | ----------- |
| `devicePlugin.deploy` | bool | `true` | Deploy device plugin |
| `devicePlugin.repository` | string | `mellanox` | Device plugin image repository |
| `devicePlugin.image` | string | `k8s-rdma-shared-dev-plugin` | Device plugin image name |
| `devicePlugin.version` | string | `v1.1.0` | Device plugin version |
| `devicePlugin.resources` | list | See below | Device plugin resources |
| `rdmaSharedDevicePlugin.deploy` | bool | `true` | Deploy RDMA Shared device plugin |
| `rdmaSharedDevicePlugin.repository` | string | `mellanox` | RDMA Shared device plugin image repository |
| `rdmaSharedDevicePlugin.image` | string | `k8s-rdma-shared-dev-plugin` | RDMA Shared device plugin image name |
| `rdmaSharedDevicePlugin.version` | string | `v1.1.0` | RDMA Shared device plugin version |
| `rdmaSharedDevicePlugin.resources` | list | See below | RDMA Shared device plugin resources |

##### RDMA Device Plugin Resource configurations

@@ -189,6 +189,27 @@ resources:
vendors: [15b3]
deviceIDs: [1017]
ifNames: [ib0, ib1]
```

#### SR-IOV Network Device plugin

| Name | Type | Default | description |
| ---- | ---- | ------- | ----------- |
| `sriovDevicePlugin.deploy` | bool | `true` | Deploy SR-IOV Network device plugin |
| `sriovDevicePlugin.repository` | string | `docker.io/nfvpe` | SR-IOV Network device plugin image repository |
| `sriovDevicePlugin.image` | string | `sriov-device-plugin` | SR-IOV Network device plugin image name |
| `sriovDevicePlugin.version` | string | `v3.3` | SR-IOV Network device plugin version |
| `sriovDevicePlugin.resources` | list | See below | SR-IOV Network device plugin resources |

##### SR-IOV Network Device Plugin Resource configurations

Consists of a list of RDMA resources each with a name and selector of RDMA capable network devices
to be associated with the resource. Refer to [SR-IOV Network Device Plugin Selectors](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin#device-selectors) for supported selectors.

```
resources:
  - name: hostdev
    vendors: [15b3]
```
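
Put together, the table and the resource list above could be expressed as a values fragment along these lines. This is a sketch: the keys and defaults are copied from the table, while the nesting is assumed to mirror the `rdmaSharedDevicePlugin` examples elsewhere in this README.

```yaml
# Sketch of an SR-IOV Network device plugin section in values.yaml.
sriovDevicePlugin:
  deploy: true
  repository: docker.io/nfvpe
  image: sriov-device-plugin
  version: v3.3
  resources:
    - name: hostdev
      vendors: [15b3]    # PCI vendor ID used in the example above
```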

>__Note__: The parameters listed are non-exhaustive; for the full list of chart parameters refer to
@@ -253,10 +274,10 @@ __values.yaml:__
deployCR: true
ofedDriver:
deploy: true
version: 5.2-1.0.4.0
devicePlugin:
version: 5.3-1.0.0.1
rdmaSharedDevicePlugin:
deploy: true
reources:
resources:
- name: rdma_shared_device_a
ifNames: [enp1]
```
@@ -272,9 +293,9 @@ ofedDriver:
deploy: true
nvPeerDriver:
deploy: true
devicePlugin:
rdmaSharedDevicePlugin:
deploy: true
reources:
resources:
- name: rdma_shared_device_a
ifNames: [enp1, enp2]
- name: rdma_shared_device_b
@@ -292,9 +313,9 @@ Network Operator deployment with:
__values.yaml:__
```:yaml
deployCR: true
devicePlugin:
rdmaSharedDevicePlugin:
deploy: true
reources:
resources:
- name: rdma_shared_device_a
ifNames: [ib0]
secondaryNetwork:
@@ -314,7 +335,7 @@ mapped to Mellanox ConnectX-5.
__values.yaml:__
```:yaml
deployCR: true
devicePlugin:
rdmaSharedDevicePlugin:
deploy: true
resources:
- name: rdma_shared_device_a
1 change: 0 additions & 1 deletion index.yaml

This file was deleted.

78 changes: 78 additions & 0 deletions index.yaml
@@ -0,0 +1,78 @@
apiVersion: v1
entries:
  network-operator:
  - apiVersion: v2
    appVersion: v0.4.0
    created: "2021-03-31T23:04:44.482706333+03:00"
    dependencies:
    - condition: nfd.enabled
      name: node-feature-discovery
      repository: ""
      version: 0.1.0
    - condition: sriovNetworkOperator.enabled
      name: sriov-network-operator
      repository: ""
      version: 0.1.0
    description: Nvidia Mellanox network operator
    digest: 027f50616a6c484077df2a2a6db8f33e492e0224050dac4cbe44553cc158f817
    home: https://mellanox.github.io/network-operator
    keywords:
    - gpu-direct
    - rdma
    kubeVersion: '>= 1.17.0'
    name: network-operator
    sources:
    - https://github.com/Mellanox/network-operator
    type: application
    urls:
    - https://mellanox.github.io/network-operator/release/release/network-operator-0.4.0.tgz
    version: 0.4.0
  - apiVersion: v2
    appVersion: v0.3.0
    created: "2021-03-31T23:04:44.481083565+03:00"
    dependencies:
    - condition: nfd.enabled
      name: node-feature-discovery
      repository: ""
      version: 0.1.0
    - condition: sriovNetworkOperator.enabled
      name: sriov-network-operator
      repository: ""
      version: 0.1.0
    description: Nvidia Mellanox network operator
    digest: f789a6160feed316f7425d0b2fc615ece98ae6281fb5a110ae62b8ce78f1981c
    home: https://mellanox.github.io/network-operator
    keywords:
    - gpu-direct
    - rdma
    kubeVersion: '>= 1.17.0'
    name: network-operator
    sources:
    - https://github.com/Mellanox/network-operator
    type: application
    urls:
    - https://mellanox.github.io/network-operator/release/release/network-operator-0.3.0.tgz
    version: 0.3.0
  - apiVersion: v2
    appVersion: v0.2.0
    created: "2021-03-31T23:04:44.479560496+03:00"
    dependencies:
    - condition: nfd.enabled
      name: node-feature-discovery
      repository: ""
      version: 0.1.0
    description: Nvidia Mellanox network operator
    digest: 40a83770531e342c0906f3ba644e8e9f10f4c71385af077240d9ed18d417502f
    home: https://mellanox.github.io/network-operator
    keywords:
    - gpu-direct
    - rdma
    kubeVersion: '>= 1.15.0'
    name: network-operator
    sources:
    - https://github.com/Mellanox/network-operator
    type: application
    urls:
    - https://mellanox.github.io/network-operator/release/release/network-operator-0.2.0-beta.tgz
    version: 0.2.0-beta
generated: "2021-03-31T23:04:44.478339697+03:00"
32 changes: 29 additions & 3 deletions release/index.yaml
@@ -1,9 +1,35 @@
apiVersion: v1
entries:
network-operator:
- apiVersion: v2
appVersion: v0.4.0
created: "2021-04-01T00:43:59.383708975+03:00"
dependencies:
- condition: nfd.enabled
name: node-feature-discovery
repository: ""
version: 0.1.0
- condition: sriovNetworkOperator.enabled
name: sriov-network-operator
repository: ""
version: 0.1.0
description: Nvidia Mellanox network operator
digest: 0d39d335e2fd82135d32df856b2d13814cf82fe1beae42fe817b9290caa0dd29
home: https://mellanox.github.io/network-operator
keywords:
- gpu-direct
- rdma
kubeVersion: '>= 1.17.0'
name: network-operator
sources:
- https://github.com/Mellanox/network-operator
type: application
urls:
- https://mellanox.github.io/network-operator/release/network-operator-0.4.0.tgz
version: 0.4.0
- apiVersion: v2
appVersion: v0.3.0
created: "2021-03-02T17:15:54.329405714+02:00"
created: "2021-04-01T00:43:59.382094173+03:00"
dependencies:
- condition: nfd.enabled
name: node-feature-discovery
@@ -29,7 +55,7 @@ entries:
version: 0.3.0
- apiVersion: v2
appVersion: v0.2.0
created: "2021-01-05T12:36:04.481706778Z"
created: "2021-04-01T00:43:59.380532262+03:00"
dependencies:
- condition: nfd.enabled
name: node-feature-discovery
@@ -49,4 +75,4 @@ entries:
urls:
- https://mellanox.github.io/network-operator/release/network-operator-0.2.0-beta.tgz
version: 0.2.0-beta
generated: "2021-03-02T17:15:54.323340996+02:00"
generated: "2021-04-01T00:43:59.379260819+03:00"
Binary file added release/network-operator-0.4.0.tgz
Binary file not shown.
