Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HostDeviceNetwork basic support #126

Merged
merged 1 commit into from
Mar 29, 2021
Merged

Conversation

e0ne
Copy link
Collaborator

@e0ne e0ne commented Mar 1, 2021

This patch adds HostDeviceNetwork CRD support and basic
transformation to NetworkAttachmentDefinition to allow
Multus create secondary network.

SR-IOV Network Device Plugin is deployed using Network Operator
Helm chart with default configuration.

NOTE: temporary added "//nolint:dupl" before code will be
refactored.

Closes: #44

@e0ne e0ne force-pushed the host-net branch 3 times, most recently from 03bbd3b to ff532e6 Compare March 2, 2021 15:21
api/v1alpha1/hostdevicenetwork_types.go Outdated Show resolved Hide resolved
pkg/state/factory.go Outdated Show resolved Hide resolved
pkg/state/state_hostdevice_network.go Outdated Show resolved Hide resolved
@e0ne e0ne force-pushed the host-net branch 5 times, most recently from b855226 to 4ddc2d2 Compare March 2, 2021 22:39
@moshe010
Copy link
Collaborator

moshe010 commented Mar 3, 2021

can you add in the commit msg that you are fix #44

@e0ne e0ne force-pushed the host-net branch 5 times, most recently from 7f14ced to 255a38a Compare March 9, 2021 19:33
@e0ne e0ne changed the title [WIP] Add HostDeviceNetwork basic support Add HostDeviceNetwork basic support Mar 9, 2021
@e0ne
Copy link
Collaborator Author

e0ne commented Mar 9, 2021

Removed WIP to get more reviews. It's still required to remove duplicated code and add unit-tests

api/v1alpha1/nicclusterpolicy_types.go Show resolved Hide resolved
config/crd/bases/mellanox.com_hostdevicenetworks.yaml Outdated Show resolved Hide resolved
api/v1alpha1/hostdevicenetwork_types.go Outdated Show resolved Hide resolved
api/v1alpha1/hostdevicenetwork_types.go Outdated Show resolved Hide resolved
api/v1alpha1/hostdevicenetwork_types.go Show resolved Hide resolved
nodeSelector:
beta.kubernetes.io/arch: amd64
tolerations:
- key: node-role.kubernetes.io/master
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in rdma shared DP we got the following tolerations:

      tolerations:
      # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
      # This, along with the annotation above marks this pod as a critical add-on.
      - key: CriticalAddonsOnly
        operator: Exists
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule

these should be aligned.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also i think we want:

  nodeSelector:
    feature.node.kubernetes.io/pci-15b3.present: "true"

as well.

)

const (
stateHostDeviceNetworkName = "state-HostDevice-Network"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i realize we are not consistent here throughout the states perhaps we should settle on state-word-another-word

later on aligning the rest WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Let's keep consistency in the code

// Either this state was not required to run or an update occurred and we need to remove
// the resources that where created.
// TODO: Support the latter case
// TODO: Support default configuration should be applied
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the default configurations ?

the original intention was that if this is nil then this state should be "undone" i.e removal of all yaml in this state's manifest folder

Copy link
Collaborator

@moshe010 moshe010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't review all. I will continue later

}

// +kubebuilder:object:root=true
// kubebuilder:object:generate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is kubebuilder:object:generate
why you don't +kubebuilder:object:generate= it ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works using both patterns but I agree that it's good to use only on across the project

api/v1alpha1/hostdevicenetwork_types.go Show resolved Hide resolved
}

// +kubebuilder:object:root=true
// kubebuilder:object:generate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

controllers/hostdevicenetwork_controller.go Show resolved Hide resolved
@moshe010 moshe010 mentioned this pull request Mar 12, 2021
34 tasks
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`

Spec *HostDeviceNetworkSpec `json:"spec,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

@e0ne e0ne Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

name: v1alpha1
schema:
openAPIV3Schema:
description: MacvlanNetwork is the Schema for the macvlannetworks API
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/HostDevice/MacvlanNetwork
s/macvlannetwork/hostdevicenetworks

@e0ne
Copy link
Collaborator Author

e0ne commented Mar 12, 2021

Note for reviewers: Most of the comments are addressed and squashed into one commit. HostDeviceNetwork scope and unit-tests will be added as a separate commits in this PR

@e0ne e0ne force-pushed the host-net branch 6 times, most recently from ae437e0 to 09f598a Compare March 15, 2021 10:13
api/v1alpha1/hostdevicenetwork_types.go Show resolved Hide resolved
config/rbac/role.yaml Show resolved Hide resolved
return nil, errors.Wrapf(err, "failed to create Shared Device plugin State")
}
sriovDpState, err := NewStateSriovDp(
k8sAPIClient, scheme, filepath.Join(manifestBaseDir, "stage-sriov-device-plugin"))
if err != nil {
return nil, errors.Wrapf(err, "failed to create Device plugin State")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Device plugin/SR-IOV Device plugin/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@e0ne
Copy link
Collaborator Author

e0ne commented Mar 17, 2021

/retest-nic_operator

type HostDeviceNetworkSpec struct {
// Namespace of the NetworkAttachmentDefinition custom resource
NetworkNamespace string `json:"networkNamespace,omitempty"`
// Endpoint resource name for SR-IOV resource pool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe reword to host device resource pool

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I borrowed it from the SR-IOV network device plugin documentation, but this name it good too

description: Namespace of the NetworkAttachmentDefinition custom resource
type: string
resourceName:
description: Endpoint resource name for SR-IOV resource pool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/SR-IOV/host device

@moshe010
Copy link
Collaborator

Also please add Closes #44 to commit msg

@e0ne
Copy link
Collaborator Author

e0ne commented Mar 17, 2021

Also please add Closes #44 to commit msg

It's already added. I'll add it to the PR description too

@e0ne e0ne force-pushed the host-net branch 4 times, most recently from f2a4e1a to a2322a2 Compare March 18, 2021 11:02
Copy link
Collaborator

@moshe010 moshe010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrianchiris can you please take a look

@e0ne e0ne force-pushed the host-net branch 5 times, most recently from 884366c to 5c73a4e Compare March 25, 2021 09:39
@@ -0,0 +1,86 @@
/*
Copyright 2020 NVIDIA
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2020/2021

@@ -0,0 +1,110 @@
/*
Copyright 2020 NVIDIA
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2020/2021

pkg/state/state_sriov_dp_test.go Show resolved Hide resolved
@@ -0,0 +1,209 @@
#!/bin/bash -e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't seem to be releated to PR also I see network-operator.0.3.1.tgz binary which should be removed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It was accidently added to the PR

This patch adds HostDeviceNetwork CRD support and basic
transformation to NetworkAttachmentDefinition to allow
Multus create secondary network.

SR-IOV Network Device Plugin is deployed using Network Operator
Helm chart with default configuration.

This patch also optimize NetworkAttachmentDef link generation
both for MacvlanNetwork and HostDeviceNetwork CRs.

We should add link for NetworkAttachmentDef only if
MacvlanNetwork or HostDeviceNetwork CR is in a 'Ready' state.
It allows us to make less Kubernetes API calls.

NOTE: temporary added "//nolint:dupl" before code will be
refactored.

Closes: Mellanox#44

Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
@moshe010 moshe010 merged commit 6788484 into Mellanox:master Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support hostDevice secondary network
3 participants