Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Switchdev for kernel 5.8 and above #75

Merged

Conversation

Mmduh-483
Copy link
Contributor

@Mmduh-483 Mmduh-483 commented Feb 24, 2021

Till kernel 5.7 VF representors are exposed as virtual devices, they are not linked to a PCI address.

Starting from kernel 5.8 the VF representors are linked to their PF PCI address, so /sys/bus/pci/devices//net contains the PF net device and it VF representors net devices.

The new kernel behaviour affects 'tryGetInterfaceName' if PF in switchdev and configure-switchdev script

This commit brings needed changes to cope with new kernel behviour.

@zshi-redhat
Copy link
Collaborator

@Mmduh-483 I don't quite understand the kernel change, do you have reference to the kernel patch?

@moshe010
Copy link
Collaborator

@zshi-redhat see this kernel change https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1. I think it will effect RH 8.4 and above
@Mmduh-483 - please update the commits msg so it will be clear what you are fixing

@Mmduh-483
Copy link
Contributor Author

@zshi-redhat I updated the commit message and PR description to be more clear

Copy link
Collaborator

@zshi-redhat zshi-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick review, I need to check if the kernel change is in RHEL-8.4, otherwise we may need to support old kernel behaviour.

names=()

while read pci_addr num_vfs
do
echo "Set $num_vfs VFs on device $pci_addr"

names+=($(ls /sys/bus/pci/devices/${pci_addr}/net/))
# Create udev rule to save PF name
add_udev_rule_for_sriov_pf $pci_addr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the argument of add_udev_rule_for_sriov_pf being pf name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You r right I missed it before pushing it, it should be like

 cat /etc/udev/rules.d/10-persistent-net.rules 
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", KERNELS=="0000:03:00.1", NAME="enp3s0f1np1"

fi
}

add_udev_rule_for_sriov_pf(){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this will keep the PF name as it is?
On openshift, it expects the PF name to be consistent w/ and w/o switchdev mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the udev rule is to save the name of the PF

@moshe010
Copy link
Collaborator

Just a quick review, I need to check if the kernel change is in RHEL-8.4, otherwise we may need to support old kernel behaviour.

anyway we need to support old kernel. @Mmduh-483 you change support both new kernel and old ones right?

@Mmduh-483
Copy link
Contributor Author

Mmduh-483 commented Feb 25, 2021

Just a quick review, I need to check if the kernel change is in RHEL-8.4, otherwise we may need to support old kernel behaviour.

anyway we need to support old kernel. @Mmduh-483 you change support both new kernel and old ones right?

It is supporting both

@zshi-redhat
Copy link
Collaborator

Starting from kernel 5.8 the VF representors are linked to their PF PCI address, so /sys/bus/pci/devices//net contains the PF net device and it VF representors net devices.

What does the output of /sys/bus/pci/devices/<pf-pci>/net look like?

@Mmduh-483
Copy link
Contributor Author

Starting from kernel 5.8 the VF representors are linked to their PF PCI address, so /sys/bus/pci/devices//net contains the PF net device and it VF representors net devices.

What does the output of /sys/bus/pci/devices/<pf-pci>/net look like?

~# ls /sys/bus/pci/devices/0000:03:00.0/net/
enp3s0f0np0 enp3s0f0np0_0 enp3s0f0np0_1 enp3s0f0np0_2

@zshi-redhat
Copy link
Collaborator

Starting from kernel 5.8 the VF representors are linked to their PF PCI address, so /sys/bus/pci/devices//net contains the PF net device and it VF representors net devices.

What does the output of /sys/bus/pci/devices/<pf-pci>/net look like?

~# ls /sys/bus/pci/devices/0000:03:00.0/net/
enp3s0f0np0 enp3s0f0np0_0 enp3s0f0np0_1 enp3s0f0np0_2

      name=$(ls /sys/bus/pci/devices/${pci_addr}/net/)
      names+=($name)
      # Create udev rule to save PF name
      add_udev_rule_for_sriov_pf $name

When add_udev_rule_for_sriov_pf $name is called, the first argument $1 is pf name, and the rest arguments (vf rep names) are not used?

# reset the pf name
new_name=$(ls /sys/bus/pci/devices/${pci_addr}/net/)
ip link set ${new_name} down
ip link set ${new_name} name ${names[i]}
ip link set ${names[i]} up
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${names[i]} could be a list of pfName + vfRepNames, is it expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, the names where collected in the first loop before the representors created

@Mmduh-483
Copy link
Contributor Author

Mmduh-483 commented Feb 25, 2021

When add_udev_rule_for_sriov_pf $name is called, the first argument $1 is pf name, and the rest arguments (vf rep names) are not used?

The representors are not created yet at this stage

@@ -38,6 +39,8 @@ const (
var InitialState sriovnetworkv1.SriovNetworkNodeState
var ClusterType string

var pfPhysPortNameRe = regexp.MustCompile(`p\d+`)
Copy link
Collaborator

@zshi-redhat zshi-redhat Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regexp matches all the following names: enp3s0f0np0 enp3s0f0np0_0 enp3s0f0np0_1 enp3s0f0np0_2
Maybe put a $ at the end of expression to match the PF name only? regexp.MustCompile(p\d+$)

Copy link
Contributor Author

@Mmduh-483 Mmduh-483 Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regex is for /sys/class/net/<netdev>/phys_port_name not the interface name
For PFs it is p/d+ e.g cat /sys/class/net/enp3s0f0np0/phys_port_name p0
For VF representors it is pf/d+vf/d+ e.g cat /sys/class/net/enp3s0f0np0_2/phys_port_name pf0vf2

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack+

@zshi-redhat
Copy link
Collaborator

overall change looks good to me, I'll run some test on openshift and comment if any issues.

@pliurh pliurh mentioned this pull request Mar 4, 2021
6 tasks
if [ ! -f $input ]; then
echo "File /etc/switchdev.conf not exist."
exit
fi

append_to_file(){
if ! test -f "$2"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be made more readable ? please assign meaningful local vars to $1 $2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Till kernel 5.7 VF representors are exposed as virtual devices,
they are not linked to a PCI address.

Starting from kernel 5.8 the VF representors are linked to their
PF PCI address, so /sys/bus/pci/devices/<pf-pci-address>/net contains
the PF net device and it VF representors net devices.

The new kernel behaviour affects 'tryGetInterfaceName' if PF in switchdev
and configure-switchdev script

This commit brings needed changes to cope with new kernel behviour.

Signed-off-by: Mamduh Alassi <mamduhala@nvidia.com>
@Mmduh-483
Copy link
Contributor Author

@zshi-redhat how did the test go? Are there any comments?

@zshi-redhat
Copy link
Collaborator

@zshi-redhat how did the test go? Are there any comments?

Sorry for the late update, I didn't run into any issue after applying this PR.

@zshi-redhat zshi-redhat merged commit 7ad0f8e into k8snetworkplumbingwg:master Mar 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants