Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation for new systemd configuration method #365

Merged
merged 8 commits into from
Jul 5, 2023

Conversation

SchSeba
Copy link
Collaborator

@SchSeba SchSeba commented Sep 29, 2022

Add a new configuration method called systemd mode.

With this new mode all the VFs are created before kubelet even starts this will allow the operator to have a much better recovery time after reboot.

it will also fix issues like kubernetes/kubernetes#109595

The downside of this solution and the reason why for now it will not be the "default" configuration mode is that every configuration change in sriov will require a node reboot.

After this PR is merged we will need to work on
#225
#226

to improve the configuration/reconfiguration/update time on large clusters

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

1 similar comment
@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@jjuele
Copy link
Contributor

jjuele commented Oct 28, 2022

Can one of the admins verify this patch?

@github-actions
Copy link

github-actions bot commented Nov 7, 2022

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

github-actions bot commented Nov 7, 2022

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

github-actions bot commented Dec 1, 2022

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@SchSeba SchSeba changed the title WIP: support new systemd method Implementation for new systemd configuration method Dec 1, 2022
pkg/host/host.go Outdated Show resolved Hide resolved
pkg/host/host.go Show resolved Hide resolved
@github-actions
Copy link

github-actions bot commented Dec 1, 2022

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

github-actions bot commented Dec 1, 2022

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@coveralls
Copy link

coveralls commented Dec 1, 2022

Pull Request Test Coverage Report for Build 5322664195

  • 138 of 997 (13.84%) changed or added relevant lines in 18 files are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-1.5%) to 24.285%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/service/utils.go 0 1 0.0%
pkg/webhook/webhook.go 0 1 0.0%
pkg/daemon/writer.go 0 2 0.0%
api/v1/helper.go 1 4 25.0%
pkg/daemon/plugin.go 0 4 0.0%
pkg/plugins/generic/generic_plugin.go 3 11 27.27%
pkg/utils/utils.go 0 9 0.0%
pkg/plugins/virtual/virtual_plugin.go 0 10 0.0%
pkg/utils/utils_virtual.go 0 19 0.0%
pkg/utils/mock/mock_command.go 0 24 0.0%
Files with Coverage Reduction New Missed Lines %
pkg/plugins/k8s/k8s_plugin.go 1 0%
pkg/utils/utils.go 2 6.44%
Totals Coverage Status
Change from base Build 5268039487: -1.5%
Covered Lines: 2056
Relevant Lines: 8466

💛 - Coveralls

@SchSeba
Copy link
Collaborator Author

SchSeba commented Feb 19, 2023

waiting for #409

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

api/v1/helper.go Outdated
type ConfigurationModeType string

const (
DaemonConfigurationMode = "daemon"
Copy link
Collaborator

@e0ne e0ne Mar 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to create types consts here:

DaemonConfigurationMode ConfigurationModeType  = "daemon"
SystemdConfigurationMode ConfigurationModeType = "systemd"

@@ -299,7 +319,7 @@ func (r *SriovOperatorConfigReconciler) deleteK8sResource(ctx context.Context, i

func (r *SriovOperatorConfigReconciler) syncK8sResource(ctx context.Context, cr *sriovnetworkv1.SriovOperatorConfig, in *uns.Unstructured) error {
switch in.GetKind() {
case "ClusterRole", "ClusterRoleBinding", "MutatingWebhookConfiguration", "ValidatingWebhookConfiguration":
case "ClusterRole", "ClusterRoleBinding", "MutatingWebhookConfiguration", "ValidatingWebhookConfiguration", machineConfigCRDName:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's move all CRDs name into constant

command:
- /bin/bash
- -c
- mkdir -p /host/var/lib/sriov/ && cp /usr/bin/sriov-network-config-daemon /host/var/lib/sriov/sriov-network-config-daemon && chcon -t bin_t /host/var/lib/sriov/sriov-network-config-daemon | true # Allow systemd to run the file, use pipe true to not failed if the system doesn't have selinux enabled
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

selinux => selinux or apparmor

}

if mlx4Err != nil && mlx5Err != nil {
glog.Errorf("tryEnableRdma(): no RDMA capable devices")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about Intel devices? will this check work for them?

return h.enableRDMAOnUbuntuMachine()
}

osName, err := h.getOSPrettyName()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 167 returns result if isRhelSystem
link 176 returns result if isUbuntuSystem

what other OSs are supported to get into this code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,41 @@
/*
Copyright 2021.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright 2023

@@ -5,6 +5,7 @@ import (
"strings"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it would be good to have more detailed commit message

@github-actions
Copy link

github-actions bot commented May 2, 2023

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

main.go Fixed Show fixed Hide fixed
[Install]
WantedBy=multi-user.target
enabled: true
name: sriov-config.service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking out loud, would we want to add templatisation for dependent services? Is it possible that some users may want to add services in the After= field based on their host OS setup?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great Idea the question is how we will expose that template to the user?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definately need to have this configurable. Let's think about thi in a follow-up PR

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
This is needed so we can static link everything in the binary

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

Copy link
Collaborator

@e0ne e0ne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added few comments and nits which could be addressed in a new revision of this PR or in a follow up PR.

I tested these changes on the vanilla k8s only

@@ -69,6 +69,7 @@ func main() {
var metricsAddr string
var enableLeaderElection bool
var probeAddr string

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is not needed

@@ -82,6 +83,7 @@ func main() {
ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

restConfig := ctrl.GetConfigOrDie()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is not needed

@@ -36,7 +36,7 @@ import (
"github.com/k8snetworkplumbingwg/sriov-network-operator/test/util/pod"
)

var waitingTime time.Duration = 20 * time.Minute
var waitingTime = 20 * time.Minute
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is not needed

[Install]
WantedBy=multi-user.target
enabled: true
name: sriov-config.service
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definately need to have this configurable. Let's think about thi in a follow-up PR

return i, nil
}
}
return sriovnetworkv1.Interface{}, fmt.Errorf("unable to find interface: %v", name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you return 'nil, err' here?

return h.enableRDMAOnUbuntuMachine()
}

osName, err := h.getOSPrettyName()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants