
Better support for openshift single node #213

Merged

Conversation

Collaborator

@SchSeba SchSeba commented Dec 7, 2021

This PR allows the operator to still pause the MCP when running on OCP, even on a single-node cluster where drain is disabled.

Another change sets DisableDrain in the SriovOperatorConfig when there is only one node in the cluster.

This commit allows the daemon to still pause the MCP when running on OCP.
This is needed when there is only one node in the cluster and drainSkip is true.

Signed-off-by: Sebastian Sch <sebassch@gmail.com>
return reconcile.Result{}, err
}

disableDrain := len(nodeList.Items) == 1

We should check that controlPlaneTopology = SingleReplica

Collaborator Author

The machineconfiguration.openshift.io/controlPlaneTopology: SingleReplica is an MCO label. I want this feature to also work for plain k8s.
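
For reference, a rough sketch of the OpenShift-only alternative mentioned above, reading controlPlaneTopology from the cluster Infrastructure CR instead of counting nodes. This is not part of the PR; it assumes a controller-runtime client with the openshift/api config/v1 types registered, and the helper name is hypothetical.

package utils

import (
    "context"

    configv1 "github.com/openshift/api/config/v1"
    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// isSingleReplicaControlPlane is a hypothetical helper: it reports whether the
// OpenShift control-plane topology is SingleReplica (available on OCP 4.8+).
func isSingleReplicaControlPlane(ctx context.Context, c client.Client) (bool, error) {
    infra := &configv1.Infrastructure{}
    if err := c.Get(ctx, types.NamespacedName{Name: "cluster"}, infra); err != nil {
        return false, err
    }
    return infra.Status.ControlPlaneTopology == configv1.SingleReplicaTopologyMode, nil
}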

@@ -78,6 +78,14 @@ func (r *SriovOperatorConfigReconciler) Reconcile(ctx context.Context, req ctrl.
Name: constants.DEFAULT_CONFIG_NAME, Namespace: namespace}, defaultConfig)
if err != nil {
if errors.IsNotFound(err) {
// Check if we only have one node
Collaborator

This error won't be triggered unless the user explicitly deletes the default SriovOperatorConfig. Is that what we want? The default SriovOperatorConfig is created by the OperatorConfig controller: https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/main.go#L154

Do we need to consider the upgrade scenario, e.g. from a lower version to 4.10, which contains this fix? If not, I think adding the disableDrain setting at line 154 above should be sufficient.

Collaborator Author

@zshi-redhat thanks! I missed that file!

I would also like this to work for upgrades, so I put the change in both places.

The reason I put it in both places is that I was able to hit a race between installing the operator + a policy and the reconcile that adds the skip.

@SchSeba SchSeba force-pushed the support_pause_on_SNO branch 2 times, most recently from fb7f39c to cd10a0a on December 8, 2021 10:54

// Update the disableDrain field if needed
if len(nodeList.Items) == 1 && !defaultConfig.Spec.DisableDrain {
defaultConfig.Spec.DisableDrain = true
Collaborator

Please add logging here. We need to let users know that the configuration has changed.

Collaborator

This is a one-way switch. What if the cluster scales down to one node and then scales back up to more than one node?

Collaborator Author

@adrianchiris do you think we should support this case?

This function only runs if the config doesn't exist, which means at install time or if the user removes the config for some reason.

If the user changes the cluster topology, they can

  1. update the sriov operator config
  2. remove the config and let the operator understand the cluster topology

I would not like the SriovOperatorConfig reconcile to check the cluster status.
I would also not like to add a trigger for this object on node object changes; it would just spam the operator on a large cluster.

Collaborator

@adrianchiris adrianchiris Dec 12, 2021

This function only runs if the config doesn't exist, which means at install time or if the user removes the config for some reason.

This is not under the if block at L80, so it will be executed every time Reconcile is called.
Maybe the user put some configuration in place, then the cluster scaled down (to 1) and the operator restarted; this logic would then change the user's configuration. Later on the cluster scales up and we end up with drain disabled.

So here are my thoughts:

  1. we document that, in the case of a single-node cluster, the user should explicitly set DisableDrain
  2. we do this logic only when (re)creating the default object (within the if statement block at L80, or in main.go)

I'm fine with either 1 or 2; 2 probably gives a better user experience.

Collaborator Author

I also prefer 2. Changing the code as requested.
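
As an illustration of option 2, a rough sketch of how the decision could sit inside the existing IsNotFound branch of Reconcile, using a single-node helper like the utils.IsSingleNodeCluster one discussed further down; the names and exact flow here are assumptions, not the merged code.

if errors.IsNotFound(err) {
    // Decide DisableDrain only while (re)creating the default config, so a
    // later scale-up does not silently keep a user-visible setting flipped.
    singleNode, err := utils.IsSingleNodeCluster(ctx, r.Client)
    if err != nil {
        return reconcile.Result{}, err
    }
    defaultConfig.SetNamespace(namespace)
    defaultConfig.SetName(constants.DEFAULT_CONFIG_NAME)
    defaultConfig.Spec.DisableDrain = singleNode
    if err := r.Create(ctx, defaultConfig); err != nil {
        return reconcile.Result{}, err
    }
    return reconcile.Result{}, nil
}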

main.go Outdated
@@ -232,13 +233,23 @@ func createDefaultOperatorConfig(cfg *rest.Config) error {
if err != nil {
return fmt.Errorf("Couldn't create client: %v", err)
}

// Check if we only have one node
Collaborator

Actually, this comment doesn't align with the code below.

Collaborator

Nit: I think L237 reads clearly enough on its own; you can drop this comment if you like.

Collaborator Author

done

Collaborator

done

Done and not removed? :)

Collaborator Author

sorry about that

@@ -99,6 +100,25 @@ func (r *SriovOperatorConfigReconciler) Reconcile(ctx context.Context, req ctrl.
return reconcile.Result{}, err
}

// Check if we only have one node
Collaborator

The same comment as for the main.go change. Do we want to move this check to the pkg/utils module as a 'GetNodesCount' function?

Collaborator Author

Done, can you please have another look?
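
For context, a minimal self-contained sketch of what such a pkg/utils helper could look like; the helper actually added in the PR (IsSingleNodeCluster, referenced later in this review) may differ in signature and logging.

package utils

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// IsSingleNodeCluster returns true when the cluster contains exactly one node.
func IsSingleNodeCluster(ctx context.Context, c client.Client) (bool, error) {
    nodeList := &corev1.NodeList{}
    if err := c.List(ctx, nodeList); err != nil {
        return false, err
    }
    return len(nodeList.Items) == 1, nil
}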

Collaborator

Nit: I think L104 reads clearly enough on its own; you can drop this comment if you like.

Collaborator Author

done

pkg/daemon/daemon.go (outdated, resolved)
Collaborator Author

SchSeba commented Dec 12, 2021

Hi @e0ne @pliurh @adrianchiris, can you please give this another look?

done := make(chan bool)
go dn.getDrainLock(ctx, done)
<-done
if utils.ClusterType != utils.ClusterTypeOpenshift {
Collaborator

Should we move (and invert) this check to daemon.go L510, and only call pauseMCP in an OpenShift cluster?

Collaborator Author

Sure, no problem.
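
A rough sketch of the suggested inversion, so the MCP pause path only runs on OpenShift clusters; pauseMCP's exact signature here is an assumption, not the merged code.

// Sketch only: invert the cluster-type check so pausing the MCP is
// OpenShift-specific, and drain handling stays common to both cluster types.
if utils.ClusterType == utils.ClusterTypeOpenshift {
    glog.Infof("nodeStateSyncHandler(): pause MCP")
    if err := dn.pauseMCP(); err != nil {
        return err
    }
}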

}

if len(nodeList.Items) == 1 {
glog.Infof("only one node found in the cluster marking disableDrain as true")
Collaborator

@adrianchiris adrianchiris Dec 12, 2021

Nit: I don't think this log is needed.

If you want to keep it, please update the message, as disableDrain is now out of context.
(IsSingleNodeCluster doesn't really care what operations are performed as a result of its invocation.)

Collaborator Author

I would like to keep the message; it makes debugging easier :)

I just changed the comment.

@adrianchiris
Collaborator

@SchSeba might be a dumb question, but why do we want to disable drain on a single-node cluster? :)
Critical services are (usually) daemonsets, and these won't go away, right?

Collaborator Author

SchSeba commented Dec 13, 2021

Hi @adrianchiris, there are cases where the user has a one-node cluster and deploys applications that use a PodDisruptionBudget. That is the case in OpenShift, for example: some platform pods use a PDB, so the drain will never finish and the operator will just get stuck trying to drain the node.
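
To make the failure mode concrete, here is an illustrative PodDisruptionBudget built with the upstream API types (all names here are hypothetical): with minAvailable set to 1 and a single replica on the only node, eviction can never satisfy the budget, so the drain never completes.

package sketch

import (
    policyv1 "k8s.io/api/policy/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/intstr"
)

// examplePDB returns a PDB like the platform ones described above; on a
// single-node cluster the guarded pod can never be evicted, so node drain blocks.
func examplePDB() *policyv1.PodDisruptionBudget {
    minAvailable := intstr.FromInt(1)
    return &policyv1.PodDisruptionBudget{
        ObjectMeta: metav1.ObjectMeta{Name: "platform-pdb", Namespace: "example-ns"},
        Spec: policyv1.PodDisruptionBudgetSpec{
            MinAvailable: &minAvailable,
            Selector:     &metav1.LabelSelector{MatchLabels: map[string]string{"app": "platform"}},
        },
    }
}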

@github-actions

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

Collaborator

pliurh commented Dec 14, 2021

/lgtm

@github-actions github-actions bot added the lgtm label Dec 14, 2021

glog.Infof("nodeStateSyncHandler(): get drain lock for sriov daemon")
done := make(chan bool)
go dn.getDrainLock(ctx, done)
Collaborator

@adrianchiris adrianchiris Dec 14, 2021

With the Kubernetes cluster type we will now always try to get the drain lock, even if disableDrain is enabled; previously it skipped getting the lock.

Not sure it's a major issue, as I think disableDrain is used mainly for testing.
@zshi-redhat @pliurh WDYT?

Collaborator

I think it is a minor issue. If the cluster is a single-node cluster, there is no other node that needs to wait for the lock. For multi-node clusters, this flag is mainly for testing. The drawback is that, as the LeaseDuration is 5s, we will waste 5s on each node whenever it requires draining.

Collaborator

Ack, thanks @pliurh! I'd prefer to condition on (disableDrain or OpenShift) to avoid the API calls for the lease and the 5s delay per node.

Collaborator Author

@adrianchiris done, please have a look. I also tested it internally.
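
For reference, a sketch of the condition discussed above wrapped around the existing lock acquisition; the dn.disableDrain field name is assumed here and the merged code may differ slightly.

// Only take the drain lease when it is actually needed: either drain is
// enabled, or this is OpenShift and the MCP pause still has to be coordinated.
if !dn.disableDrain || utils.ClusterType == utils.ClusterTypeOpenshift {
    glog.Infof("nodeStateSyncHandler(): get drain lock for sriov daemon")
    done := make(chan bool)
    go dn.getDrainLock(ctx, done)
    <-done
}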

Signed-off-by: Sebastian Sch <sebassch@gmail.com>

@adrianchiris
Collaborator

/test-all

@adrianchiris
Collaborator

Failure related to #218

@zshi-redhat
Collaborator

/test-all

@zshi-redhat
Collaborator

Fix proposed in #219

@adrianchiris
Collaborator

/test-all

Collaborator Author

SchSeba commented Dec 22, 2021

Hi @adrianchiris, is there anything we are missing here, or can we merge this PR?

Collaborator

@adrianchiris adrianchiris left a comment

Thanks for the reminder @SchSeba! LGTM

@adrianchiris adrianchiris merged commit de3966d into k8snetworkplumbingwg:master Dec 22, 2021