-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel drain #555
Parallel drain #555
Conversation
Thanks for your PR,
To skip the vendors CIs use one of:
|
please first review #553 |
7a7cf01
to
db8e79a
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
Pull Request Test Coverage Report for Build 8249687295Details
💛 - Coveralls |
db8e79a
to
72587d9
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
1 similar comment
Thanks for your PR,
To skip the vendors CIs use one of:
|
c3dc047
to
054ead2
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
054ead2
to
a9fc289
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
a9fc289
to
5d1a68f
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
5d1a68f
to
1f01bce
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
1 similar comment
Thanks for your PR,
To skip the vendors CIs use one of:
|
c4bb38f
to
726a60f
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
726a60f
to
6d02042
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
6d02042
to
7c8cfbb
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
7c8cfbb
to
4527d48
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Thanks for your PR,
To skip the vendors CIs use one of:
|
I think I answered all the comments. please take another look when you have time |
controllers/drain_controller.go
Outdated
// configure logs | ||
reqLogger := log.FromContext(ctx) | ||
reqLogger.Info("Reconciling Drain") | ||
reqLogger.V(2).Info("node locked for drain controller", "nodeName", req.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can drop this log message.
controllers/drain_controller.go
Outdated
} | ||
|
||
func (dr *DrainReconcile) getObject(ctx context.Context, req ctrl.Request, object client.Object) error { | ||
// configure logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to remove this comment :)
controllers/drain_controller.go
Outdated
func (dr *DrainReconcile) getObject(ctx context.Context, req ctrl.Request, object client.Object) error { | ||
// configure logs | ||
reqLogger := log.FromContext(ctx) | ||
reqLogger.Info("checkForNodeDrain():") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong function name in the log message
- apiGroups: [""] | ||
resources: ["pods/eviction"] | ||
verbs: ["create"] | ||
- apiGroups: ["machineconfiguration.openshift.io"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to remove "machineconfiguration.openshift.io" from the role?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for nodes yes please don't need it anymore as it's part of the operator now
pkg/utils/cluster.go
Outdated
@@ -110,3 +111,53 @@ func openshiftControlPlaneTopologyStatus(c client.Client) (configv1.TopologyMode | |||
} | |||
return infra.Status.ControlPlaneTopology, nil | |||
} | |||
|
|||
func ObjectHasAnnotationKey(obj metav1.Object, annoKey string) bool { | |||
// Check if node already contains annotation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this comment to the docstring?
nit: the function can be simplified
func ObjectHasAnnotationKey(obj metav1.Object, annoKey string) bool {
_, hasKey := obj.GetAnnotations()[annoKey]
return hasKey
}
pkg/utils/cluster.go
Outdated
} | ||
|
||
func ObjectHasAnnotation(obj metav1.Object, annoKey string, value string) bool { | ||
// Check if node already contains annotation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to the docstring?
pkg/daemon/daemon.go
Outdated
} | ||
|
||
if dn.nodeState.GetGeneration() == latest { | ||
if dn.currentNodeState.GetGeneration() == latest && !utils.ObjectHasAnnotation(dn.desiredNodeState, consts.NodeStateDrainAnnotationCurrent, consts.DrainComplete) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move !utils.ObjectHasAnnotation(dn.desiredNodeState, consts.NodeStateDrainAnnotationCurrent, consts.DrainComplete)
part to the helper function with a meaningful name, e.g. isDrainInProgress()
pkg/daemon/daemon.go
Outdated
if !dn.disableDrain { | ||
if err := drain.RunCordonOrUncordon(dn.drainer, dn.node, false); err != nil { | ||
if reqReboot { | ||
log.Log.Info("handleDrain(): apply 'Reboot_Required' label for node") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
annotation
pkg/daemon/daemon.go
Outdated
pausePatch := []byte("{\"spec\":{\"paused\":false}}") | ||
if _, err := dn.platformHelpers.GetMcClient().MachineconfigurationV1().MachineConfigPools().Patch(context.Background(), dn.mcpName, types.MergePatchType, pausePatch, metav1.PatchOptions{}); err != nil { | ||
log.Log.Error(err, "completeDrain(): failed to resume MCP", "mcp-name", dn.mcpName) | ||
log.Log.Info("handleDrain(): apply 'Reboot_Required' label for nodeState") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
annotation
log.Log.Info("handleDrain(): apply 'Reboot_Required' label for node") | ||
err := utils.AnnotateNode(vars.NodeName, consts.NodeDrainAnnotation, consts.RebootRequired, dn.client) | ||
if err != nil { | ||
log.Log.Error(err, "applyDrainRequired(): Failed to annotate node") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong func name in the log message
pkg/daemon/daemon.go
Outdated
} | ||
|
||
if err := dn.annotateNode(vars.NodeName, annoIdle); err != nil { | ||
log.Log.Error(err, "completeDrain(): failed to annotate node") | ||
log.Log.Info("handleDrain(): apply 'Drain_Required' label for nodeState") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
annotation
controllers/drain_controller.go
Outdated
return value, nil | ||
} | ||
|
||
func (dr *DrainReconcile) checkForNodeDrain(ctx context.Context, node *corev1.Node) (*reconcile.Result, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name is a bit confusing for me. Rename to tryDrainNode
?
controllers/drain_controller.go
Outdated
|
||
return selectedNpcl[0], nodeList.Items, nil | ||
} else { | ||
logger.V(2).Info("using \"default\" sriovNetworkPool with MaxUnavailable of one", "pool", *defaultNpcl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to mention default pool in the log message? Maybe we should change the message to say something like: node doesn't belong to any pool, using default drain configuration
Thanks for your PR,
To skip the vendors CIs use one of:
|
Done @ykulazhenkov with the comments thanks! be aware that in the daemon I use the background context as we don't have context there. |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks @SchSeba for addressing my comments. |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Hi @ykulazhenkov done thanks! |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Hi @zeeke @e0ne @ykulazhenkov when you can please take another look on the PR I will like to try and get this one merge soon. Thanks! |
Thanks, @SchSeba. |
Signed-off-by: Sebastian Sch <sebassch@gmail.com>
Thanks for your PR,
To skip the vendors CIs use one of:
|
Hi @ykulazhenkov I did the requested change for the empty node selector let me know if that works for you :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @SchSeba,
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for all the help reviewing this complicated feature! merging after 3 approvals |
The PR moves the draining from the daemon to the main controller it's based on the work from @e0ne and #552.
Also, the PR introduced a new feature to allow multiple nodes to be drained in parallel
base on #479