Skip to content

Commit

Permalink
OCPBUGS-28647: tuned: operand: add support for deferred updates (#1019)
Browse files Browse the repository at this point in the history
* util: add helpers to deal with the deferred annotation

Add helpers to be used in future commits.
Enable early introduction of e2e tests

Signed-off-by: Francesco Romani <fromani@redhat.com>

* status: report deferred status

add support to report the deferred status in
conditions.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: controller: profiles extract/repack

rework profilesExtract and helper functions
to make more testable, enable round-tripping,
and add helpers which will be used in the deferred
update functionality.

Read extracted profiles on startup and reconstruct
the node state, to be used to implement the
deferred updates feature.
Please note all the operations are non-destructive
read-only so pretty safe.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: add changeSyncerPostNodeRestart

The deferred updates feature need to do specific
actions when the node on which the tuned daemon runs is restarted (and only then).
Add hook point with specific code for that.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: refactor status update

Rework how tuned controller updates status to make
room for the deferred update feature.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: set the deferred flag

Now that all the pieces are in place, we can set
and use the deferred update flag to implement
the desired behavior.

Note we now need to trigger an event when the recommend file
is written, and always do that even if does not change.
This is needed to react properly when a deferred
update is un-deferred by editing the tuned object.

Without a trigger in this state the k8s object
status wouldn't be correctly updated.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: fix propagation of the deferred flag

When we compute `TunedProfile`s, we should propagate
the deferred flag only if one of these object is
extracted from a `Tuned` object which had the deferred annotation.

The best option would be to set the deferred flag if any of the `Tuned`
objects relevant for a node has the deferred flag set.
However, this approximation makes the flow actually more predictable,
so we adopt it.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: operand: review log verbosiness

tune down the log chattiness to reduce the log pressure.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: operand: log restarts

Log reasons why a tuned reload or restart
is requested. We do this at V>=4, so there's
little risk of adding extra noise in the logs.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* tuned: operand: unified restart and reload handling

Previously, we added special logic to handle TuneD reload
requests when deferred updates are received.
Extend the logic to TuneD restarts for the sake of clarity
and because difform treatment was causing bugs:
profile not correctly reverted after removal.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: makefile: add targets to build suites

building suites ahead of time enables trivial catching of
compilation error and turns out handy to run subset of tests,
in addition to the usual ginkgo filtering capabilities.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: deferred: add test cases

add test coverage for the `deferred updates` feature

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: deferred: assertions to avoid unwanted changes

Add assertions to ensure deferred updates don't
affect unrelated nodes.
In the e2e tests we target one of the worker nodes,
so we check control-plane nodes are not affected
by the testing changes.

Checking different MCPs would be equivalent or better,
but this is the simplest way to check because
the worker/control-plane split is always available.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: deferred: check revert restores node condition

Extract helper code to enable verification of node
state, because we are starting to see the same snippet
(which can't and shouldn't fail) in few places.

Add check to make sure we revert correctly to node state
on profile removal.

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: deferred: check restore after restart

Add test to verify that node state is correctly restored
after a deferred update is applied (requires a restart)
and then removed

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: push down utilities

enable to add another basic rollback test

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: basic: rollback test

add basic rollback test using the same manifests
we do use in deferred tests. This extra test
partially overlap with existing one, and will
help pinpoint deferred update test failures

Signed-off-by: Francesco Romani <fromani@redhat.com>

* makefile: ensure bindata

0_config requires now `pkg/manifests`, so we need
to make sure bindata is generated

Signed-off-by: Francesco Romani <fromani@redhat.com>

* e2e: util: rename ready->wait

rename the "ready" util package to "ready" and change the function
signatures for better readability.

Signed-off-by: Francesco Romani <fromani@redhat.com>

---------

Signed-off-by: Francesco Romani <fromani@redhat.com>
  • Loading branch information
ffromani authored Jul 22, 2024
1 parent 18febd2 commit f00595e
Show file tree
Hide file tree
Showing 27 changed files with 2,911 additions and 170 deletions.
11 changes: 7 additions & 4 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,8 @@ pkg/generated: $(API_TYPES)
$(GOBINDATA_BIN):
$(GO) build -o $(GOBINDATA_BIN) ./vendor/github.com/kevinburke/go-bindata/go-bindata

test-e2e:
for d in core basic reboots reboots/sno; do \
test-e2e: $(BINDATA)
for d in core basic reboots reboots/sno deferred; do \
KUBERNETES_CONFIG="$(KUBECONFIG)" $(GO) test -v -timeout 40m ./test/e2e/$$d -ginkgo.v -ginkgo.no-color -ginkgo.fail-fast || exit; \
done

Expand Down Expand Up @@ -293,13 +293,16 @@ gather-sysinfo-tests: build-gather-sysinfo
render-sync: build
hack/render-sync.sh

build-e2e-%:
@hack/build-test-bin.sh $(shell echo $@ | sed -e 's/^build-e2e-//' )

pao-build-e2e-%:
@hack/build-test-bin.sh $(shell echo $@ | sed -e 's/^pao-build-e2e-//' )
@hack/build-pao-test-bin.sh $(shell echo $@ | sed -e 's/^pao-build-e2e-//' )

.PHONY: pao-build-e2e
pao-build-e2e:
@for suite in $(PAO_E2E_SUITES); do \
hack/build-test-bin.sh $$suite; \
hack/build-pao-test-bin.sh $$suite; \
done

.PHONY: pao-clean-e2e
Expand Down
30 changes: 30 additions & 0 deletions hack/build-pao-test-bin.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
#!/bin/bash

set -e

PREFIX="pao-build-e2e-"
SUITEPATH="./test/e2e/performanceprofile/functests"
TARGET=$1

if [ -z "$TARGET" ]; then
echo "usage: $0 suite"
echo "example: $0 1_performance"
exit 1
fi

OUTDIR="${2:-_output}"

if [ ! -d "$SUITEPATH/$TARGET" ]; then
echo "unknown suite: $TARGET"
echo -e "must be one of:\n$( ls $SUITEPATH | grep -E '[0-9]+_.*' )"
exit 2
fi

SUITE="${SUITEPATH}/${TARGET}"
SUFFIX=$( echo $TARGET | cut -d_ -f2- )
BASENAME="e2e-pao"
EXTENSION="test"
OUTPUT="${BASENAME}-${SUFFIX}.${EXTENSION}"

echo "${SUITE} -> ${OUTDIR}/${OUTPUT}"
go test -c -v -o ${OUTDIR}/${OUTPUT} ${SUITE}
8 changes: 4 additions & 4 deletions hack/build-test-bin.sh
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,13 @@

set -e

PREFIX="pao-build-e2e-"
SUITEPATH="./test/e2e/performanceprofile/functests"
PREFIX="build-e2e-"
SUITEPATH="./test/e2e"
TARGET=$1

if [ -z "$TARGET" ]; then
echo "usage: $0 suite"
echo "example: $0 1_performance"
echo "example: $0 deferred"
exit 1
fi

Expand All @@ -22,7 +22,7 @@ fi

SUITE="${SUITEPATH}/${TARGET}"
SUFFIX=$( echo $TARGET | cut -d_ -f2- )
BASENAME="e2e-pao"
BASENAME="e2e"
EXTENSION="test"
OUTPUT="${BASENAME}-${SUFFIX}.${EXTENSION}"

Expand Down
3 changes: 3 additions & 0 deletions manifests/20-profile.crd.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,9 @@ spec:
- jsonPath: .status.conditions[?(@.type=="Degraded")].status
name: Degraded
type: string
- jsonPath: .status.conditions[?(@.type=="Applied")].message
name: Message
type: string
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
Expand Down
4 changes: 4 additions & 0 deletions pkg/apis/tuned/v1/tuned_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,10 @@ const (
// TunedBootcmdlineAnnotationKey is a Node-specific annotation denoting kernel command-line parameters
// calculated by TuneD for the current profile applied to that Node.
TunedBootcmdlineAnnotationKey string = "tuned.openshift.io/bootcmdline"

// TunedDeferredUpdate request the tuned daemons to defer the update of the rendered profile
// until the next restart.
TunedDeferredUpdate string = "tuned.openshift.io/deferred"
)

/////////////////////////////////////////////////////////////////////////////////
Expand Down
7 changes: 6 additions & 1 deletion pkg/operator/controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -645,6 +645,7 @@ func (c *Controller) syncProfile(tuned *tunedv1.Tuned, nodeName string) error {
}

klog.V(2).Infof("syncProfile(): Profile %s not found, creating one [%s]", profileMf.Name, computed.TunedProfileName)
profileMf.Annotations = util.ToggleDeferredUpdateAnnotation(profileMf.Annotations, computed.Deferred)
profileMf.Spec.Config.TunedProfile = computed.TunedProfileName
profileMf.Spec.Config.Debug = computed.Operand.Debug
profileMf.Spec.Config.TuneDConfig = computed.Operand.TuneDConfig
Expand Down Expand Up @@ -705,16 +706,20 @@ func (c *Controller) syncProfile(tuned *tunedv1.Tuned, nodeName string) error {
}
}

anns := util.ToggleDeferredUpdateAnnotation(profile.Annotations, computed.Deferred)

// Minimize updates
if profile.Spec.Config.TunedProfile == computed.TunedProfileName &&
profile.Spec.Config.Debug == computed.Operand.Debug &&
reflect.DeepEqual(profile.Spec.Config.TuneDConfig, computed.Operand.TuneDConfig) &&
reflect.DeepEqual(profile.Spec.Profile, computed.AllProfiles) &&
util.HasDeferredUpdateAnnotation(profile.Annotations) == util.HasDeferredUpdateAnnotation(anns) &&
profile.Spec.Config.ProviderName == providerName {
klog.V(2).Infof("syncProfile(): no need to update Profile %s", nodeName)
return nil
}
profile = profile.DeepCopy() // never update the objects from cache
profile.Annotations = anns
profile.Spec.Config.TunedProfile = computed.TunedProfileName
profile.Spec.Config.Debug = computed.Operand.Debug
profile.Spec.Config.TuneDConfig = computed.Operand.TuneDConfig
Expand All @@ -727,7 +732,7 @@ func (c *Controller) syncProfile(tuned *tunedv1.Tuned, nodeName string) error {
if err != nil {
return fmt.Errorf("failed to update Profile %s: %v", profile.Name, err)
}
klog.Infof("updated profile %s [%s]", profile.Name, computed.TunedProfileName)
klog.Infof("updated profile %s [%s] (deferred=%v)", profile.Name, computed.TunedProfileName, util.HasDeferredUpdateAnnotation(profile.Annotations))

return nil
}
Expand Down
Loading

0 comments on commit f00595e

Please sign in to comment.