-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Granular management policies #456
Granular management policies #456
Conversation
d196d36
to
ab78f63
Compare
a834fe4
to
b7445e9
Compare
Discussed with @turkenh to add an allowlist of supported combinations of ManagementPolicies and check that in the reconciler. What we though about is initially supporting: Default:
All - alternative to default
Observe Only
Pause
No late init
No late init, orphan
Orphan
These are some combinations that we thought make sense now. Possibly there are other combinations that we could add to the initial list. One issue with this is that if we want to add a combination, we need to do a release of crossplane-runtime, and then the providers need to be updated and released. But still, it gives a clear picture of what kind of managed policy behaviors we are supporting, and can say that we tested and know they work. |
I have tested migrating from the previous For some reason (also based on some early testing) I expected that after applying the new CRD with the updated OpenAPI spec, that the old policy will get replaced by the default new managementPolicy ( What happens is that the old If I dont find an easy and elegant solution for that I am considering simply renaming the new cc @turkenh |
Nice! I am really looking forward to this! 👍 A couple of late comments, I missed the proposal. If limiting policy combinations, consider supporting immutable resources
|
Thanks for the feedback @chlunde , its really appreciated! 👍
Thanks for the suggestion. The idea is to support valid use cases, and to have confidence that it works properly. One combination I am not sure if its neccessary:
as I dont see a reason why not to add the Observe. Having combinations with LateInitialize and Delete make sense. We dont have to add all valid combinations right away, until we see a use case for it. As this is an alpha feature, IMO its better to start small, and iterate through discovering use cases we want to solve. There could be valid combinations as well which are omitting "Create" for importing cases, but while its not needed, I am hesitant to add it right away and increase the load of what we are supporting. Of course there are also combinations which dont make sense like ["LateInitialize"].
TBH, I also see that going forward it may be that Late initialization gets deprecated. When I started to work on this issue, I also thought about deprecating/removing it alltogather, but as its a feature that has been around for a while, people are already using late initialized fields in patches and such. As we now have
Ehh, naming is hard. For me, Do you have some suggestion for the naming?
I see your point, it would be useful to have a map transform with a list for this case, not only for the management policies, but in general for arrays. I haven't tested this in a composition yet btw. |
Changed the field name from Old resource:
apply new CRD:
If we keep the name the same, what would happen is that the value will stay the same, even though the CRD spec changed: Applying a resource with the old management policies CRD:
applying the new CRD withouth changing the mane:
It stays the same even though the CRD spec changed. I tried triggering an update or something to see if it gets changed, but the edit doesnt even pass because of the validation failing (it now expects an array instead of a string). Of course new resources are created correctly. Decided that its better to go with the name change as it achieves the effect we are looking for. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lsviben Apologies, super quick/shallow review because I didn't get around to this until the end of my day and I have to run. I have another reminder to look again tomorrow.
pkg/reconciler/managed/reconciler.go
Outdated
return false | ||
} | ||
return ContainsOnlyManagementAction(policy, xpv1.ManagementActionObserve) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these functions take a policy and whether management policies are enabled as arguments. Would it make sense to introduce a ManagementPolicyChecker
type? Might result in more succinct calls, and less refactoring when managementPolicyEnabled
goes away (i.e. when its GA).
I'm thinking something like:
type ManagementPolicyChecker interface {
ShouldUpdate() bool
ShouldLateInit() bool
IsObserveOnly() bool // Could this be phrased in some kind of ShouldFoo way?
}
You could then have one implementation that had both enabled and policy, e.g.:
// This is a very dumb name.
type PolicyBasedManagementPolicyChecker struct {
policy xpv1.ManagementPolicy
enabled bool
}
Or one that embeds just xpv1.ManagementPolicy
and a different implementation that is used whenever management policies aren't actually enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this.
The interface could be extended with IsNonDefault
and IsSupported
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lsviben Do you think you'll add IsNonDefault
and IsSupported
to this as well? I like @turkenh's suggestion - it would be nice for this type to be a one-stop-shop for all things management-policy-checking.
If we wanted to get reaaaally fancy it would be pretty clean to have a type that essentially loaded up all the context we need to make branching decisions (e.g management policies, deletion policy, paused annotation, etc etc). Something like:
policy, err := NewManagementPolicyResolver(managed, managementPoliciesEnabled)
if err != nil {
// NewManagementPolicyResolver returns an error if you're doing anything "wrong",
//i.e. if you're using an unsupported management policy, or if you're trying to use
// management policies but the feature flag isn't enabled.
}
type ManagementPolicyResolver interface {
ShouldPause() bool
ShouldOnlyObserve() bool
ShouldCreate() bool
ShouldDelete() bool // Equivalent to !shouldOrphan
ShouldLateInit() bool
ShouldUpdate() bool
}
All the ShouldFoo
functions don't feel quite as elegant as the nice Action
and OnlyAction
methods you have that take a policy as arguments, but I think I still prefer them because they let us push all of the context needed to make "should I do this thing" checks (not only management policies) into the policy resolver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, i can add something like that, also like the Resolver
name. Also like how it hides the checks for supported and the sanity check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added something similar:
// ManagementPoliciesChecker is used to perform checks on management policies
// to determine specific actions are allowed, or if they are the only allowed
// action.
type ManagementPoliciesChecker interface {
// IsSupported returns true if the management policy is supported.
IsSupported() bool
// IsDisabledNonDefault returns true if the management policy is set to a non-default
// value, and the management policy feature is disabled. Used as a sanity
// check.
IsDisabledNonDefault() bool
// IsPaused returns true if the resource is paused based
// on the management policy.
IsPaused() bool
// ShouldOnlyObserve returns true if only the Observe action is allowed.
ShouldOnlyObserve() bool
// ShouldCreate returns true if the Create action is allowed.
ShouldCreate() bool
// ShouldLateInitialize returns true if the LateInitialize action is
// allowed.
ShouldLateInitialize() bool
// ShouldUpdate returns true if the Update action is allowed.
ShouldUpdate() bool
// ShouldDelete returns true if the Delete action is allowed.
ShouldDelete() bool
}
Added the IsPaused to it, but in the end this IsPaused just checks if its paused based on the management policies, and I left the meta.IsPaused to check paused condition based on annotations. Felt wrong to mix annotations in the ManagementPolicies checking tool, so i left them separate.
Checking IsDisabledNonDefault and IsSupported through calls instead of an error in NewManagement...
due to:
- if we want to have IsPaused there, we need to initialize it before it, so to avoid changing the order of checks from
IsPaused->IsDisabledNonDefault->IsSupported
toIsDisabledNonDefault->IsSupported->IsPaused
- simpler for testing
Just noticed that I didnt change naming, ill check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Felt wrong to mix annotations in the ManagementPolicies checking tool, so i left them separate.
FWIW despite using the name "management policy checker" in my examples I was thinking more of an "all the policies" checker as a single place to calculate the policy based on all the policy inputs, which right now I think are:
- Paused annotation
- Deletion policy
- Management policies
Though to be fair the first two will go away eventually at which point this would purely be checking management policies. I see your point WRT ordering of the paused/disabled/supported calls though. Don't feel super strongly on this one.
Another thought: Could we merge IsDisabledNonDefault
and IsSupported
into a Validate() error
method that returns an error indicating what's wrong with the policy config (i.e. its set despite the flag being enabled, or its set to an invalid value)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged the IsDisabledNonDefault
and IsSupported
to Validate() error
, nice suggestion 👍
For adding the pause annotation, I left it as is for now. If you/we feel that we should merge these, we can do it in some next PR, I dont have a strong feeling about it, for me what it is now is ok.
pkg/reconciler/managed/reconciler.go
Outdated
// not updated when the managed resource is updated, the external resource | ||
// is not deleted when the managed resource is deleted and the | ||
// spec.forProvider is not late initialized. | ||
sets.New[xpv1.ManagementAction](xpv1.ManagementActionObserve, xpv1.ManagementActionCreate), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add Observe + Update ? crossplane-contrib/provider-kubernetes#115
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for updating these comments
I'd prefer to avoid this giant set (no pun intended) of package scoped state. Perhaps we could make it a member of ManagementPolicyActionChecker
and move its declaration into the NewManagementPolicyActionChecker
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved the giant set into an func returning default supported policies and made it injectable for the ManagedPoliciesResolver
, so that its not package-scoped anymore and we can still easily change it for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we don't want to add Observe
+ Update
for now but enable a way for individual providers to extend the supported list? When I check the code, this doesn't seem to be the case though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I want to add Observe + Update
but didnt yet have a chance to add and test it first. Maybe it would be better to add them in a next PR, so that this can be merged?
I havent thought about allowing providers to set their own supported policies, now they are injected just in the resovler which actually just helps with tests and not having the supported list package scoped. But I like the idea, we can inject the list into the Reconciler, so that the providers can change it easier if needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@turkenh now the supported policies can be injected into the Reconciler, so if some provider devs would like to test something new or have their own set, its possible 🎉
pkg/reconciler/managed/reconciler.go
Outdated
return false | ||
} | ||
return ContainsOnlyManagementAction(policy, xpv1.ManagementActionObserve) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this.
The interface could be extended with IsNonDefault
and IsSupported
as well.
pkg/reconciler/managed/reconciler.go
Outdated
// not updated when the managed resource is updated, the external resource | ||
// is not deleted when the managed resource is deleted and the | ||
// spec.forProvider is not late initialized. | ||
sets.New[xpv1.ManagementAction](xpv1.ManagementActionObserve, xpv1.ManagementActionCreate), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for updating these comments
I'd prefer to avoid this giant set (no pun intended) of package scoped state. Perhaps we could make it a member of ManagementPolicyActionChecker
and move its declaration into the NewManagementPolicyActionChecker
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but I'd like @turkenh to approve too before we merge.
I left a few comments for possible improvements, but I think we're into optimizations at this point so feel free to take them or leave them.
pkg/reconciler/managed/reconciler.go
Outdated
return false | ||
} | ||
return ContainsOnlyManagementAction(policy, xpv1.ManagementActionObserve) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Felt wrong to mix annotations in the ManagementPolicies checking tool, so i left them separate.
FWIW despite using the name "management policy checker" in my examples I was thinking more of an "all the policies" checker as a single place to calculate the policy based on all the policy inputs, which right now I think are:
- Paused annotation
- Deletion policy
- Management policies
Though to be fair the first two will go away eventually at which point this would purely be checking management policies. I see your point WRT ordering of the paused/disabled/supported calls though. Don't feel super strongly on this one.
Another thought: Could we merge IsDisabledNonDefault
and IsSupported
into a Validate() error
method that returns an error indicating what's wrong with the policy config (i.e. its set despite the flag being enabled, or its set to an invalid value)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, great work @lsviben 💪
pkg/reconciler/managed/reconciler.go
Outdated
// not updated when the managed resource is updated, the external resource | ||
// is not deleted when the managed resource is deleted and the | ||
// spec.forProvider is not late initialized. | ||
sets.New[xpv1.ManagementAction](xpv1.ManagementActionObserve, xpv1.ManagementActionCreate), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we don't want to add Observe
+ Update
for now but enable a way for individual providers to extend the supported list? When I check the code, this doesn't seem to be the case though.
Signed-off-by: lsviben <sviben.lovro@gmail.com>
def385a
to
73a675c
Compare
Hey @lsviben, thank you for your work.
Changing that to:
Throws another error:
Finally, If I remove all entries for
Am I doing something wrong or the initProvider is not passing the parameters to the MR? |
Note that the parameter in the Autoscaling group is called desiredCapacity the |
Hey, @rbrunan, thanks for the feedback. This PR does not introduce the I am currently working on introducing As for the example in the design doc, sorry for that, I handwrote them and probably missed the map/array thing. So for now, only examples which do not use In practice that means that if we want to ignore fields which are |
Thank you, now everything makes sense. |
Description of your changes
Implements new granular management policies based on the accepted design doc.
In essence, it replaces the old
spec.managementPolicy
that was introduced with ObserveOnly with an new array of granular management policiesspec.managementPolicies
In addition as there are more options now, the various reconciler steps are now conditioned based on weather the management policies feature is in use and the combination of actions.
NOTE: As it is an alpha feature, we are just replacing the old
managementPolicy
field with the new one.Fixes #453
I have:
make reviewable test
to ensure this PR is ready for review.How has this code been tested
Deployed with updated upjet (PR TBD) and provider-aws in a local kind cluster.
run
Because of the controller-runtime version change, make a small change in the code at internal/controller/providerconfig/config.go:31
Create the AWS credentials secret
Run the provider with
--enable-management-policies
.You can update the
run
target in the Makefile as belowand run with:
Apply some of the test cases below.
NOTE
I used the monolith Upbound AWS provider for simplicity, if your machine is having issues with the ~900 CRDs then consider running the smaller provider families for testing.
No management policy set - defaults to ["*"]
Pause
Create a standard VPC (check above) and then edit the
spec.managementPolicies
to[]
The resource should be paused, you can test by trying to make it reconcile, by for example removing an annotation.
Observe Only test
Create an external resource and make sure you use its name as the MR name. For AWS, we also need to provide some indentifier data (
forProvider.region
)Orphan through managementPolicies
As the
spec.managementPolicies
does not have the Delete action, after creating the resource below, waiting for it to be Readyand then deleting it, the external resource will be orphaned.
Orphan through deletionPolicy
Same process as above, but here we check that the
spec.deletionPolicy: Orphan
is the one determining the external resource deletion as thespec.managementPolicies
is a default value.Delete through managementPolicies, deletionPolicy is orphan
Same setup as above, but in this case, both
spec.deletionPolicy
andspec.managementPolicies
are non-default sospec.managementPolicies
trumps thespec.deletionPolicy
and the external resource gets deleted.No late initialization
Tested using a use-case from the design doc:
Prerequisite:
The Autoscaling group needs a Launch Configuration or a Launch Template to function, in this case I opted for the Launch Configuration. When its ready, create an AutoscalingGroup
The resources are taken from the provider-aws examples.
When the resource is created and ready, we can see that the
spec.forProvider.desiredCapacity
is not being late-initialized. We can check its value instatus.atProvider.desiredCapacity
. This allows thedesiredSize
to be managed by an external controller (autoscaler).