Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the amount of allocations in policy use by trying a cached resolver first #11028

Merged
merged 4 commits into from
Oct 9, 2016

Conversation

smarterclayton
Copy link
Contributor

Simplified RuleResolver to be closer to upstream. Used a cached RuleResolver
in the NoEscalation checks and when fetching the binding on update. Remove a use of deepequal on a common path.

Not passing all tests yet, but @deads2k please take a look at the general structure.

This drops policy pretty far down the list in test-cmd - resolving #11003 for good
hopefully.

@smarterclayton
Copy link
Contributor Author

Wrinkle is we can no longer give the nice 'reason' for which rule granted us. I noticed upstream doesn't have it - various ways to solve it including returning two lists or adding a source field to PolicyRule or switching to Bindings earlier.

if kapi.Semantic.Equalities.DeepEqual(oldAllowAllPolicyRule, *obj) && obj.APIGroups == nil {
// match the old allow all rule, but only if API groups is nil (not specified in the incoming JSON)
oldAllowAllRule := obj.APIGroups == nil &&
// avoid calling the very expensive DeepEqual by inlining specific checks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever

len(obj.Resources) == 1 && obj.Resources[0] == internal.ResourceAll &&
len(obj.AttributeRestrictions.Raw) == 0 && len(obj.ResourceNames) == 0 &&
len(obj.NonResourceURLsSlice) == 0 &&
// semantic equalities will ignore nil vs empty for other fields as a safety
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a scary comment here about not removing this check.

@deads2k
Copy link
Contributor

deads2k commented Sep 21, 2016

structure looks reasonable.

@smarterclayton
Copy link
Contributor Author

[test]

@smarterclayton smarterclayton force-pushed the reduce_covers branch 2 times, most recently from 0680d71 to da79a0b Compare September 29, 2016 03:16
@smarterclayton
Copy link
Contributor Author

smarterclayton commented Sep 29, 2016 via email

@smarterclayton
Copy link
Contributor Author

@smarterclayton smarterclayton force-pushed the reduce_covers branch 4 times, most recently from 2e65fc9 to befa107 Compare September 30, 2016 18:55
@smarterclayton
Copy link
Contributor Author

@derekwaynecarr from https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9552/

/data/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:134
Sep 30 15:55:55.842: Couldn't delete ns: "e2e-tests-limitrange-7yrtq": namespace e2e-tests-limitrange-7yrtq was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed (&errors.errorString{s:"namespace e2e-tests-limitrange-7yrtq was not deleted with limit: timed out waiting for the condition, namespace is empty but is not yet removed"})
/data/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:338

[test]

@smarterclayton
Copy link
Contributor Author

Ok, now this is ready for review.

On Sep 30, 2016, at 7:29 PM, OpenShift Bot notifications@github.com wrote:

continuous-integration/openshift-jenkins/test SUCCESS (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9562/)


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11028 (comment),
or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABG_p1B0tSlPmA9ugzumqAo3VduODFveks5qvZtVgaJpZM4KCRxk
.

@@ -248,13 +252,23 @@ func (m *VirtualStorage) updateRoleBinding(ctx kapi.Context, name string, objInf
return updatedRoleBinding, false, nil
}

// modifyingRoleForEscalation tries to use the CachedRuleResolver if available to avoid expensive checks
func (m *VirtualStorage) modifyingRoleForEscalation(binding authorizationinterfaces.RoleBinding) (authorizationinterfaces.Role, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the method, but not the name. Looks like I made the name initially, but it sucks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roleForEscalationCheck?

if err := addValerie.AddRole(); !kapierror.IsNotFound(err) {
// try to add Valerie to a non-existent role, looping until it is true due to
// the policy cache taking time to react
if err := wait.Poll(time.Second, 2*time.Minute, func() (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is now necessary, it suggests we've gotten slower because the cache on the covers check falls through to live on a miss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, this is waiting for the cache to reflect the role deletion above (before it would have failed the live check immediately). 2 minutes seems egregious, though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen integration tests pause for 30-60 seconds.

@smarterclayton
Copy link
Contributor Author

AWS broke [test]

On Mon, Oct 3, 2016 at 2:02 PM, OpenShift Bot notifications@github.com
wrote:

continuous-integration/openshift-jenkins/test FAILURE (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9586/)


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11028 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_py--2Q-w3wio-4a-lU3fW-gK8SCEks5qwUM7gaJpZM4KCRxk
.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 6, 2016
@smarterclayton
Copy link
Contributor Author

Finish your review! Or I'll turn on ReallyCrash

On Oct 5, 2016, at 9:17 PM, OpenShift Bot notifications@github.com wrote:

Origin Action Required: Pull request cannot be automatically merged, please
rebase your branch from latest HEAD and push again


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11028 (comment),
or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABG_p8BG7joaVBJW3-_dxV9xWFdOSItIks5qxGg7gaJpZM4KCRxk
.

@deads2k
Copy link
Contributor

deads2k commented Oct 7, 2016

Do you have two of these? I swear I said this looked good.

@smarterclayton
Copy link
Contributor Author

Hrm, never saw it. Bizarre.

On Oct 7, 2016, at 9:50 AM, David Eads notifications@github.com wrote:

Do you have two of these? I swear I said this looked good.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11028 (comment),
or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABG_p-Vio-cHJzpmY6GhLSVsEeq3oQRcks5qxmongaJpZM4KCRxk
.

Mostly consistent with upstream - did not remove the reason field
Significantly reduces the impact of allocations during policy alteration
Only in certain cases do we need to default the rule.
@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 9ea0477

@smarterclayton
Copy link
Contributor Author

Updated and rebased on top of the change to all the same method signatures...

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2016
@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9782/)

@smarterclayton
Copy link
Contributor Author

[merge]

On Sat, Oct 8, 2016 at 10:55 PM, OpenShift Bot notifications@github.com
wrote:

continuous-integration/openshift-jenkins/test SUCCESS (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9782/)


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11028 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p7RKk3doEZnNMzc18GZf35X-6fgIks5qyFesgaJpZM4KCRxk
.

@openshift-bot
Copy link
Contributor

openshift-bot commented Oct 9, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/9782/) (Image: devenv-rhel7_5156)

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 9ea0477

@openshift-bot openshift-bot merged commit 6c5d083 into openshift:master Oct 9, 2016
@enj enj mentioned this pull request Oct 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants