Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default network readiness enhancement #98

Merged

Conversation

dougbtv
Copy link
Member

@dougbtv dougbtv commented Jul 17, 2018

Uses the alternate network readiness methodology, section 6.1.1 of the de-facto standard.

In short:

If the implementation chooses this method, it must then block any pod network attachment/detachment operations until the cluster-wide default network is ready, or until a suitable timeout is reached. The recommended timeout value is 2 minutes.

This method waits in a loop for 2 minutes (by default) looking for a configuration file from the default network and is otherwise configurable (e.g. for which default network configuration file to look for an for the timeout length).

More details available in README.md changes.

@dougbtv dougbtv self-assigned this Jul 17, 2018
@rkamudhan rkamudhan added this to the Multus v2.2 milestone Jul 19, 2018
@rkamudhan rkamudhan removed this from the Multus v3.0 milestone Jul 27, 2018
@dougbtv dougbtv closed this Jul 31, 2018
@dougbtv dougbtv deleted the dev/npwg-network-readiness branch July 31, 2018 16:51
@dougbtv dougbtv restored the dev/npwg-network-readiness branch July 31, 2018 16:54
@dougbtv dougbtv reopened this Jul 31, 2018
@dougbtv dougbtv changed the base branch from dev/network-plumbing-working-group-crd-change to master August 2, 2018 18:31
@dougbtv
Copy link
Member Author

dougbtv commented Aug 2, 2018

@rkamudhan -- do you have a suggestion for a retry library to use?

I looked into using the one you had used for the annotation re-try, but, apparently it seems to be for conflicting writes, or at least that's what I found looking at this method: https://github.com/kubernetes/client-go/blob/master/util/retry/util.go#L45-L47

I have a rebase, but, I don't have a refactor here yet.

@dougbtv
Copy link
Member Author

dougbtv commented Aug 2, 2018

Note to self: spec updated to say "not a default 2 minute wait" -- we'll just have an exponential back-off, and keep on running.

@rkamudhan
Copy link
Member

rkamudhan commented Aug 3, 2018

package type

import (
        ......
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
        .......
)

.... bla bla code...

var DefaultReadinessBackoff = wait.Backoff{
      // We can change the values here
	Steps:    4, 
	Duration: 10 * time.Millisecond,
	Factor:   5.0,
	Jitter:   0.1,
}

.... bla bla code...


err := wait.ExponentialBackoff(backoff, func() (bool, error) {
		err := findmeDefaultNetwork(n.DefaultFilename)
		switch {
		case err == nil:
			return true, nil
		default:
			return false, err
		}
	})
	return err

func youFoundmeDefaultNetwork() {
}

.... bla bla code...

Refer: https://godoc.org/k8s.io/apimachinery/pkg/util/wait

multus/multus.go Outdated
// is present (or until a user-defined timeout is reached)
func waitForDefaultNetwork(indicatorFile string, waitSeconds int) error {
// If there's no file to wait for, then, this is essentially disabled.
if len(indicatorFile) > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet! Kural thank you for the suggestion, and the awesome code example -- really appreciated.

@dougbtv
Copy link
Member Author

dougbtv commented Aug 3, 2018

Thanks again on input for the wait library in apimachinery. Happy to take any input on the default backoff parameters, and also if any should be exposed in the configurations.

@rkamudhan
Copy link
Member

yes yes. We can take backoff time from user as well.. But it is advance feature, we can do it later.. no issues..

types/conf.go Outdated
@@ -179,6 +180,10 @@ func LoadNetConf(bytes []byte) (*NetConf, error) {
netconf.BinDir = defaultBinDir
}

if netconf.DefaultNetworkFile == "" {
netconf.DefaultNetworkFile = defaultDefaultNetworkFile
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dougbtv how the delegate information is transferred from the default file to RawDelegates ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have defaultNetworkFile, then the delegate is not a must right ?

Copy link
Member

@rkamudhan rkamudhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is WIP

multus/multus.go Outdated
var defaultReadinessBackoff = wait.Backoff{
Steps: 4,
Duration: 10 * time.Millisecond,
Factor: 5.0,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea is to set 250 millis, and factor of 4, for .25 seconds, 1 second, 4 second, 16 secons....

@dougbtv
Copy link
Member Author

dougbtv commented Aug 16, 2018

Latest commit pushed has decided on back off.

@dougbtv
Copy link
Member Author

dougbtv commented Aug 16, 2018

rename defaultnetworkfile to readinessindicatorfile

@s1061123 s1061123 added this to the v3.1 milestone Aug 16, 2018
@dougbtv dougbtv force-pushed the dev/npwg-network-readiness branch 2 times, most recently from e961722 to 6cb2e8c Compare August 16, 2018 15:30
@rkamudhan rkamudhan removed this from the v3.1 milestone Aug 16, 2018
Copy link
Member

@s1061123 s1061123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, Doug! Thank you for your update!

@dougbtv
Copy link
Member Author

dougbtv commented Oct 10, 2018

Idea: We run this as a startup option -- that is, this happens to look for a file (or CRD object) before Multus runs, so you say run multus as ./multus --is-default-ready /target/file.name or ./multus --is-default-ready my-crd-name and it reports back saying that it's ready or not. So that we don't have the entrypoint script runs first before this happens, this might be better for configuration management, especially relative to an operator-style management of cni plugin configuration.

cc: @s1061123 @dcbw

@dougbtv
Copy link
Member Author

dougbtv commented Oct 10, 2018

@s1061123 -- give this a look, I have it rebased to fix up the conflicts! Thanks for letting me know about it.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 416

  • 8 of 10 (80.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.4%) to 46.465%

Changes Missing Coverage Covered Lines Changed/Added Lines %
multus/multus.go 5 7 71.43%
Totals Coverage Status
Change from base Build 415: 0.4%
Covered Lines: 414
Relevant Lines: 891

💛 - Coveralls

@s1061123 s1061123 merged commit 21cdfe5 into k8snetworkplumbingwg:master Oct 11, 2018
@s1061123
Copy link
Member

@dougbtv , done! Thank you for your long-time work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants