OpenShift-SDN failed when work as default multus cni plugin #149

Closed
flyinghawkren opened this issue Sep 6, 2018 · 11 comments

@flyinghawkren

flyinghawkren commented Sep 6, 2018

I am using Multus 3.1, and my Multus configuration is as follows:

{
  "name": "multus-cni-network",
  "type": "multus",
  "delegates": [
    {
      "type": "openshift-sdn",
      "name": "openshift-sdn",
      "masterplugin": true
    }
  ],
  "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
  "LogFile": "/var/log/multus.log"
}

When I create a Pod without any network annotation, I get the following errors:

Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.500016    1149 cni.go:259] Error adding network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available
Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.500247    1149 cni.go:227] Error while adding to cni network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available
Sep 06 21:36:06 app1.example.com oci-systemd-hook[32158]: systemdhook <debug>: 302e3de728bb: Skipping as container command is /usr/bin/pod, not init or systemd
Sep 06 21:36:06 app1.example.com oci-umount[32159]: umounthook <debug>: 302e3de728bb: only runs in prestart stage, ignoring
Sep 06 21:36:06 app1.example.com dockerd-current[1297]: time="2018-09-06T21:36:06.520562109+08:00" level=error msg="containerd: deleting container" error="exit status 1: \"container 302e3de728bb3b1b8864fe36f9ad20277045ed46937eb75fb6a48bf4840bff03 does not exist\\none or more of the container deletions failed\\n\""
Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.523106    1149 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "multus-test_test5g" network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available
Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.523136    1149 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "multus-test_test5g(84687767-b1d9-11e8-9b07-005056b1f8fe)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "multus-test_test5g" network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available
Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.523147    1149 kuberuntime_manager.go:647] createPodSandbox for pod "multus-test_test5g(84687767-b1d9-11e8-9b07-005056b1f8fe)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "multus-test_test5g" network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available
Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.523193    1149 pod_workers.go:186] Error syncing pod 84687767-b1d9-11e8-9b07-005056b1f8fe ("multus-test_test5g(84687767-b1d9-11e8-9b07-005056b1f8fe)"), skipping: failed to "CreatePodSandbox" for "multus-test_test5g(84687767-b1d9-11e8-9b07-005056b1f8fe)" with CreatePodSandboxError: "CreatePodSandbox for pod \"multus-test_test5g(84687767-b1d9-11e8-9b07-005056b1f8fe)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"multus-test_test5g\" network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - \"openshift-sdn\": OpenShift SDN network process is not (yet?) available"

Can you give me some tips? Thanks!

@dougbtv
Member

dougbtv commented Sep 6, 2018

Hi @flyinghawkren ! Cool, I'm excited that you want to use Multus CNI with OpenShift.

The integration of Multus CNI into OpenShift is still at a preliminary stage, but work is underway there.

Something I think you might like to reference is this Ansible playbook in my fork of OpenShift-Ansible -- https://github.com/dougbtv/openshift-ansible/tree/multus-developer-preview/playbooks/openshift-multinetwork (edit: maybe you've seen this, because your config looks quite similar!)

There's a README there with information on how I've been trialing Multus with OpenShift 3.10.

Possibly the most concerning thing to me is this line:

Sep 06 21:36:06 app1.example.com origin-node[1149]: E0906 21:36:06.500247    1149 cni.go:227] Error while adding to cni network: Multus: Err in tearing down failed plugins: Multus: error in invoke Delegate add - "openshift-sdn": OpenShift SDN network process is not (yet?) available

It seems like OpenShift SDN is not ready. We have a pull request open for this, #98, which I will be implementing to detect whether the OpenShift SDN is ready under OpenShift.

My particular playbook is designed to be run after a full deployment of OpenShift with openshift-ansible. It then checks that there are pods running (a hacky workaround for now, and a weak attempt at asking "is something happening yet?") and creates an alphabetically first config file. In the future, it will wait for a semaphore from the OpenShift node.go code, which is basically just a config file placed in /etc/cni/net.d/ that tells the kubelet "hey, this node is ready". In the short-term implementation, that file is what we'll be telling Multus to look for.
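
To illustrate the "alphabetically first config file" idea, here is a minimal sketch that lists the CNI config files in a directory in sorted order and reports which one comes first. It assumes the kubelet picks the first file in lexical order, as the comment above implies; the function names and the set of file extensions are hypothetical choices for this example, not part of Multus or OpenShift.

```python
import os
import tempfile

# File extensions treated as CNI configs here (an assumption for this sketch).
CNI_EXTENSIONS = (".conf", ".conflist", ".json")


def cni_configs(conf_dir):
    """Return the CNI config file names in conf_dir, sorted lexically."""
    names = [
        name
        for name in os.listdir(conf_dir)
        if name.endswith(CNI_EXTENSIONS)
        and os.path.isfile(os.path.join(conf_dir, name))
    ]
    return sorted(names)


def first_cni_config(conf_dir):
    """Return the lexically first config file name, or None if there is none."""
    configs = cni_configs(conf_dir)
    return configs[0] if configs else None


if __name__ == "__main__":
    # Demo against a throwaway directory rather than the real /etc/cni/net.d/.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("80-openshift-network.conf", "70-multus.conf"):
            open(os.path.join(demo_dir, name), "w").close()
        os.mkdir(os.path.join(demo_dir, "multus.d"))  # subdirectories are skipped
        print(first_cni_config(demo_dir))
```

On a node, calling `first_cni_config("/etc/cni/net.d")` would show whether the Multus config sorts ahead of any OpenShift SDN config that is present.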

Can you list the contents of the /etc/cni/net.d/ directory on the app1.example.com node? I'm curious if it created the openshift SDN configuration yet or not.

Additionally -- what version of OpenShift are you running?

Thanks!

@flyinghawkren
Author

flyinghawkren commented Sep 7, 2018

Hi @dougbtv ! Thanks for your advice.

I am running OpenShift 3.9. Before deploying the Multus daemonset, I ran the following commands:

$ oc delete clusternetwork --all
$ oc delete hostsubnets --all
$ oc delete netnamespaces --all

I also manually removed the files in /etc/cni/net.d/. Maybe that caused the OpenShift SDN configuration to go wrong.
The contents of /etc/cni/net.d/ are:

[root@app1 ~]# tree /etc/cni/net.d/
/etc/cni/net.d/
├── 70-multus.conf
└── multus.d
    └── multus.kubeconfig

1 directory, 2 files

I have tried copying an original 80-openshift-network.conf into /etc/cni/net.d/, but Multus still reports the same error.

@flyinghawkren
Author

One more addition: I have changed the OpenShift master config in /etc/origin/master/master-config.yaml and the node config in /etc/origin/node/node-config.yaml, changing networkPluginName from redhat/openshift-ovs-subnet to cni.

@dougbtv
Member

dougbtv commented Sep 7, 2018

It might be a little bit before I can make an attempt with OpenShift 3.9. Currently, I have only made some preliminary tests in OpenShift 3.10 -- the README in the link I sent over mentions some of the limitations.

Generally, what my approach has been is to:

  1. Use openshift-ansible to bring up the entire cluster
  2. Wait until pods are ready
  3. Deploy Multus (I'd recommend the daemonset in the linked openshift-ansible fork)
  4. Create a pod without an annotation
  5. If 4 works, then create a CRD object
  6. Bring up a pod with an annotation.
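
Steps 4 and 6 differ only in whether the pod carries a network annotation. As a rough sketch, the two manifests could be generated as below. The annotation key `k8s.v1.cni.cncf.io/networks` is the one Multus reads for additional networks; the helper name `pod_manifest`, the pod names, the image, and the network name `macvlan-conf` are all hypothetical placeholders, and the network name would have to match a CRD object created in step 5.

```python
import json

# Annotation key Multus reads to attach additional networks to a pod.
NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def pod_manifest(name, image, networks=None):
    """Build a minimal Pod manifest; add the Multus annotation if networks are given."""
    metadata = {"name": name}
    if networks:
        # Multiple networks are given as a comma-separated list.
        metadata["annotations"] = {NETWORKS_ANNOTATION: ",".join(networks)}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": metadata,
        "spec": {"containers": [{"name": name, "image": image}]},
    }


if __name__ == "__main__":
    # Step 4: no annotation, so only the default (delegate) network is used.
    plain = pod_manifest("multus-test", "centos:7")
    # Step 6: annotated pod; "macvlan-conf" is a placeholder network name.
    annotated = pod_manifest("multus-test-annotated", "centos:7", ["macvlan-conf"])
    print(json.dumps(plain, indent=2))
    print(json.dumps(annotated, indent=2))
```

The JSON output can be saved to a file and passed to `oc create -f`. If the plain pod from step 4 fails to start, the problem lies with the default network rather than with any additional attachment.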

I didn't specifically worry about removing these resources, as you did:

$ oc delete clusternetwork --all
$ oc delete hostsubnets --all
$ oc delete netnamespaces --all

I located where that error is coming from: it's apparently in cniserver.go, in the ReadConfig function.

My best guess is that it's being called by openshift-sdn_linux.go

This may be requiring that there be a file named /var/run/openshift-sdn/config.json -- according to cniserver.go
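
A quick way to test that guess on a node is to check whether the file exists and is readable. The sketch below only assumes the path quoted above and that the file is JSON (suggested by its extension); it does not assume anything about the file's contents, and the function name and status strings are hypothetical.

```python
import json
import os

# Path quoted in the comment above; its expected contents are not assumed here.
SDN_CONFIG_PATH = "/var/run/openshift-sdn/config.json"


def sdn_config_status(path=SDN_CONFIG_PATH):
    """Return 'missing', 'unreadable', or 'present' for the SDN config file."""
    if not os.path.isfile(path):
        return "missing"
    try:
        with open(path) as handle:
            json.load(handle)
    except (OSError, ValueError):
        # OSError: permission or I/O problem; ValueError: not valid JSON.
        return "unreadable"
    return "present"


if __name__ == "__main__":
    print(SDN_CONFIG_PATH, "is", sdn_config_status())
```

A result of "missing" would be consistent with the "OpenShift SDN network process is not (yet?) available" error, though it does not by itself confirm the cause.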

What I'd currently recommend if you can -- is to spin up a fresh 3.9 cluster, and attempt my steps 1-6 above. That should generally emulate what I would be trying in my own lab.

We do have some work going forward regarding Multus in OpenShift, however, it targets newer versions. This shouldn't mean it's impossible in v3.9, however, I haven't specifically tried it / found the pitfalls.

@pliurh
Contributor

pliurh commented Sep 10, 2018

@flyinghawkren I've managed to run Multus on 3.9 without deleting the clusternetwork and hostsubnets. According to your log, it seems openshift-sdn was not installed correctly. You can try to redeploy it and make sure openshift-sdn functions normally without Multus.

However, I suggest you try Multus on 3.10 or 3.11, where you can deploy Multus with Doug's playbook. On 3.9, the deployment of Multus needs some extra steps.

@flyinghawkren
Author

@pliurh @dougbtv Thank you very much. I will redeploy the cluster and try Multus on 3.10 or 3.11.

@dougbtv
Member

dougbtv commented Sep 10, 2018

@pliurh -- do you have any docs on how you accomplished it on 3.9?

(edit: +1 regarding verifying that openshift-sdn is in working order before a Multus install)

@pliurh
Contributor

pliurh commented Sep 11, 2018

I've created a pull request to Doug's Openshift-ansible fork, which explains the extra steps needed for Openshift 3.9.

@000vicente

000vicente commented Sep 14, 2018

Thanks for the comments.

@flyinghawkren
Author

I have tried in OpenShift 3.10 and everything is OK.
This issue can be closed. Thanks!

@DanyC97

DanyC97 commented May 26, 2019

@dougbtv have you tried sending a PR to the upstream openshift-ansible repo? I tried to look it up and couldn't find it.
