Docker plugin fails to start after upgrade to 20.10+ #507
Comments
I have attempted the upgrade again from
|
Hi @djesernik
I'm able to start the plugin with every version. I'm also able to see the NetApp volumes with "docker volume ls", but I'm not able to mount any volume inside my container. Error message: Not sure if this is the same problem. Are you able to use any fresh-installed version above 20.04? Maybe this is just a problem on my side and I need to create a case. docker-ce client 20.10.5 |
Hi @xd999e, There is a NetApp support case open on this issue and the team is investigating it. If you need immediate assistance you can also contact NetApp support. We will update this issue when we have more information. |
Trident 20.10 changed from 20.07 in that it is now based on distroless instead of Alpine, which has at least two consequences:
|
We are running Ubuntu 18.04/20.04 and we use the config file name at install. (without path)
When browsing into the directory, I can find everything up to "propagated-mount". This directory is not present. (everything else is.) This issue appears when I try to use the volume with a container. (not while creating the volume, this is working) |
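The missing directory described above can be checked with a small helper. This is only a sketch: it assumes the plugin data lives under the default Docker root (/var/lib/docker/plugins/<plugin-id>), and has_propagated_mount is a hypothetical name used for illustration, not part of Docker or Trident.

```shell
# Report whether a managed plugin's directory contains "propagated-mount".
# $1: plugin directory, e.g. /var/lib/docker/plugins/<plugin-id>
#     (assumption: default Docker root; adjust if data-root was changed)
has_propagated_mount() {
    plugin_dir="$1"
    if [ -d "$plugin_dir/propagated-mount" ]; then
        echo "present"
    else
        echo "missing"
    fi
}
```

On a host with the plugin installed under the alias netapp, it could be called as has_propagated_mount "/var/lib/docker/plugins/$(docker plugin inspect --format '{{.Id}}' netapp)".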
Same issue here on 2 production hosts. Currently we're unable to upgrade to anything higher than 20.07 because of this issue. |
This issue is fixed with commit ce346f3 and is included in the Trident 21.04 release. |
I think this should be reopened. Just tried again, updating one host from 20.07 to 21.04, but I keep getting the same error. Trident does not start after upgrade. Same error message as with 20.10 or 21.01 versions before:
After downgrading to 20.07 it works. |
Yep, still the same error too. |
Can add that 21.07 is the exact same. Still throwing a config error on trying to re-enable the plugin. Do we have a workaround here? I don't think we're going to be able to "uninstall/reinstall" with existing netapp volumes in use by containers right? To "fix" this would be a complete shutdown of the cluster / containers to uninstall/reinstall the driver? |
Hello, My system:
I have two systems, which act differently. With the version 21.07 of the plugin
However, I went down to the version 20.07 on Machine A, and it works like intended. The error messages:
Where it actually mounted:
|
@engineering can you please advise on this issue?
|
@jayooin We're not seeing the exact same behavior (I assume by your statement of "creates /plugins/xxx at root level" you actually mean /plugins/xxx off of / and not /var/lib/docker/plugins/xxx off of /var/lib/docker, right?) but it is interesting to note that you're not having problems with a box whose /var/lib/docker is NOT its own mount point from the root drive...our setup ALSO has /var/lib/docker as its own mount. We do not see additional directories created off the / path; /var/lib/docker/plugins is a directory that exists on our systems, and under that we get a hash of the netapp plugin path. I haven't dug into the information under that much, except for the times where the netapp driver goes off the rails and stops writing to NFS and starts writing to Local Host, but that's an ENTIRELY different problem than what we're discussing here. And yes, I can confirm that 20.07 is the last version of the plugin to be functional. |
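Since the comments above compare hosts where /var/lib/docker is or is not its own mount point, a quick way to tell the two layouts apart might look like this. It is a sketch that assumes the mountpoint utility from util-linux is installed; is_own_mount is a hypothetical helper name.

```shell
# Print "yes" if the given path is itself a mount point, "no" otherwise.
# Relies on mountpoint(1) from util-linux (assumption: it is on PATH).
is_own_mount() {
    if mountpoint -q "$1" 2>/dev/null; then
        echo "yes"
    else
        echo "no"
    fi
}
```

For example, is_own_mount /var/lib/docker could be run on each affected and unaffected host to compare layouts.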
Hi @rgadwagner, we're looking into this issue. Can you let us know which version of Docker and which Operating System and version you are using? |
Plenty of various versions from 18.07 all the way up to 20.10 of docker. The OS is almost exclusively CentOS7. The last time we tried updating past 20.07 the OS Test Bed was the most recent patch (at the time) of Docker 20.10 on a fully patched (at that time) CentOS7. Please keep in mind that it LOOKS like this only exists with installations that have been brought up from prior versions. To truly test this you're going to want to install a version of the driver at like 18.04 or something like that, run containers off of it and then update to beyond 20.07 netapp...from what I understand, the issue doesn't occur with a fresh install or from an installation upgraded with an installation past a certain point. I can tell you we've had this driver installed since before it became Trident (though we've uninstalled/reinstalled on various machines) and there's a memory at the back of my mind that says there was some major architectural change during one of the versions a few years back. |
@rgadwagner, thanks for the detailed information. Establishing the scenario that needs to be tested is very helpful. You're right that there have been a few architectural changes over the years. Thanks for sticking with Trident all this time. |
This issue is fixed with commit 7943641 and will be included in the Trident 22.04 release. |
Describe the bug
After upgrading the docker netapp/trident-plugin to version 20.10, it fails to be re-enabled with an error of
Error response from daemon: dial unix /run/docker/plugins/2311ab6e7b3b1461539ddc3c783ac65ad9a179d44fffd76b53aff9dd844ae1c6/netapp.sock: connect: no such file or directory
Upon following the troubleshooting steps in https://netapp-trident.readthedocs.io/en/stable-v20.10/docker/troubleshooting.html and checking the logs with journalctl, the following messages are logged:
Environment
Provide accurate information about the environment to help us reproduce the issue.
docker plugin install --grant-all-permissions --alias netapp netapp/trident-plugin:19.10 config=/etc/netappdvp/config.json
from the original install, though this is an upgrade
To Reproduce
Steps to reproduce the behavior:
With a previous version of the netapp/trident-plugin installed (I tested with 19.10, 20.04, and 20.07) run through the steps in https://netapp-trident.readthedocs.io/en/latest/docker/use/managing.html#updating-trident
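For readers without the linked page open, the upgrade is roughly a disable, upgrade, enable sequence. The helper below only prints the commands so they can be reviewed before running. It is a sketch: the flags shown exist in the Docker CLI, but whether they match the linked documentation exactly is an assumption, and print_upgrade_steps is a hypothetical name.

```shell
# Print (do not run) a disable/upgrade/enable sequence for a managed plugin.
# $1: plugin alias, e.g. netapp:latest
# $2: target image, e.g. netapp/trident-plugin:20.10
# Assumption: this mirrors the documented update procedure; verify before use.
print_upgrade_steps() {
    alias_name="$1"
    image="$2"
    echo "docker plugin disable -f $alias_name"
    echo "docker plugin upgrade --skip-remote-check --grant-all-permissions $alias_name $image"
    echo "docker plugin enable $alias_name"
}
```

Calling print_upgrade_steps netapp:latest netapp/trident-plugin:20.10 lists the three commands. In this report, it is the final enable step that fails.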
Expected behavior
Expected a response of
Additional context
In researching the error that is reported I found the corresponding code at https://github.com/NetApp/trident/blob/stable/v20.10/main.go#L129-L132 suggesting that the configPath may not be set. When the docker plugin was installed with
docker plugin install
the following parameter was passed though
config=/etc/netappdvp/config.json
and I can confirm that it is set by running
docker plugin inspect netapp
and seeing the output for Config.Env.config Value: "config.json" as well as this section of the output:
I was able to downgrade back to 20.07 which functions as expected.
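To compare the value the plugin is actually running with against what was passed at install time, the environment list from the inspect output can be filtered. This is a sketch that assumes the effective settings appear under Settings.Env as NAME=value strings; config_value is a hypothetical helper name.

```shell
# Read NAME=value lines on stdin and print the value of the "config" entry.
# Assumption: input comes from the plugin's Settings.Env, one entry per line.
config_value() {
    sed -n 's/^config=//p'
}
```

One possible use is docker plugin inspect --format '{{range .Settings.Env}}{{println .}}{{end}}' netapp | config_value, which prints only the config entry.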