SONiC image downgrade failure from latest public master image #7518

Closed
sujinmkang opened this issue May 5, 2021 · 5 comments · Fixed by sonic-net/sonic-utilities#1591

@sujinmkang
Collaborator

Description

SONiC image downgrade fails from the latest public images.
The failures are related to the sonic-package-manager PR sonic-net/sonic-utilities#1527.

Steps to reproduce the issue:

  1. Install the latest public master image and reboot into the new image.
  2. Try to install a 202012 or 201911 image (see the sketch below).
  3. The installation fails while migrating packages in sonic-installer.
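
In shell terms, the repro is roughly the following (a minimal sketch; the image file names are placeholders and the exact build names vary):

sudo sonic-installer install sonic-broadcom.bin          # step 1: latest public master image
sudo reboot                                              # boot into the new image
sudo sonic-installer install sonic-broadcom-201911.bin   # steps 2-3: older image; fails during package migration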

Describe the results you received:

Here is example output from downgrading a public master image to a 201911 branch image.
In this case, docker.sh is not available in the target image, so sonic-installer install fails. It also leaves the image file system mounted under the /tmp/ directory, so a subsequent installation of the same image fails again, and even earlier, because the image file system is already mounted (a manual cleanup sketch follows the log below).

Installed SONiC base image SONiC-OS successfully

Command: grub-set-default --boot-directory=/host 0

Command: config-setup backup
Taking backup of curent configuration

Command: mkdir -p /tmp/image-20191130.70-fs
Command: mount -t squashfs /host/image-20191130.70/fs.squashfs /tmp/image-20191130.70-fs
Command: sonic-cfggen -d -y /tmp/image-20191130.70-fs/etc/sonic/sonic_version.yml -t /tmp/image-20191130.70-fs/usr/share/sonic/templates/sonic-environment.j2
Command: umount -r -f /tmp/image-20191130.70-fs
Command: rm -rf /tmp/image-20191130.70-fs
Command: mkdir -p /tmp/image-20191130.70-fs
Command: mount -t squashfs /host/image-20191130.70/fs.squashfs /tmp/image-20191130.70-fs
Command: mkdir -p /host/image-20191130.70/rw
Command: mkdir -p /host/image-20191130.70/work
Command: mkdir -p /tmp/image-20191130.70-fs
Command: mount overlay -t overlay -o rw,relatime,lowerdir=/tmp/image-20191130.70-fs,upperdir=/host/image-20191130.70/rw,workdir=/host/image-20191130.70/work /tmp/image-20191130.70-fs
Command: mkdir -p /tmp/image-20191130.70-fs/var/lib/docker
Command: mount --bind /host/image-20191130.70/docker /tmp/image-20191130.70-fs/var/lib/docker
Command: chroot /tmp/image-20191130.70-fs mount proc /proc -t proc
Command: chroot /tmp/image-20191130.70-fs mount sysfs /sys -t sysfs
Command: chroot /tmp/image-20191130.70-fs /usr/lib/docker/docker.sh start
chroot: failed to run command ‘/usr/lib/docker/docker.sh’: No such file or directory
Command: chroot /tmp/image-20191130.70-fs /usr/lib/docker/docker.sh stop
chroot: failed to run command ‘/usr/lib/docker/docker.sh’: No such file or directory
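
Before retrying after a failure like the one above, the stale mount point has to be cleaned up by hand. A minimal cleanup sketch (not part of the installer), reusing the commands and path from the log above:

sudo umount -f -R /tmp/image-20191130.70-fs   # recursively unmount the squashfs/overlay/bind mounts
sudo rm -rf /tmp/image-20191130.70-fs         # remove the stale mount point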

Here is the output of an installation from a public master image to a previous master branch image.
In this case, sonic-package-manager is not available in the target image, but docker.sh is, so the mounted image file system can be unmounted; the installation still fails with the following error.

Command: chroot /tmp/image-internal-20210430-0230.post-merge-fs /usr/lib/docker/docker.sh start
mount: /sys/fs/cgroup/cpu: cgroup already mounted on /sys/fs/cgroup.
mount: /sys/fs/cgroup/cpuacct: cgroup already mounted on /sys/fs/cgroup.
Command: cp /var/lib/sonic-package-manager/packages.json /tmp/image-internal-20210430-0230.post-merge-fs/tmp/packages.json
Command: touch /tmp/image-internal-20210430-0230.post-merge-fs/tmp/docker.sock
Command: mount --bind /var/run/docker.sock /tmp/image-internal-20210430-0230.post-merge-fs/tmp/docker.sock
Command: chroot /tmp/image-internal-20210430-0230.post-merge-fs sonic-package-manager migrate /tmp/packages.json --dockerd-socket /tmp/docker.sock -y
chroot: failed to run command ‘sonic-package-manager’: No such file or directory
Command: chroot /tmp/image-internal-20210430-0230.post-merge-fs /usr/lib/docker/docker.sh stop
Stopping Docker: docker.

Command: umount -f -R /tmp/image-internal-20210430-0230.post-merge-fs
Command: umount -r -f /tmp/image-internal-20210430-0230.post-merge-fs
Command: rm -rf /tmp/image-internal-20210430-0230.post-merge-fs
Traceback (most recent call last):
  File "/usr/local/bin/sonic_installer", line 8, in <module>
    sys.exit(sonic_installer())
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/sonic_installer/main.py", line 433, in install
    migrate_sonic_packages(bootloader, binary_image_version)
  File "/usr/local/lib/python3.7/dist-packages/sonic_installer/main.py", line 356, in migrate_sonic_packages
    "-y"])
  File "/usr/local/lib/python3.7/dist-packages/sonic_installer/common.py", line 41, in run_command_or_raise
    raise SonicRuntimeException("Failed to run command '{0}'".format(argv))
sonic_installer.exception.SonicRuntimeException: Failed to run command '['chroot', '/tmp/image-internal-20210430-0230.post-merge-fs', 'sonic-package-manager', 'migrate', '/tmp/packages.json', '--dockerd-socket', '/tmp/docker.sock', '-y']'

Describe the results you expected:

Something similar to this:

Installed SONiC base image SONiC-OS successfully

Command: grub-set-default --boot-directory=/host 0

Command: config-setup backup
Taking backup of curent configuration

Command: mkdir -p /tmp/image-HEAD.704-4f2b54e7-fs
Command: mount -t squashfs /host/image-HEAD.704-4f2b54e7/fs.squashfs /tmp/image-HEAD.704-4f2b54e7-fs
Command: sonic-cfggen -d -y /tmp/image-HEAD.704-4f2b54e7-fs/etc/sonic/sonic_version.yml -t /tmp/image-HEAD.704-4f2b54e7-fs/usr/share/sonic/templates/sonic-environment.j2
Command: umount -rf /tmp/image-HEAD.704-4f2b54e7-fs
Command: rm -rf /tmp/image-HEAD.704-4f2b54e7-fs
Command: sync;sync;sync

Command: sleep 3

Done

Output of show version:

admin@sonic:~$ show ver

SONiC Software Version: SONiC.HEAD.704-4f2b54e7
Distribution: Debian 10.9
Kernel: 4.19.0-12-2-amd64
Build commit: 4f2b54e7
Build date: Mon May  3 20:38:13 UTC 2021
Built by: johnar@jenkins-worker-22

Platform: x86_64-dell_s6100_c2538-r0
HwSKU: Force10-S6100
ASIC: broadcom
ASIC Count: 1
Serial Number: CN0F6N2RCES007AC0004
Uptime: 01:32:47 up  4:10,  1 user,  load average: 3.54, 4.59, 4.67

Docker images:
REPOSITORY                    TAG                 IMAGE ID            SIZE
docker-dhcp-relay             HEAD.704-4f2b54e7   9674e837d579        415MB
docker-dhcp-relay             latest              9674e837d579        415MB
docker-sonic-mgmt-framework   HEAD.704-4f2b54e7   a4791310dfdf        628MB
docker-sonic-mgmt-framework   latest              a4791310dfdf        628MB
docker-sonic-telemetry        HEAD.704-4f2b54e7   a66bf0ef819a        498MB
docker-sonic-telemetry        latest              a66bf0ef819a        498MB
docker-orchagent              HEAD.704-4f2b54e7   7e9df54fe923        437MB
docker-orchagent              latest              7e9df54fe923        437MB
docker-macsec                 HEAD.704-4f2b54e7   defbbc343224        422MB
docker-macsec                 latest              defbbc343224        422MB
docker-fpm-frr                HEAD.704-4f2b54e7   2836537ad30b        437MB
docker-fpm-frr                latest              2836537ad30b        437MB
docker-nat                    HEAD.704-4f2b54e7   b8d5405b46eb        422MB
docker-nat                    latest              b8d5405b46eb        422MB
docker-sflow                  HEAD.704-4f2b54e7   14a99f65b630        420MB
docker-sflow                  latest              14a99f65b630        420MB
docker-teamd                  HEAD.704-4f2b54e7   52d917d4b5bb        419MB
docker-teamd                  latest              52d917d4b5bb        419MB
docker-platform-monitor       HEAD.704-4f2b54e7   de546717f508        622MB
docker-platform-monitor       latest              de546717f508        622MB
docker-syncd-brcm             HEAD.704-4f2b54e7   7b6089123509        700MB
docker-syncd-brcm             latest              7b6089123509        700MB
docker-snmp                   HEAD.704-4f2b54e7   d538e569be4b        450MB
docker-snmp                   latest              d538e569be4b        450MB
docker-router-advertiser      HEAD.704-4f2b54e7   453d1745e8b9        408MB
docker-router-advertiser      latest              453d1745e8b9        408MB
docker-database               HEAD.704-4f2b54e7   60c25eaae5fe        408MB
docker-database               latest              60c25eaae5fe        408MB
docker-lldp                   HEAD.704-4f2b54e7   67cb5d49cc7f        448MB
docker-lldp                   latest              67cb5d49cc7f        448MB

Output of show techsupport:

N/A

Additional information you deem important (e.g. issue happens only occasionally):

@stepanblyschak
Collaborator

@sujinmkang @lguohan @liat-grozovik Is downgrade supported in SONiC? I had assumed that it is not. If this part of sonic-installer needs to pass when downgrading, you can use "--skip-package-migration", or I can make a change so that this part never fails even if it is not successful.
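
For reference, a downgrade invocation using the workaround flag mentioned above would look roughly like this (the image file name is a placeholder):

sudo sonic-installer install sonic-broadcom-201911.bin --skip-package-migration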

@lguohan
Collaborator

lguohan commented May 5, 2021

@stepanblyschak, we did not install any package in master, so why do we need to explicitly pass the --skip-package-migration option here? If no custom package has been installed on the system, it should make no difference and the installation should not fail.

@yxieca
Contributor

yxieca commented May 5, 2021

@stepanblyschak we consistently install an older image and then install the target image on the DUT in nightly tests, to avoid unintended changes from stale images. We also have testbed test images on multiple branches. For these use cases, we really need the installer to be solid and forgiving.

@sujinmkang
Collaborator Author

@stepanblyschak Not only skip_package_migration; skip_migration shouldn't fail either. skip_migration fails during a 201811 or 201911 image downgrade and leaves the new image mounted in the /tmp directory.
One more thing on package migration: can you make the change to create the mount directory with a random name instead of using the next image version? That way, if the installation fails partway through for some reason, a retry without a reboot won't fail because of the stale image mount point.
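
A rough sketch of that suggestion (illustrative shell only, not the actual installer code): derive the staging mount point from mktemp so a stale directory left by a failed run cannot collide with a retry:

MOUNT_POINT=$(mktemp -d /tmp/image-fs.XXXXXX)                               # random name instead of /tmp/image-<version>-fs
sudo mount -t squashfs /host/image-20191130.70/fs.squashfs "$MOUNT_POINT"   # squashfs path as in the log above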

@stepanblyschak
Collaborator

Please check sonic-net/sonic-utilities#1591

liat-grozovik pushed a commit to sonic-net/sonic-utilities that referenced this issue May 6, 2021
- What I did
Do not fail when user is doing downgrade. Fix sonic-net/sonic-buildimage#7518

- How I did it
Ignoring failures.

- How to verify it
On master image install 202012 image.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>