
[Release-1.23] - k3s.service enters restart loop after running k3s secrets-encrypt prepare #5726

Closed
dereknola opened this issue Jun 15, 2022 · 1 comment

@dereknola (Member)

Backport the fix for "k3s.service enters restart loop after running k3s secrets-encrypt prepare".

@rancher-max (Contributor)

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

$ uname -a
Linux ip-172-31-3-249 5.4.0-1009-aws #9-Ubuntu SMP Sun Apr 12 19:46:01 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Cluster Configuration:

3 servers

Config.yaml:

write-kubeconfig-mode: "0644"
tls-san:
  - <redacted>
token: <redacted>
secrets-encryption: true
server: https://<redacted>:6443

Replication Steps

  • Install K3s:
    curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.22.8+k3s1 sh -
  • On a server that was not the initial server, run (a status-check sketch follows this list):
    k3s secrets-encrypt prepare && systemctl restart k3s && journalctl -u k3s -f
  • Restart the other two servers: systemctl restart k3s
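
Between steps, the prepare stage can be sanity-checked before the restarts; a minimal sketch, using the stock k3s secrets-encrypt status subcommand (exact output wording varies by release):

    # Run on the server where prepare was issued; the reported rotation
    # stage should read "prepare" until a rotate/reencrypt follows.
    k3s secrets-encrypt status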

Results:

Logs loop perpetually on the server where the command was run:

Jun 21 20:39:39 ip-172-31-0-242 systemd[1]: Started Lightweight Kubernetes.
Jun 21 20:39:39 ip-172-31-0-242 k3s[30666]: time="2022-06-21T20:39:39Z" level=info msg="Starting k3s v1.22.8+k3s1 (21fed356)"
Jun 21 20:39:39 ip-172-31-0-242 k3s[30666]: time="2022-06-21T20:39:39Z" level=info msg="Managed etcd cluster bootstrap already complete and initialized"
Jun 21 20:39:39 ip-172-31-0-242 k3s[30666]: time="2022-06-21T20:39:39Z" level=info msg="Reconciling bootstrap data between datastore and disk"
Jun 21 20:39:39 ip-172-31-0-242 k3s[30666]: time="2022-06-21T20:39:39Z" level=fatal msg="/var/lib/rancher/k3s/server/cred/encryption-config.json, /var/lib/rancher/k3s/server/cred/encryption-state.json newer than datastore and could cause a cluster outage. Remove the file(s) from disk and restart to be recreated from datastore."
Jun 21 20:39:39 ip-172-31-0-242 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Jun 21 20:39:39 ip-172-31-0-242 systemd[1]: k3s.service: Failed with result 'exit-code'.
Jun 21 20:39:44 ip-172-31-0-242 systemd[1]: k3s.service: Scheduled restart job, restart counter is at 1610.
Jun 21 20:39:44 ip-172-31-0-242 systemd[1]: Stopped Lightweight Kubernetes.

Note that this does not happen every time, but the server can get stuck like this even after the other two nodes are restarted.
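
When a node stays wedged in the loop, the fatal message itself names the recovery path. A minimal sketch of that workaround, using the file paths verbatim from the log above (assumes the default /var/lib/rancher/k3s data dir):

    sudo systemctl stop k3s
    # The fatal log says these files are newer than the datastore; removing
    # them lets k3s recreate them from the datastore on the next start.
    sudo rm /var/lib/rancher/k3s/server/cred/encryption-config.json \
            /var/lib/rancher/k3s/server/cred/encryption-state.json
    sudo systemctl start k3s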

Validation Steps

  • Install K3s:
    curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=4dda76b0a9870a6724a3c0846578384380c430d6 sh -
  • On a server that was not the initial server, run:
    k3s secrets-encrypt prepare && systemctl restart k3s && journalctl -u k3s -f
  • Restart the other two servers: systemctl restart k3s

Results:

The logs initially loop as above, but after the other two servers are restarted, the node reconciles and k3s starts normally.
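
One way to confirm the reconcile actually succeeded after the restarts; a sketch whose grep strings are taken from the log lines above:

    # The unit should be active and stay active (no growing restart counter)
    systemctl is-active k3s
    # The reconcile message should appear with no fatal line after it
    journalctl -u k3s --since "5 min ago" | grep -E "Reconciling bootstrap data|level=fatal"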

Additional context / logs:
Note that this log shows up in the first place because the prepare command makes an update call to the etcd cluster. That causes the bootstrap data on the node to be (re)written, leaving it out of sync with the etcd leader node. The other nodes need the k3s process itself to restart in order to retrieve the new bootstrap data and bring everything back in sync, but the etcd data should all match regardless.
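
To check the "etcd data should all match" point directly while the k3s processes are still out of sync, the managed etcd can be queried with an external etcdctl; a hedged sketch assuming the default data dir (k3s does not bundle the etcdctl binary, so it must be installed separately):

    # Client certs for k3s's managed etcd live under the server tls dir
    ETCDCTL_API=3 etcdctl \
      --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
      --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
      --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key \
      --endpoints https://127.0.0.1:2379 endpoint health

Running this on each of the three servers should report a healthy endpoint on every node, even while the wedged k3s process is crash-looping.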
