Cluster creation fails with microOS #441

Open
EnDjeee opened this issue Sep 7, 2024 · 2 comments · May be fixed by #450


EnDjeee commented Sep 7, 2024

Trying to create a cluster on microOS nodes fails. The installer logs these lines:

[Control plane] Generating the kubeconfig file to /home/matteo/dev/tests/hetznerk3s/kubeconfig...
error: no context exists with the name: "test-master2"
[Control plane] ...kubeconfig file generated as /home/matteo/dev/tests/hetznerk3s/kubeconfig/kubeconfig.
Unhandled exception in spawn: timeout after 00:00:30 (Tasker::Timeout)
  from /usr/lib/crystal/core/channel.cr:453:10 in 'timeout'
  from /home/runner/work/hetzner-k3s/hetzner-k3s/src/kubernetes/installer.cr:124:7 in 'run'
  from /usr/lib/crystal/core/fiber.cr:143:11 in 'run'
  from ???

After some debugging, I think I understand why it is failing.

When the k3s installation script is downloaded and executed on the master node, the script terminates by enabling the k3s systemd service but DOESN'T start it. As a result, the /etc/rancher/k3s/k3s.yaml kubeconfig DOESN'T get created, and the whole installation process fails.

I've taken a look at the k3s installation script and, although I could be wrong on this, I think the reason the script doesn't start the systemd service is that at some point it sets the INSTALL_K3S_SKIP_START variable to true. When it then reaches the following line -> [ "${INSTALL_K3S_SKIP_START}" = true ] && return, the script returns without starting the systemd service.
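For illustration, here is a minimal model of that control flow. It is not the upstream k3s code: `enable_and_start_sketch` is a hypothetical name, the `echo` calls stand in for the real systemd actions, and the ordering (skip-enable check, enable, skip-start check, start) is an assumption about how the script behaves.

```shell
#!/bin/sh
# Simplified model of the enable/start decision; NOT the upstream k3s script.
# enable_and_start_sketch is a hypothetical name used only for illustration.
enable_and_start_sketch() {
    # Assumed first guard: skip both enabling and starting.
    [ "${INSTALL_K3S_SKIP_ENABLE}" = true ] && return
    echo "enable"   # stand-in for enabling the systemd unit

    # Assumed second guard: the unit stays enabled but is never started.
    [ "${INSTALL_K3S_SKIP_START}" = true ] && return
    echo "start"    # stand-in for starting the systemd unit
}

# Case seen on microOS: the skip-start flag ends up set to true.
INSTALL_K3S_SKIP_ENABLE=false
INSTALL_K3S_SKIP_START=true
skip_start_result=$(enable_and_start_sketch)

# Case with the flag forced to false: the unit is enabled and then started.
INSTALL_K3S_SKIP_START=false
normal_result=$(enable_and_start_sketch)
```

With the flag set to true only the "enable" step runs, which would match the symptom of an enabled service that is never started.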

I think the point where the script sets INSTALL_K3S_SKIP_START to true is here ->

case ${3} in
       sle)
           rpm_installer="zypper --gpg-auto-import-keys"
           if [ "${TRANSACTIONAL_UPDATE=false}" != "true" ] && [ -x /usr/sbin/transactional-update ]; then
               transactional_update_run="transactional-update --no-selfupdate -d run"
               rpm_installer="transactional-update --no-selfupdate -d run ${rpm_installer}"
               : "${INSTALL_K3S_SKIP_START:=true}"
           fi
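A short sketch of why passing the variable explicitly can override this default: the `:=` parameter expansion in the excerpt above assigns a value only when the variable is unset or empty. The helper name `apply_default` is hypothetical and exists only for this demonstration.

```shell
#!/bin/sh
# apply_default is a hypothetical helper that repeats the default assignment
# from the excerpt and prints the resulting value.
apply_default() {
    # ":=" assigns "true" only if the variable is unset or empty.
    : "${INSTALL_K3S_SKIP_START:=true}"
    echo "${INSTALL_K3S_SKIP_START}"
}

# Nothing set by the caller: the default applies.
unset INSTALL_K3S_SKIP_START
default_value=$(apply_default)

# Caller sets the variable first: the default assignment is a no-op.
INSTALL_K3S_SKIP_START=false
override_value=$(apply_default)
```

Here `default_value` ends up as "true" and `override_value` stays "false", which is consistent with an explicit INSTALL_K3S_SKIP_START=false on the install command taking precedence.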

Again, I could be wrong, so take this just as an intuition about what the problem could be.

To make the installation process work, I had to hardcode INSTALL_K3S_SKIP_START=false into master_install_script.sh ->

curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="{{ k3s_version }}" K3S_TOKEN="{{ k3s_token }}" {{ datastore_endpoint }} INSTALL_K3S_SKIP_START=false INSTALL_K3S_EXEC=...

and generate a new binary.

I think the ability to customize the k3s installation script by passing extra variables is needed to resolve this particular issue.

@vitobotta
Owner

Hi, nice that you found the problem. Would you mind making a PR with your change? :)

@EnDjeee
Author

EnDjeee commented Sep 14, 2024

Done!
