DRBD extension not working on arm / rock64 #251

Open
Ulrar opened this issue Oct 25, 2023 · 5 comments

Comments

@Ulrar

Ulrar commented Oct 25, 2023

Hi,

Apologies, I don't really know how to debug this, but on my rock64 the drbd module seems to be missing when I use the DRBD extension:

talosctl --talosconfig talosconfig -n mynode list /lib/modules/6.1.58-talos/extras
NODE   NAME
1 error occurred:
 rpc error: code = Unknown desc = lstat /lib/modules/6.1.58-talos/extras: no such file or directory

The same config deployed on x86 machines does have that directory populated with the .ko files, as expected.
I tried using the tag and also the specific arm hash from here to be sure, but had no luck when "upgrading" to the same version to rebuild the initramfs.

I can't access the display for that node, and I'm not sure which service log might explain why this is failing.
Thanks

@smira
Member

smira commented Oct 26, 2023

There isn't enough information in the ticket. Is the drbd extension installed? Does it match the Talos version?

@Ulrar
Author

Ulrar commented Oct 26, 2023

> There isn't enough information in the ticket. Is the drbd extension installed? Does it match the Talos version?

Since the directory isn't present on the host I assume it's not, but I don't know how else to check. Linstor definitely isn't finding the DRBD module in any case, so it's not just a path issue.

It is the same (latest) version, yes:

image: ghcr.io/siderolabs/drbd:9.2.4-v1.5.4

As stated, the exact same config works fine on two other x86 nodes; the issue is only on the rock64, which is arm64.

@smira
Member

smira commented Oct 26, 2023

You can run talosctl get extensions to see which extensions are installed.
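
The check suggested above can be wrapped in a small helper. This is a minimal sketch, not something from the thread: the function name extension_installed is hypothetical, it assumes talosctl is already configured to reach the node, and it only searches the command's output for the extension name rather than parsing specific columns.

```shell
# Hypothetical helper: succeeds if the named extension appears in the
# output of `talosctl get extensions` for the given node.
# Assumes talosctl can already reach the node (talosconfig, endpoints).
extension_installed() {
  node="$1"
  name="$2"
  # grep's exit status becomes the function's exit status.
  talosctl -n "$node" get extensions | grep -q -- "$name"
}

# Usage (node and extension names are placeholders):
#   extension_installed mynode drbd && echo "drbd extension present"
```

A non-zero exit status means either that the extension is not listed or that talosctl itself produced no output, so it is worth running the plain command once to tell the two apart.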

@smira
Member

smira commented Oct 26, 2023

You can check for yourself that the extension does contain the files, so the problem is probably somewhere on your end:

$ crane export ghcr.io/siderolabs/drbd:9.2.4-v1.5.4@sha256:908a2e1129ae6434c5af887b9f3ba7fde039b635e471cef2be808e017d464275 - | tar tv
-rw-r--r-- 0/0             272 2022-01-20 22:35 manifest.yaml
drwxr-xr-x 0/0               0 2022-01-20 22:35 rootfs
drwxr-xr-x 0/0               0 2022-01-20 22:35 rootfs/lib
drwxr-xr-x 0/0               0 2022-01-20 22:35 rootfs/lib/modules
drwxr-xr-x 0/0               0 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos
drwxr-xr-x 0/0               0 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/extras
-rw-r--r-- 0/0         1141122 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/extras/drbd.ko
-rw-r--r-- 0/0           88162 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/extras/drbd_transport_rdma.ko
-rw-r--r-- 0/0           49410 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/extras/drbd_transport_tcp.ko
-rw-r--r-- 0/0              74 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.alias
-rw-r--r-- 0/0              48 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.alias.bin
-rw-r--r-- 0/0           58621 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.builtin
-rw-r--r-- 0/0           42432 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.builtin.alias.bin
-rw-r--r-- 0/0           64021 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.builtin.bin
-rw-r--r-- 0/0          362817 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.builtin.modinfo
-rw-r--r-- 0/0             107 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.dep
-rw-r--r-- 0/0             191 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.dep.bin
-rw-r--r-- 0/0               0 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.devname
-rw-r--r-- 0/0            2058 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.order
-rw-r--r-- 0/0              55 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.softdep
-rw-r--r-- 0/0             611 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.symbols
-rw-r--r-- 0/0             752 2022-01-20 22:35 rootfs/lib/modules/6.1.58-talos/modules.symbols.bin
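
The same inspection can be scripted instead of read by eye. The sketch below is an illustration only: listing_has_drbd_modules is a hypothetical name, and it assumes the listing format shown above, where each line ends with the file path.

```shell
# Hypothetical helper: reads a `tar tv` listing on stdin and succeeds only
# if all three DRBD modules from the listing above are present under extras/.
listing_has_drbd_modules() {
  listing="$(cat)"
  for mod in drbd.ko drbd_transport_rdma.ko drbd_transport_tcp.ko; do
    # Match the module name at the end of a line, inside an extras/ directory.
    printf '%s\n' "$listing" | grep -q "/extras/${mod}\$" || return 1
  done
}

# Usage (assumes crane is installed; --platform requests the arm64 variant
# explicitly, check `crane export --help` for your version):
#   crane export --platform linux/arm64 ghcr.io/siderolabs/drbd:9.2.4-v1.5.4 - \
#     | tar tv | listing_has_drbd_modules && echo "modules present"
```

Running this against the arm64 variant of the image, rather than whatever platform crane resolves by default, would help rule out a platform-specific packaging problem.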

@Ulrar
Author

Ulrar commented Oct 26, 2023

Alright, after a lot of digging I think I figured it out. The issue is that the rock64 doesn't really have enough memory to schedule much, and certainly not the piraeus-operator. Even without that, the upgrade command just silently kills the node unless I use --stage, I'm guessing because there isn't enough memory to run the installer and the whole stack at the same time.

Using --stage I did manage to get drbd installed correctly, but that doesn't leave enough RAM to schedule the piraeus-operator (it brings the node up to 107% usage).

Never mind, I'll get rid of that node. Thanks for your help.
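
For anyone hitting the same problem on a low-memory board, the staged upgrade described above could look roughly like the sketch below. The wrapper name staged_upgrade and the example image reference are assumptions for illustration; only the --stage flag itself comes from the thread.

```shell
# Hypothetical wrapper around the staged upgrade described above.
# --stage defers the upgrade until the node reboots, so the installer
# does not have to run alongside the full workload.
staged_upgrade() {
  node="$1"
  installer_image="$2"
  talosctl -n "$node" upgrade --image "$installer_image" --stage
}

# Usage (node name and image reference are placeholders):
#   staged_upgrade mynode ghcr.io/siderolabs/installer:v1.5.4
```

This only addresses the upgrade step; it does not free up memory for workloads such as the piraeus-operator once the node is back up.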
