Nexus fails to deploy with cloud-init failure (Failed to import key from key server) #2785

Closed
craddm opened this issue Oct 26, 2022 · 6 comments · Fixed by #2818
Labels: bug (Something isn't working)

craddm commented Oct 26, 2022

Describe the bug
I deployed the most recent release (v0.6.0) of AzureTRE from the AzureTRE-Deployment repo. Any Linux VM I create in a workspace is unreachable via Guacamole (I've tested both of the available images). Attempts to connect are simply rejected with the message "The remote desktop server is currently unreachable". The Linux VMs are definitely up and running. In contrast, Windows VMs function as expected, with no issues connecting.

Steps to reproduce

  1. Create any Linux VM in a workspace
  2. Attempt to connect to it through Guacamole
craddm added the bug label Oct 26, 2022
marrobi (Member) commented Oct 26, 2022

Hi @craddm. We use cloud-init scripts to install packages, such as the desktop, on the Linux VMs. I imagine these are either taking a while to run or have failed. In production we see people using custom VM images, which means these scripts can often be avoided.

Is Nexus configured in the environment? The Linux VMs have a dependency on Nexus for apt. Thinking about it, we used to install it by default, so this might need adding to the docs.

Otherwise I would check the cloud-init logs via serial console - https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/serial-console-overview

This blog is useful for debugging cloud-init - https://blog.gripdev.xyz/2019/02/19/debugging-cloud-init-on-ubuntu-in-azure-or-anywhere/
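
For a first pass over the logs once you have a shell on the VM (serial console or otherwise), something like the following may help. It is only a sketch: the helper name `show_cloud_init_failures` and the search pattern are assumptions, and the paths in the usage comment are the usual cloud-init log locations on Ubuntu, which may differ on a custom image.

```shell
# show_cloud_init_failures [FILE...]
# Print numbered lines that look like failures. With no FILE it reads stdin.
# The pattern is a guess at useful keywords, not an official cloud-init format.
show_cloud_init_failures() {
  grep -inE 'failed|error|traceback' "$@"
}

# Usage on the VM (the logs are usually only readable by root):
#   sudo cat /var/log/cloud-init.log /var/log/cloud-init-output.log | show_cloud_init_failures
```

`cloud-init status --long` is also worth running, as it reports whether cloud-init finished and, if not, which stage reported an error.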

craddm (Author) commented Oct 26, 2022

Ah, so yes - Nexus wasn't fully set up. It turns out the Nexus deployment isn't working. In the cloud-init logs on the Nexus VM I can see that it's failing to import the MS signing GPG key while setting up apt sources:

('apt-configure', ValueError("Failed to import key 'BC528686B50D79E339D3721CEB3E94ADBE1229CF' from keyserver 'hkp://keyserver.ubuntu.com:80' after 3 tries: Unexpected error while running command.\nCommand: ['gpg', '--no-tty', '--keyserver=hkp://keyserver.ubuntu.com:80', '--recv-keys', 'BC528686B50D79E339D3721CEB3E94ADBE1229CF']\nExit code: 2\nReason: -\nStdout: \nStderr: gpg: keyserver receive failed: No data",))
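
To check whether the key import fails consistently or only now and then, the `gpg` command from the error can be re-run by hand on the Nexus VM. The sketch below wraps it in a small retry helper; `retry` and `RETRY_DELAY` are hypothetical names introduced for illustration, and the three attempts simply mirror the "after 3 tries" in the error above.

```shell
# retry MAX_ATTEMPTS CMD [ARGS...]
# Run CMD until it succeeds or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts, default 5) is a made-up knob.
retry() {
  max_attempts="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt/$max_attempts failed: $*" >&2
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "${RETRY_DELAY:-5}"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage, with the command copied from the cloud-init error:
#   retry 3 gpg --no-tty --keyserver=hkp://keyserver.ubuntu.com:80 \
#     --recv-keys BC528686B50D79E339D3721CEB3E94ADBE1229CF
```

If every attempt fails with "No data" while the same command works from another network, that points towards something between the VM and the keyserver rather than the keyserver itself.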

jjgriff93 self-assigned this Oct 27, 2022
jjgriff93 (Collaborator) commented

Hello @craddm - I've had this same issue and it was transient; it seems the keyserver was unreachable. Could you try the following steps to fix your Nexus instance and let me know how you get on?

Run the following in the Nexus VM's shell:

  1. sudo cloud-init clean --logs
  2. sudo cloud-init init --local
  3. sudo cloud-init init
  4. sudo cloud-init modules --mode=config
  5. sudo cloud-init modules --mode=final

This should re-run all stages of cloud-init, and hopefully you won't hit the gpg failure (I've just run this successfully, so fingers crossed!).
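
The five steps above can be collected into one function so that a failing stage stops the sequence. This is a minimal sketch: the commands are exactly those listed, while the function name `rerun_cloud_init` and the `RUNNER` variable are assumptions added for illustration.

```shell
# RUNNER is a hypothetical knob for this sketch: leave it as sudo to run the
# commands for real, or set RUNNER=echo to print them without touching the VM.
RUNNER="${RUNNER:-sudo}"

# Re-run every cloud-init stage in order, stopping at the first failure.
rerun_cloud_init() {
  "$RUNNER" cloud-init clean --logs &&          # remove state and logs from the failed run
  "$RUNNER" cloud-init init --local &&          # local stage
  "$RUNNER" cloud-init init &&                  # network stage
  "$RUNNER" cloud-init modules --mode=config && # config modules
  "$RUNNER" cloud-init modules --mode=final     # final modules
}

# Usage on the Nexus VM:
#   rerun_cloud_init
```

Setting `RUNNER=echo` before calling the function gives a dry run that only prints the five commands.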

craddm (Author) commented Oct 27, 2022

Hi @jjgriff93

This indeed worked, and I could successfully create accessible Linux VMs afterwards. But the original failure isn't due to intermittent unreachability of the keyserver. I'd deleted the service and reinstalled it several times before reporting the issue, and it failed the same way every time. So, just to check, I deleted the service and redeployed it right after getting it working, and it failed again in the same way. Possibly the firewall rule changes haven't been applied yet when cloud-init first runs?
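
One way to test the firewall theory is to probe the keyserver over plain HTTP from the VM early in a fresh deployment, since an `hkp://` keyserver is queried over HTTP. The sketch below builds the lookup URL for a fingerprint; `hkp_lookup_url` is a hypothetical helper, and the `/pks/lookup` path reflects my understanding of the HKP convention rather than anything stated in this thread.

```shell
# hkp_lookup_url KEYSERVER FINGERPRINT
# Turn e.g. hkp://keyserver.ubuntu.com:80 plus a fingerprint into the HTTP
# URL a client would request. Assumes the keyserver string includes a port.
hkp_lookup_url() {
  printf 'http://%s/pks/lookup?op=get&options=mr&search=0x%s\n' "${1#hkp://}" "$2"
}

# Usage: print only the HTTP status, giving up after 10 seconds.
#   curl -sS -m 10 -o /dev/null -w '%{http_code}\n' \
#     "$(hkp_lookup_url hkp://keyserver.ubuntu.com:80 BC528686B50D79E339D3721CEB3E94ADBE1229CF)"
```

A timeout or an error response straight after deployment, followed by a successful response later, would be consistent with the rules not yet being in place when cloud-init first runs.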

jjgriff93 (Collaborator) commented

Thanks for investigating, @craddm - I think that's a very likely theory in that case. Let me try to replicate on my end to confirm.

jjgriff93 changed the title from "Linux VMs unreachable through Guacamole" to "Nexus fails to deploy with cloud-init failure (Failed to import key from key server)" Oct 31, 2022
jjgriff93 (Collaborator) commented

Morning @craddm - just an update on this: I was able to replicate and think I have a fix. You're spot on that the firewall rules aren't being applied before the installation starts, which is a recent regression. I'll update you when a fix is PR'd.
