All Joyent machines are offline #3104

Closed
targos opened this issue Dec 4, 2022 · 14 comments

Comments

@targos
Member

targos commented Dec 4, 2022

(two images attached)

@targos targos added the incident label Dec 4, 2022
@richardlau
Member

@bahamat Any ideas? I'm unable to ssh into any of them.

@richardlau
Member

Backup, grafana and unencrypted infra servers are also affected and offline.

@bahamat

bahamat commented Dec 4, 2022

These hosts are provided as a courtesy by Equinix Metal now, and I don't have access to the project. But I'll try to get in touch with someone at Equinix to see what I can figure out.

@bahamat

bahamat commented Dec 4, 2022

@mhdawson AFAIK, you're the OpenJS POC on these servers in Equinix Metal. Do you know anything about this? Would you be able to give me access so I can help manage the project?

@sxa555
Contributor

sxa555 commented Dec 4, 2022 via email

@richardlau
Member

@sxa555 Those were the Jenkins workspace machines. The Joyent labelled ones were not in any of the Equinix accounts that the wider Build WG had access to (I wasn't even aware they were being hosted in Equinix Metal). Equinix did close their older data facilities at the end of November (#3028) which might mean these are gone 😱.

@bahamat

bahamat commented Dec 4, 2022

Joyent hasn't hosted any infra in over a year. When Joyent's infra shut down I migrated those services into Equinix Metal. Some time after that the machines were moved into an account/org managed by Equinix themselves. This was done for billing purposes (because Equinix is/was providing them free of charge). When the machines were moved to the new account, I wasn't granted access to it.

According to some emails I have, @mhdawson was designated the POC by Equinix. That's the last thing I've ever heard on it.

@sxa
Member

sxa commented Dec 5, 2022

I've sent an email to @vielmetti, copying @mhdawson, @richardlau and @bahamat, to find out what the deal is with the hosting of those machines. It seems they weren't part of the planned migration out of the other Equinix project, which Richard has been moving systems from as part of #3028.

Linking this to #2552, which is the issue covering the original Joyent -> Equinix migration in February 2021.

@vielmetti

Thanks all.

Two systems had their network ports taken offline. That has been corrected, and they are back online. It may take intervention on your part to bring all of the services back, but you should have full access now.

Once the systems are back up we can tackle the migration part of this.

@sxa
Member

sxa commented Dec 5, 2022

> Two systems had their network ports taken offline. That has been corrected, and they are back online. It may take intervention on your part to bring all of the services back, but you should have full access now.

Thanks @vielmetti for being so responsive and getting them back to us! Very much appreciated.

> Once the systems are back up we can tackle the migration part of this.

👍🏻 We'll need to find someone who has access to the old account, or just start work on creating them in the one that @richardlau and I have access to, I guess.

@vielmetti

vielmetti commented Dec 5, 2022

The old account lists Sean Johnson (sean at joyent) as its only contact. What's confusing to me is that both the logs and the project show a relation to "MNX Systems". I know there was a transition at some point, but I can't explain in a few words how it is supposed to be (and clearly the situation now is not correct).

@bahamat

bahamat commented Dec 5, 2022

I've been in touch with Sean today and we've now got me set up as an owner on the NodeCore organization. We'll be taking over the administrative management on the Equinix side. I think it would also be good to add one or more members of the build team to get notifications related to server hardware that may need action on the build side.

Anybody who is interested can reach out to me on the OpenJS slack and we can work out getting you set up.

@vielmetti

Thanks @bahamat. The two systems are "s1.xlarge", which are going away. When we ran through this exercise the last time that was the best instance type to pick, but now I suspect that the "s3.large" would be best (if you need lots and lots of storage).

There is a server type table at https://deploy.equinix.com/product/servers/ to give you some additional choices.

@sxa
Member

sxa commented Dec 7, 2022

I'm going to close this since the initial "offline" problem has now been resolved, and we will track the activities to migrate to a more suitable data center under #3108.

@sxa sxa closed this as completed Dec 7, 2022

6 participants