KernelCI API/Pipeline production planning #381

Open
1 of 6 tasks
nuclearcat opened this issue Oct 12, 2023 · 0 comments

nuclearcat commented Oct 12, 2023

Production deployment planning

We need to plan the deployment of the production instance of KernelCI API/Pipeline.
This includes evaluating and comparing the options for the cloud provider, Kubernetes cluster, database, storage and ingress controller, and then selecting one of each.
We might have a follow-up issue for the deployment technical details after we have selected the options.

Kubernetes cluster options

  1. Azure Kubernetes Service (AKS)

  2. Google Kubernetes Engine (GKE)
    Blockers: We only have a fixed-configuration cluster for building kernels on spot instances, which is not suitable for infrastructure deployment

  3. Self-hosted Kubernetes
    Blockers: We are unlikely to have enough sysadmin resources to maintain a self-hosted Kubernetes cluster on bare metal

Database options

  1. Atlas MongoDB
  2. Self-hosted MongoDB

Concerns:

  • Can we afford Atlas?
  • Backup procedures and snapshotting for rollback differ between Atlas and self-hosted MongoDB

Action items:

  • Estimate the cost of Atlas by mocking up a database of the size required for approx. X days of data
  • Define a proper backup strategy for each option (3-2-1 rule)
  • Investigate migration from Atlas to self-hosted MongoDB and vice versa
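
As a starting point for the first two action items, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function and class names are hypothetical, the document rate, average document size and overhead factor are placeholders to be replaced with measured values, and the retention period is left as a parameter since X is not decided yet. The 3-2-1 check uses the usual reading of the rule: at least three copies of the data, on at least two different media, with at least one copy off-site.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


def estimate_storage_gib(docs_per_day: int, avg_doc_bytes: int, days: int,
                         overhead: float = 1.3) -> float:
    """Rough data size in GiB for `days` of retained data.

    `overhead` is a guessed multiplier for indexes and padding (assumption);
    measure it on a mocked database before trusting the result.
    """
    raw_bytes = docs_per_day * avg_doc_bytes * days
    return raw_bytes * overhead / GIB


@dataclass(frozen=True)
class BackupCopy:
    location: str   # free-form label, e.g. "atlas-snapshot" (hypothetical)
    media: str      # kind of storage the copy lives on
    offsite: bool   # True if stored outside the primary site/provider


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )
```

The size estimate can then be compared against the published Atlas tier limits and against the disk cost of a self-hosted deployment; the backup check is only a sanity test of a plan, not a replacement for restore drills.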

Storage options

  1. Azure Files Storage
    Concerns: Not compatible with caching services (LAVA caching over nginx)

  2. S3-compatible storage

Solutions:

  • Minio in VM/Docker/Kubernetes backed by persistent storage
  • S3-compatibility layer on Google Storage
  • s3proxy might provide a solution for Azure

Action items:

  • Investigate how s3proxy with Azure Files behaves with LAVA caching

Concerns:

  • s3proxy is Java 🤮

Ingress controller options

  1. Nginx
  2. Traefik
  3. Kong
  4. HAProxy

Action items:

  • Investigate deployment of SSL certificates for multiple domains/endpoints
  • Investigate how we can retrieve real user IPs (HTTP headers?)
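
For the real user IP question, one common approach (an assumption here, not something already decided) is to have the ingress controller set the `X-Forwarded-For` header and to trust it only when the request comes from a known proxy. The sketch below is in Python; the function name `client_ip` and the trusted network ranges are hypothetical. It walks the header from right to left and returns the first address that is not one of our own proxies, which guards against clients prepending spoofed entries.

```python
import ipaddress
from typing import Optional


def client_ip(remote_addr: str, x_forwarded_for: Optional[str],
              trusted_proxies: list[str]) -> str:
    """Best-effort real client IP behind one or more trusted proxies."""
    nets = [ipaddress.ip_network(n) for n in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        ip = ipaddress.ip_address(addr)  # raises ValueError if malformed
        return any(ip in net for net in nets)

    # If the direct peer is not one of our proxies, ignore the header entirely.
    if not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in (x_forwarded_for or "").split(",") if h.strip()]
    # The rightmost entries were appended by the proxies closest to us.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop trusting the rest of the chain
    return remote_addr
```

Whichever ingress controller is chosen, it would need to be configured to pass or set this header consistently, and the list of trusted ranges would have to match the cluster's actual proxy and load balancer addresses.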
Projects
Status: In progress