AWS Batch - Infrastructure #814

Open
avrohomgottlieb opened this issue Jul 26, 2024 · 0 comments

Context

We are moving our load_data workflow onto AWS Batch so that file computation and archival can run in parallel and finish faster. This epic begins that transition.

Next Steps

The first step is to flesh out the Batch infrastructure, broken down as follows:

  • Develop a batch.tf Terraform file that declares the Batch compute resources
    • Open question: which type of compute instances do we want to use?
  • Work through the container-based complexities of porting scpca_portal to the intended instances
    • Uploading containers to ECR, choosing between Fargate and EC2 with ECS, etc.
  • Make sure that batch.tf is referenced in our startup scripts when we spin up our architecture in the local/staging/prod environments
  • Identify the other infrastructure dependencies that relate to Batch
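
As a starting point for discussion, here is a rough sketch of what a minimal batch.tf could look like. It is not a settled design: it assumes the Fargate option purely for brevity (the compute type is still an open question), it assumes AWS provider 5.x argument names, and the resource label, variable names, and vCPU cap are hypothetical placeholders rather than anything that exists in the repo today.

```hcl
# Sketch only. Assumes AWS provider 5.x; argument names may differ in other major versions.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Hypothetical inputs: the real subnets and security groups would come from
# whatever networking resources our existing Terraform already defines.
variable "batch_subnet_ids" {
  type        = list(string)
  description = "Subnets that Batch jobs are allowed to run in (placeholder)."
}

variable "batch_security_group_ids" {
  type        = list(string)
  description = "Security groups attached to Batch jobs (placeholder)."
}

# Managed compute environment. "FARGATE" is an assumption, not a decision.
# Switching type to "EC2" would also require arguments such as instance_type
# and instance_role inside compute_resources.
resource "aws_batch_compute_environment" "scpca_portal" {
  compute_environment_name = "scpca-portal-batch" # placeholder name
  type                     = "MANAGED"

  compute_resources {
    type               = "FARGATE"
    max_vcpus          = 16 # placeholder cap, to be sized once we know the workload
    subnets            = var.batch_subnet_ids
    security_group_ids = var.batch_security_group_ids
  }
}
```

A job queue and a job definition pointing at the scpca_portal image in ECR would still need to be added on top of this before load_data could submit anything.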

A later epic will handle the actual implementation of Batch inside the load_data command.
