Build a pipeline for an open COVID-19 dataset. The dataset comes from corona-virus-report. It contains cumulative confirmed, recovered, and death figures by day, country, and province, with latitude and longitude coordinates added at the country level. The pipeline performs ETL on the raw data and produces a visualization of the worldwide spread.
The pipeline works as follows:
- Save the raw (messy) dataset to the covid-19-raw-data S3 bucket.
- Run the AWS Glue Crawler on covid-19-raw-data S3 bucket to parse JSONs and create the covid-19-raw-data table in the Glue Data Catalog.
- Run the Glue ETL Job on covid-19-raw-data table to:
- clean the data
- save the cleaned JSON output to the covid-19-output-data S3 bucket.
- Run the AWS Glue Crawler on covid-19-output-data S3 bucket to parse JSONs and create the covid-19-output-data table in the Glue Data Catalog.
- Query the covid-19-output-data table in Amazon Athena, remove duplicates, and create the final covid19_app_data_athena table in the Glue Data Catalog.
- Connect Apache Superset to the covid19_app_data_athena table and build the visualization dashboard.
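The steps above can be sketched end to end with boto3. The bucket and table names come from the list; the helper names, the `covid19` database, the result-staging prefix, and the cleaning rules are assumptions, not taken from the source:

```python
"""Sketch of the pipeline steps with boto3 (AWS calls need credentials)."""
import os

RAW_BUCKET = "covid-19-raw-data"
OUTPUT_BUCKET = "covid-19-output-data"

def raw_key(local_path, prefix="corona-virus-report"):
    """S3 key under which one raw dataset file is stored."""
    return f"{prefix}/{os.path.basename(local_path)}"

def upload_raw(local_path):
    """Step 1: save the raw dataset file to the raw-data bucket."""
    import boto3  # deferred import: only needed when actually uploading
    boto3.client("s3").upload_file(local_path, RAW_BUCKET, raw_key(local_path))

def start_crawler(name):
    """Steps 2 and 4: run a Glue crawler so the bucket's JSON schema
    lands in the Glue Data Catalog."""
    import boto3
    boto3.client("glue").start_crawler(Name=name)

def clean_record(rec):
    """Step 3 sketch: normalize one raw record. Field names follow the
    corona-virus-report dataset; the exact rules are assumptions."""
    return {
        "province": (rec.get("Province/State") or "").strip(),
        "country": rec["Country/Region"].strip(),
        "date": rec["Date"],
        "lat": float(rec["Lat"]),
        "long": float(rec["Long"]),
        "confirmed": int(rec.get("Confirmed") or 0),
        "recovered": int(rec.get("Recovered") or 0),
        "deaths": int(rec.get("Deaths") or 0),
    }

def dedupe_ctas_sql(source_table="covid-19-output-data",
                    target_table="covid19_app_data_athena"):
    """Step 5: Athena CTAS that writes the de-duplicated final table.
    The hyphenated source name must be double-quoted in Athena SQL."""
    return (f'CREATE TABLE {target_table} AS '
            f'SELECT DISTINCT * FROM "{source_table}"')

def run_athena_query(sql, database="covid19",
                     results=f"s3://{OUTPUT_BUCKET}/athena-results/"):
    """Submit a query to Athena and return the query execution id."""
    import boto3
    resp = boto3.client("athena").start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": results},
    )
    return resp["QueryExecutionId"]
```

A run would then be: `upload_raw(...)`, `start_crawler("covid-19-raw-data")`, the Glue ETL job, `start_crawler("covid-19-output-data")`, and finally `run_athena_query(dedupe_ctas_sql())`.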
- Install and Configure Superset
- PyAthena / PyAthenaJDBC update for China
- Covid_19_ETL step-by-step guide
- The public data lake for analysis of COVID-19 data
- An example AWS public data lake for analyzing COVID-19 (in Chinese)
- Apache Superset LDAP authentication with Active Directory
- covid-19-end-to-end-analytics-with-aws-glue-athena-and-quicksight: a public data lake for analysis of COVID-19 data
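For the Superset connection step, Superset reaches Athena through PyAthena's `awsathena+rest` SQLAlchemy dialect. A minimal sketch of the database URI, where the region, schema, and staging bucket are placeholders and credentials are assumed to come from the environment:

```python
from urllib.parse import quote_plus

def athena_sqlalchemy_uri(region, schema, staging_dir):
    """Build the SQLAlchemy URI Superset uses for Athena via PyAthena.
    The s3_staging_dir value must be URL-encoded."""
    return (
        f"awsathena+rest://@athena.{region}.amazonaws.com:443/"
        f"{schema}?s3_staging_dir={quote_plus(staging_dir)}"
    )

# Placeholder values; paste the result into Superset's database form:
# athena_sqlalchemy_uri("us-east-1", "covid19",
#                       "s3://covid-19-output-data/athena-results/")
```

With that URI registered, the covid19_app_data_athena table becomes queryable as a Superset dataset for the dashboard.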