Course Outline

1. MapReduce Paradigm (up to 25%)

  • Jimmy Lin's book
  • Solve Big Data problems using MapReduce functions:
    • map()
    • combine()
    • reduce()
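
As a rough illustration of how the three functions fit together, here is a minimal word-count sketch in plain Python. It simulates the paradigm in a single process and is not taken from the course materials; the names `map_fn`, `combine_fn`, `reduce_fn`, and `run_job` are hypothetical, and each inner list in `splits` stands in for the input of one mapper.

```python
from collections import defaultdict


def map_fn(line):
    """map(): emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield (word, 1)


def combine_fn(pairs):
    """combine(): pre-aggregate one mapper's output before the shuffle."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def reduce_fn(word, counts):
    """reduce(): sum all partial counts that share the same key."""
    return (word, sum(counts))


def run_job(splits):
    """Simulate one MapReduce job over a list of input splits."""
    combined = []
    for split in splits:
        # Each split is processed by its own (simulated) mapper.
        mapped = [pair for line in split for pair in map_fn(line)]
        combined.extend(combine_fn(mapped))

    # Shuffle step: group the combined values by key.
    grouped = defaultdict(list)
    for word, count in combined:
        grouped[word].append(count)

    return dict(reduce_fn(word, counts) for word, counts in sorted(grouped.items()))


splits = [["big data big"], ["data big"]]
result = run_job(splits)
print(result)
```

The combiner is optional for correctness here: because addition is associative and commutative, dropping `combine_fn` gives the same totals and only increases the number of pairs that pass through the shuffle step.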

2. PySpark and Spark (up to 65%)

  • Solve Big Data problems using Spark/PySpark
  • Mahmoud Parsian's book: Data Algorithms with Spark

3. Data Partitioning and SQL Queries (up to 10%)

  • Amazon Athena
  • Google BigQuery
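
To make the idea of data partitioning concrete, the sketch below imitates a Hive-style `key=value` directory layout (a layout Athena can read) in plain Python and shows how a filter on a partition key lets a query skip whole partitions. It does not call Athena or BigQuery; the helper names `partition_path`, `partition_records`, and `scan`, and the sample records, are hypothetical.

```python
from collections import defaultdict


def partition_path(record, keys):
    """Build a Hive-style path such as 'year=2024/country=US'."""
    return "/".join(f"{key}={record[key]}" for key in keys)


def partition_records(records, keys):
    """Group records by the partition path derived from the given keys."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_path(record, keys)].append(record)
    return dict(partitions)


def scan(partitions, **filters):
    """Return rows only from partitions whose path matches every filter."""
    rows = []
    for path, part_rows in partitions.items():
        segments = dict(segment.split("=", 1) for segment in path.split("/"))
        if all(segments.get(key) == str(value) for key, value in filters.items()):
            rows.extend(part_rows)  # non-matching partitions are never read
    return rows


records = [
    {"year": 2023, "country": "US", "sales": 10},
    {"year": 2024, "country": "US", "sales": 20},
    {"year": 2024, "country": "CA", "sales": 5},
]
partitions = partition_records(records, ["year", "country"])
total_2024 = sum(row["sales"] for row in scan(partitions, year=2024))
print(sorted(partitions))
print(total_2024)
```

In SQL terms, `scan(partitions, year=2024)` plays the role of a `WHERE year = 2024` clause on a table partitioned by `year`: the less data a query has to scan, the faster and cheaper it tends to be.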