In this course, data engineers optimize and automate Extract, Transform, Load (ETL) workloads using stream processing, job recovery strategies, and automation techniques such as REST API integration. By the end of this course, you will be able to schedule highly optimized, robust ETL jobs and debug problems along the way.
- Perform an ETL job on a streaming data source
- Parameterize a code base and manage task dependencies
- Submit and monitor jobs using the REST API or Command Line Interface
- Design and implement a job failure recovery strategy using the principle of idempotence
- Optimize ETL queries using compression and caching best practices, together with appropriate hardware choices
- Prerequisite: ETL Part 1 self-paced course (optional but strongly encouraged)
- Prerequisite: ETL Part 2 self-paced course (optional but strongly encouraged)
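
As a rough illustration of the idempotence principle named in the objectives above, the following minimal sketch shows one way a batch job could be made safe to re-run after a failure. It is not taken from the course material: the function names `run_batch` and `load_all`, the `batch=<id>.json` file layout, and the doubling transformation are all hypothetical, and only the Python standard library is used. The idea is that each batch writes to a location derived from its batch ID and overwrites it atomically, so a retry replaces the earlier output instead of appending duplicates.

```python
import json
import os
import tempfile


def run_batch(batch_id, records, out_dir):
    """Transform one batch and write it to a path derived from batch_id.

    Hypothetical example: re-running the same batch overwrites the same
    file, so a retry after a failure never duplicates rows.
    """
    # Placeholder transformation (an assumption for illustration only).
    transformed = [{"id": r["id"], "value": r["value"] * 2} for r in records]

    final_path = os.path.join(out_dir, f"batch={batch_id}.json")

    # Write to a temporary file first, then atomically swap it into place,
    # so a crash mid-write cannot leave a half-written output file.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(transformed, f)
    os.replace(tmp_path, final_path)
    return final_path


def load_all(out_dir):
    """Read every completed batch file in out_dir, in sorted file order."""
    rows = []
    for name in sorted(os.listdir(out_dir)):
        if name.endswith(".json"):
            with open(os.path.join(out_dir, name)) as f:
                rows.extend(json.load(f))
    return rows


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    records = [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

    run_batch("2024-01-01", records, out_dir)
    run_batch("2024-01-01", records, out_dir)  # simulated retry

    # The retry replaced the first output instead of adding to it.
    print(len(load_all(out_dir)))
```

An append-based writer would have produced four rows after the retry; keying the output on the batch ID is what makes the re-run harmless.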