In this course, data engineers optimize and automate Extract, Transform, Load (ETL) workloads using stream processing, job recovery strategies, and automation strategies such as REST API integration. By the end of the course, you will be able to schedule highly optimized, robust ETL jobs and debug problems along the way.
Learning objectives:
Perform an ETL job on a streaming data source
Parameterize a code base and manage task dependencies
Submit and monitor jobs using the REST API or Command Line Interface
Design and implement a job failure recovery strategy using the principle of idempotence
Optimize ETL queries using compression and caching best practices with optimal hardware choices
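To illustrate the idempotence objective, here is a minimal sketch in plain Python. It is not taken from the course material. The function names `load_batch` and `read_all`, the `batch=<id>.json` file layout, and the use of local JSON files are all assumptions made for illustration. The idea is that each batch writes to a location derived from its batch ID and replaces any earlier attempt, so rerunning a failed job does not duplicate data.

```python
import json
import os
import tempfile


def load_batch(output_dir, batch_id, records):
    """Write one batch to a file named after its batch ID (hypothetical layout).

    Rerunning with the same batch_id replaces the earlier output instead of
    appending to it, so a retry after a failure leaves the same end state.
    """
    os.makedirs(output_dir, exist_ok=True)
    final_path = os.path.join(output_dir, f"batch={batch_id}.json")

    # Write to a temporary file first, then swap it into place, so a crash
    # mid-write never leaves a partial file at the final path.
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(records, f)
    os.replace(tmp_path, final_path)
    return final_path


def read_all(output_dir):
    """Read every completed batch file back, in file-name order."""
    rows = []
    for name in sorted(os.listdir(output_dir)):
        if name.endswith(".json"):
            with open(os.path.join(output_dir, name)) as f:
                rows.extend(json.load(f))
    return rows
```

Calling `load_batch` twice with the same `batch_id` and records leaves a single file with a single copy of the data. An append-based load would have written the records twice.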
Prerequisites:
ETL Part 1 self-paced course (optional, but strongly encouraged)
ETL Part 2 self-paced course (optional, but strongly encouraged)
Course Overview and Setup
Supported platforms include Azure Databricks and AWS Databricks. Note: This course will not run on Databricks Community Edition.
If you're planning to use the course on Azure Databricks, select the "Azure Databricks" Platform option.
If you're planning to use the course on Databricks Community Edition or on a non-Azure version of Databricks, select the "Other Databricks" Platform option.
The course is a series of six self-paced lessons available in both Scala and Python. A final capstone project involves refactoring a batch ETL job to a streaming pipeline. In the process, students run the workload as a job and monitor it. Each lesson includes hands-on exercises.