When does class start/end?
Classes begin promptly at 9:00 am, and typically end at 5:00 pm.
This intensive, hands-on Data Engineering training course teaches students how to apply Python to the practical aspects of data engineering and introduces the popular Python libraries used in the field, including NumPy, pandas, Matplotlib, scikit-learn, and Apache Spark.
Developers, Software Engineers, Data Scientists, and IT Architects.
Participants must have practical experience coding in one or more modern programming languages. Knowledge of Python is desirable but not required. Students are expected to learn new material quickly, reinforce it through programming exercises (labs), and then apply it in data engineering mini-projects.
Chapter 1. Defining Data Engineering
Chapter 2. Distributed Computing Concepts for Data Engineers
Chapter 3. Data Processing Phases
Chapter 4. Quick Introduction to Python for Data Engineers
Chapter 5. Practical Introduction to NumPy
Chapter 6. Practical Introduction to pandas
Chapter 7. Descriptive Statistics Computing Features in Python
Chapter 8. Data Grouping and Aggregation with pandas
Chapter 9. Repairing and Normalizing Data
Chapter 10. Data Visualization in Python using matplotlib
Chapter 11. Parallel Data Processing with PySpark
Chapter 12. Python as a Cloud Scripting Language
Lab Exercises