When does class start/end?
Classes begin promptly at 9:00 am, and typically end at 5:00 pm.
This four-day hands-on training course delivers the key concepts and expertise developers need to use Apache Spark to develop high-performance parallel applications. Participants will learn how to use Spark SQL to query structured data and Spark Streaming to perform real-time processing on streaming data from a variety of sources. Developers will also practice writing applications that use core Spark to perform ETL processing and iterative algorithms. The course covers how to work with “big data” stored in a distributed file system and how to execute Spark applications on a Hadoop cluster. After taking this course, participants will be prepared to face real-world challenges and build applications that support faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.
This course is designed for developers and engineers who have programming experience, but prior knowledge of Hadoop and/or Spark is not required.
Apache Spark examples and hands-on exercises are presented in Scala and Python, and the ability to program in one of those languages is required. Basic familiarity with the Linux command line is assumed, and basic knowledge of SQL is helpful.
1. Introduction
2. Introduction to Apache Hadoop and the Hadoop Ecosystem
3. Apache Hadoop File Storage
4. Distributed Processing on an Apache Hadoop Cluster
5. Apache Spark Basics
6. Working with DataFrames and Schemas
7. Analyzing Data with DataFrame Queries
8. RDD Overview
9. Transforming Data with RDDs
10. Aggregating Data with Pair RDDs
11. Querying Tables and Views with SQL
12. Working with Datasets in Scala
13. Writing, Configuring, and Running Spark Applications
14. Spark Distributed Processing
15. Distributed Data Persistence
16. Common Patterns in Spark Data Processing
17. Introduction to Structured Streaming
18. Structured Streaming with Apache Kafka
19. Aggregating and Joining Streaming DataFrames
20. Conclusion
A. Message Processing with Apache Kafka
Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate and juice are available all day in the kitchen. Fruit, muffins and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbook, which can be picked up in the lobby or requested from a member of the ExitCertified staff.
If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.
Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.
GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.
We have training locations across the United States and Canada. View a full list of classroom training locations.
At ExitCertified we offer training that is Instructor-Led, Online, Virtual and Self-Paced.
Yes, we provide training for groups and individuals, as well as private on-site training. View our group training page for more information.
Eric was a superb instructor who effortlessly explained complex topics in a simple and fun way.
The instructor is one of the best I believe.
He has a great sense of humor, and he can always explain hard questions in an easy way.
I hope I can learn other courses from him.
Thank you.
I came to learn and get a high-level understanding of Hadoop and Spark. I feel confident about all Spark can do and can apply this to my work.
It was a very pleasant experience going through this training with Charles Hardin. It is truly amazing to see Charles's passion for teaching and technology.
The combination of Eric's lecture and the class materials enabled me to learn and practice the concepts covered in this training course.