Apache Spark Overview

  • Tuition USD $1,000
  • Reviews 4.5 out of 5 stars (347 Ratings)
  • Course Code DB100
  • Duration 1 day
  • Available Formats Classroom, Virtual

This 1-day course is for data engineers, analysts, architects, data scientists, software engineers, IT operations staff, and technical managers interested in a brief hands-on overview of Apache Spark.

The course provides an introduction to the Spark architecture, some of the core APIs for using Spark, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs. The class is a mixture of lecture and hands-on labs.

Each topic includes lecture content along with hands-on labs in the Databricks notebook environment. Students may keep the notebooks and continue to use them with the free Databricks Community Edition offering after the class ends; all examples are guaranteed to run in that environment.

Skills Gained

After taking this class, students will be able to:

  • Use a subset of the core Spark APIs to operate on data
  • Articulate and implement simple use cases for Spark
  • Build data pipelines and query large data sets using Spark SQL and DataFrames
  • Create Structured Streaming jobs
  • Understand how a Machine Learning pipeline works
  • Understand the basics of Spark’s internals

Who Can Benefit

Data engineers, analysts, architects, data scientists, software engineers, and technical managers who want a quick introduction to using Apache Spark to streamline their big data processing, build production Spark jobs, and understand and debug running Spark applications.

Prerequisites

Some familiarity with Apache Spark is helpful but not required. Knowledge of SQL is helpful. Basic programming experience in an object-oriented or functional language is highly recommended but not required. The class can be taught concurrently in Python and Scala.

Course Details

Topics

Spark Overview

Introduction to Spark SQL and DataFrames, including:

  • Reading & Writing Data
  • The DataFrames/Datasets API
  • Spark SQL
  • Caching and caching storage levels

Overview of Spark internals

  • Cluster Architecture
  • How Spark schedules and executes jobs and tasks
  • Shuffling, shuffle files, and performance
  • The Catalyst query optimizer

Spark Structured Streaming

  • Sources and sinks
  • Structured Streaming APIs
  • Windowing & Aggregation
  • Checkpointing & Watermarking
  • Reliability and Fault Tolerance

Overview of Spark’s MLlib Pipeline API for Machine Learning

  • Transformer/Estimator/Pipeline API
  • Perform feature preprocessing
  • Evaluate and apply ML models

How do I enroll?

A comprehensive listing of ExitCertified courses can be found here. You can register directly for the required course/location when you select "register". If you have any questions or prefer to speak with an ExitCertified education consultant directly, please submit your query here. A representative will contact you shortly.

How do I pay for a class?

You can pay at the time of registration by credit card (Mastercard, Visa, or American Express), cheque, or purchase order (PO).

What if I have training credits?

ExitCertified honors all savings programs from the partners we work with. ExitCertified also offers training credits across multiple partners through our FLEX Account.

When does class start/end?

Classes begin promptly at 9:00 am, and typically end at 5:00 pm.

Lunchtime?

Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate, and juice are available all day in the kitchen. Fruit, muffins, and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbook, which can be picked up in the lobby or requested from ExitCertified staff.

How can someone reach me during class?

If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.

What languages are used to deliver training?

Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.

Instructor was knowledgeable and made the course fun. He knew how to convey the new material to students in a simple and understandable manner. Used terminology and analogies understood by all students in the virtual classroom.

Simply a great training provider that I can go to for updating and acquiring my skill sets.

Very prompt when an issue occurred. I accidentally contacted an individual who was on vacation, but they quickly pointed me to another individual who got my issue resolved.

The format and presentation using the iMVP software worked as well.

I had a good and comfortable remote training experience. Look forward to more such trainings.

I found the IBM HACMP course to be very good. The instructor was top notch!

26 options found

  • GTR Jul 22, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM PDT
  • GTR Jul 22, 2020 (1 day)
    Location: MVP Studio M, NJ
    Language: English
    Time: 12:00PM to 8:00PM EDT
  • GTR Jul 27, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • GTR Aug 26, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Sep 14, 2020 (1 day)
    Location: San Francisco, CA
    Language: English
    Time: 9:00AM to 5:00PM PDT
  • Sep 14, 2020 (1 day)
    Location: MVP Sacramento, CA
    Language: English
    Time: 9:00AM to 5:00PM PDT
  • Sep 14, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM PDT
  • Sep 30, 2020 (1 day)
    Location: MVP King of Prussia, PA
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Sep 30, 2020 (1 day)
    Location: McLean, VA
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Sep 30, 2020 (1 day)
    Location: MVP Edison, NJ
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Sep 30, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Oct 26, 2020 (1 day)
    Location: MVP King of Prussia, PA
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Oct 26, 2020 (1 day)
    Location: McLean, VA
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Oct 26, 2020 (1 day)
    Location: MVP Edison, NJ
    Language: English
    Time: 9:00AM to 5:00PM EDT
  • Oct 26, 2020 (1 day)
    Location: iMVP
    Language: English
    Time: 9:00AM to 5:00PM EDT
Contact Us 1-800-803-3948
FAQ Get immediate answers to our most frequently asked questions.