Introduction to Python and PySpark

7878 Reviews (4.5 out of 5 stars)

This Python and PySpark training course teaches learners the fundamentals of Python, including data types, variables, functions, and classes. Students also learn how to use Python to create powerful...

By Request

$2,090 USD

Course Code WA2914

Duration 3 days

Available Formats Classroom

Skills Gained

Code in Python
Create Python Scripts
Create and use variables in Python
Work with Python Collections
Write and use control statements and loops in Python
Define and use functions in Python
Read and write text files in Python
Learn about functional programming in Python
Use the Databricks Community Cloud Lab Environment
Use pandas and seaborn for data visualization and EDA
Use the PySpark Shell Environment
Understand Spark DataFrames
Learn the PySpark DataFrame API
Repair and normalize data in PySpark
Use Spark SQL with PySpark

Who Can Benefit

Developers and/or Data Analysts

Prerequisites

Programming and/or scripting experience in a modern programming language.

Course Details

Outline

Chapter 1 - Introduction to Python

What is Python
Uses of Python
Installing Python
Python Package Manager (PIP)
Using the Python Shell
Python Code Conventions
Importing Modules
The help(object) Command
The Help Prompt
Summary

Chapter 2 - Python Scripts

Executing Python Code
Python Scripts
Writing Scripts
Running Python Scripts
Self Executing Scripts
Accepting Command-Line Parameters
Accepting Interactive Input
Retrieving Environment Settings
Summary

Chapter 3 - Data Types and Variables

Creating Variables
Displaying Variables
Basic Concatenation
Data Types
Strings
Strings as Arrays
String Methods
Combining Strings and Numbers
Numeric Types
Integer Types
Floating Point Types
Boolean Types
Checking Data Type
Summary

Chapter 4 - Python Collections

Python Collections
List Type
Modifying Lists
Sorting a List
Tuple Type
Python Sets
Modifying Sets
Dictionary (Map) Type
Dictionary Methods
Sequences
Summary

Chapter 5 - Control Statements and Looping

If Statement
elif Keyword
Boolean Conditions
Single Line If Statements
For-in Loops
Looping over an Index
Range Function
Nested Loops
While Loops
Exception Handling
Built-in Exceptions
Exceptions thrown by Built-In Functions
Summary
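
As a rough illustration of the topics listed in this chapter (not taken from the course materials), the sketch below combines an if/elif chain, a for-in loop over range(), and exception handling around the built-in int(). The function name classify and the sample inputs are hypothetical.

```python
def classify(value):
    """Return a label for a string that is expected to hold an integer."""
    try:
        number = int(value)  # the built-in int() raises ValueError on bad input
    except ValueError:
        return "not a number"

    if number < 0:
        return "negative"
    elif number == 0:
        return "zero"
    else:
        return "positive"


labels = []
for i in range(-1, 2):  # range(-1, 2) yields -1, 0, 1
    labels.append(classify(str(i)))

labels.append(classify("abc"))  # handled by the except branch
print(labels)
```

Catching only ValueError, rather than using a bare except, keeps unrelated errors visible.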

Chapter 6 - Functions in Python

Defining Functions
Naming Functions
Using Functions
Function Parameters
Named Parameters
Variable Length Parameter List
How Parameters are Passed
Variable Scope
Returning Values
Docstrings
Best Practices
Single Responsibility
Returning a Value
Function Length
Pure and Idempotent Functions
Summary
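
The following minimal sketch touches several of the topics listed in this chapter: a docstring, a named (keyword) parameter with a default, a variable-length parameter list, and a function that returns a value without side effects. The function describe and its arguments are hypothetical and are not taken from the course materials.

```python
def describe(name, *tags, sep=", ", **extra):
    """Build a one-line description from a name, optional tags, and extras."""
    parts = [name]
    if tags:
        # *tags collects any additional positional arguments into a tuple
        parts.append("[" + sep.join(tags) + "]")
    for key in sorted(extra):
        # **extra collects any additional keyword arguments into a dict
        parts.append(f"{key}={extra[key]}")
    return " ".join(parts)


# sep is passed by name; days lands in **extra
result = describe("course", "python", "pyspark", sep="/", days=3)
print(result)
print(describe.__doc__)
```

Because describe depends only on its arguments and changes nothing outside itself, calling it twice with the same inputs gives the same result.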

Chapter 7 - Working With Data in Python

Data Type Conversions
Conversions from other Types to Integer
Conversions from other Types to Float
Conversions from other Types to String
Conversions from other Types to Boolean
Converting Between Set, List and Tuple Data Structures
Modifying Tuples
Combining Set, List and Tuple Data Structures
Creating Dictionaries from other Data Structures
Summary
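
The conversions listed in this chapter can be sketched with Python's built-in constructors. This is an illustrative example with made-up values, not an excerpt from the course.

```python
# Conversions from strings to numeric types and back
count = int("42")
price = float("19.5")
label = str(3)

# Conversions to Boolean: empty values are False, non-empty values are True
empty = bool("")
nonempty = bool("0")  # a non-empty string is True, even if it reads "0"

# Converting between list, set, and tuple
items = [3, 1, 3, 2]
unique = set(items)               # duplicates are dropped
as_tuple = tuple(sorted(unique))  # sorted() returns a list; tuple() freezes it

# Tuples are immutable, so "modifying" one means building a new tuple
extended = as_tuple + (4,)

# Creating a dictionary from two other sequences
pairs = dict(zip(["a", "b", "c"], as_tuple))
print(count, price, label, empty, nonempty, as_tuple, extended, pairs)
```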

Chapter 8 - Reading and Writing Text Files

Opening a File
Writing a File
Reading a File
Appending to a File
File Operations Using the With Statement
File and Directory Operations
Reading JSON
Writing JSON
Summary
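
A minimal sketch of the file topics listed in this chapter, using the with statement and the standard json module. The file name course.json and the record contents are illustrative assumptions. A temporary directory is used so the example leaves nothing behind.

```python
import json
import os
import tempfile

record = {"course": "WA2914", "days": 3}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "course.json")

    # Writing JSON: the with statement closes the file automatically
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle)

    # Appending to a file: mode "a" adds to the end instead of overwriting
    with open(path, "a", encoding="utf-8") as handle:
        handle.write("\n")

    # Reading JSON back into a Python dictionary
    with open(path, "r", encoding="utf-8") as handle:
        loaded = json.load(handle)

print(loaded)
```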

Chapter 9 - Functional Programming Primer

What is Functional Programming?
Benefits of Functional Programming
Functions as Data
Using Map Function
Using Filter Function
Lambda Expressions
list.sort() Using Lambda Expression
Difference Between Simple Loops and map/filter Type Functions
Additional Functions
General Rules for Creating Functions
Summary
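
To make the map, filter, and lambda topics above concrete, here is a small sketch using only built-ins. The word list is made up for illustration, and the loop at the end shows the equivalent "simple loop" form for comparison.

```python
words = ["spark", "python", "sql", "pandas"]

# map applies a function to every element; functions are passed as data
lengths = list(map(len, words))

# filter keeps the elements for which the lambda returns True
long_words = list(filter(lambda w: len(w) > 3, words))

# list.sort() with a lambda as the sort key (sorts in place, stable)
ordered = list(words)
ordered.sort(key=lambda w: len(w))

# The same result as map, written as a simple loop
loop_lengths = []
for word in words:
    loop_lengths.append(len(word))

print(lengths, long_words, ordered)
```

The map and filter versions describe what is computed, while the loop spells out how the result is built step by step.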

Chapter 10 - Introduction to Apache Spark

What is Apache Spark
A Short History of Spark
Where to Get Spark?
The Spark Platform
Spark Logo
Common Spark Use Cases
Languages Supported by Spark
Running Spark on a Cluster
The Driver Process
Spark Applications
Spark Shell
The spark-submit Tool
The spark-submit Tool Configuration
The Executor and Worker Processes
The Spark Application Architecture
Interfaces with Data Storage Systems
Limitations of Hadoop's MapReduce
Spark vs MapReduce
Spark as an Alternative to Apache Tez
The Resilient Distributed Dataset (RDD)
Datasets and DataFrames
Spark Streaming (Micro-batching)
Spark SQL
Example of Spark SQL
Spark Machine Learning Library
GraphX
Spark vs R
Summary

Chapter 11 - The Spark Shell

The Spark Shell
The Spark v2+ Command-Line Shells
The Spark Shell UI
Spark Shell Options
Getting Help
Jupyter Notebook Shell Environment
Example of a Jupyter Notebook Web UI (Databricks Cloud)
The Spark Context (sc) and Spark Session (spark)
Creating a Spark Session Object in Spark Applications
The Shell Spark Context Object (sc)
The Shell Spark Session Object (spark)
Loading Files
Saving Files
Summary

Chapter 12 - Spark RDDs

The Resilient Distributed Dataset (RDD)
Ways to Create an RDD
Supported Data Types
RDD Operations
RDDs are Immutable
Spark Actions
RDD Transformations
Other RDD Operations
Chaining RDD Operations
RDD Lineage
The Big Picture
What May Go Wrong
Checkpointing RDDs
Local Checkpointing
Parallelized Collections
More on parallelize() Method
The Pair RDD
Where do I use Pair RDDs?
Example of Creating a Pair RDD with Map
Example of Creating a Pair RDD with keyBy
Miscellaneous Pair RDD Operations
RDD Caching
RDD Persistence
Summary

Chapter 13 - Parallel Data Processing with Spark

Running Spark on a Cluster
Data Partitioning
Data Partitioning Diagram
Single Local File System RDD Partitioning
Multiple File RDD Partitioning
Special Cases for Small-sized Files
Parallel Data Processing of Partitions
Spark Application, Jobs, and Tasks
Stages and Shuffles
The "Big Picture"
Summary

Chapter 14 - Shared Variables in Spark

Shared Variables in Spark
Broadcast Variables
Creating and Using Broadcast Variables
Example of Using Broadcast Variables
Problems with Global Variables
Example of the Closure Problem
Accumulators
Creating and Using Accumulators
Example of Using Accumulators (Scala Example)
Example of Using Accumulators (Python Example)
Custom Accumulators
Summary

Chapter 15 - Introduction to Spark SQL

What is Spark SQL?
Uniform Data Access with Spark SQL
Using JDBC Sources
Hive Integration
What is a DataFrame?
Creating a DataFrame in PySpark
Commonly Used DataFrame Methods and Properties in PySpark
Grouping and Aggregation in PySpark
The "DataFrame to RDD" Bridge in PySpark
The SQLContext Object
Examples of Spark SQL / DataFrame (PySpark Example)
Converting an RDD to a DataFrame Example
Example of Reading / Writing a JSON File
Performance, Scalability, and Fault-tolerance of Spark SQL
Summary

Chapter 16 - Repairing and Normalizing Data

Repairing and Normalizing Data
Dealing with the Missing Data
Sample Data Set
Getting Info on Null Data
Dropping a Column
Interpolating Missing Data in pandas
Replacing the Missing Values with the Mean Value
Scaling (Normalizing) the Data
Data Preprocessing with scikit-learn
Scaling with the scale() Function
The MinMaxScaler Object
Summary

Chapter 17 - Data Grouping and Aggregation in Python

Data Aggregation and Grouping
Sample Data Set
The pandas.core.groupby.SeriesGroupBy Object
Grouping by Two or More Columns
Emulating SQL's WHERE Clause
The Pivot Tables
Cross-Tabulation
Summary

Lab Exercises

Lab 1. Introduction to Python
Lab 2. Creating Scripts
Lab 3. Variables in Python
Lab 4. Collections
Lab 5. Control Statements and Loops
Lab 6. Functions in Python
Lab 7. Reading and Writing Text Files
Lab 8. Functional Programming
Lab 9. Learning the Databricks Community Cloud Lab Environment
Lab 10. Data Visualization and EDA with pandas and seaborn
Lab 11. Learning PySpark Shell Environment
Lab 12. Understanding Spark DataFrames
Lab 13. Learning the PySpark DataFrame API
Lab 14. Data Repair and Normalization in PySpark
Lab 15. Spark SQL with PySpark

There are currently no scheduled dates for this course. If you are interested in this course, request a course date with the links above. We can also contact you when the course is scheduled in your area.

When does class start/end?

Classes begin promptly at 9:00 am, and typically end at 5:00 pm.

Does the course schedule include a lunch break?

Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate and juice are available all day in the kitchen. Fruit, muffins and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbooks - these can be picked up in the lobby or requested from one of our ExitCertified staff.

How can someone reach me during class?

If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.

What languages are used to deliver training?

Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.

What does GTR stand for?

GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.

How do I find an ExitCertified training location?

We have training locations across the United States and Canada. View a full list of classroom training locations.

Which delivery formats are available?

At ExitCertified we offer training that is Instructor-Led, Online, Virtual and Self-Paced.

Does ExitCertified deliver group training?

Yes, we provide training for groups and individuals, as well as private on-site training. View our group training page for more information.

What does vendor-authorized training mean?

As a vendor-authorized training partner, we offer a curriculum that our partners have vetted. We use the same course materials and facilitate the same labs as our vendor-delivered training. These courses are considered the gold standard and, as such, are priced accordingly.

Is the training too basic, or will you go deep into technology?

It depends on your requirements, your role in your company, and your depth of knowledge. The good news is that many of our learning paths let you progress from the fundamentals to highly specialized training.

How up-to-date are your courses and support materials?

We continuously work with our vendors to evaluate and refresh course material to reflect the latest training courses and best practices.

Are your instructors seasoned trainers who have deep knowledge of the training topic?

ExitCertified instructors have an average of 27 years of practical IT experience. They have also served as consultants for an average of 15 years. To stay up to date, instructors spend at least 25 percent of their time learning emerging technologies and courses.

Do you provide hands-on training and exercises in an actual lab environment?

Lab access is dependent on the vendor and the type of training you sign up for. However, many of our top vendors will provide lab access to students to test and practice. The course description will specify lab access.

Will you customize the training for our company’s specific needs and goals?

We will work with you to identify training needs and areas of growth. We offer a variety of training methods, such as private group training, on-site training at a location of your choice, and virtual training. We provide courses and certifications that are aligned with your business goals.

How do I get started with certification?

Getting started on a certification pathway depends on your goals and the vendor you choose to get certified with. Many vendors offer IT certifications, from entry level to advanced, that can boost your career. To get access to certification vouchers and discounts, please contact customerexp@exitcertified.com.

Will I get access to content after I complete a course?

You will get access to the PDF of course books and guides, but access to the recording and slides will depend on the vendor and type of training you receive.

How do I request a W9 for ExitCertified LLC?

View our filing status and how to request a W9.

This was an effective way to provide a ton of information in a short time period.

ExitCertified Student

ExitCertified

Class was very informative, although one lab didn't work; I will try again later.

ExitCertified Student

ExitCertified

This class was informative and made me think about certifying for the SUSE Manager cert.

Reuben R

ExitCertified

Overall it was a good bootcamp. A lot to cover so it is understandable that the pace had to be a little fast.

ExitCertified Student

ExitCertified

I liked the pace of the course. I like that I have more than one instance to use in the lab.

Austin O

ExitCertified
