Scalable Machine Learning with Apache Spark

  • Tuition USD $1,500
  • Reviews star_rate star_rate star_rate star_rate star_half 1131 Ratings
  • Course Code DB301-2
  • Duration 2 days
  • Available Formats Classroom

  • This course guides students through the process of building machine learning solutions using Spark. You will build and tune ML models with SparkML using transformers, estimators, and pipelines. This course highlights some of the key differences between SparkML and single-node libraries such as scikit-learn. Furthermore, you will reproduce your experiments and version your models using MLflow.
  • You will also integrate 3rd party libraries into Spark workloads, such as XGBoost. In addition, you will leverage Spark to scale inference of single-node models and parallelize hyperparameter tuning. This course includes hands-on labs and concludes with a collaborative capstone project. All of the notebooks are available in Python, and in Scala as well where available.

Skills Gained

  • Create data processing pipelines with Spark
  • Build and tune machine learning models with SparkML
  • Track, version, and deploy models with MLflow
  • Perform distributed hyperparameter tuning with Hyperopt
  • Use Spark to scale the interence of single-node models

Who Can Benefit

  • Data scientist
  • Machine learning engineer

Prerequisites

Intermediate experience with Python Beginning experience with the PySpark DataFrame API (or have taken the Apache Spark Programming with Databricks class) Working knowledge of machine learning and data science

When does class start/end?

Classes begin promptly at 9:00 am, and typically end at 5:00 pm.

Does the course schedule include a Lunchbreak?

Lunch is normally an hour long and begins at noon. Coffee, tea, hot chocolate and juice are available all day in the kitchen. Fruit, muffins and bagels are served each morning. There are numerous restaurants near each of our centers, and some popular ones are indicated on the Area Map in the Student Welcome Handbooks - these can be picked up in the lobby or requested from one of our ExitCertified staff.

How can someone reach me during class?

If someone should need to contact you while you are in class, please have them call the center telephone number and leave a message with the receptionist.

What languages are used to deliver training?

Most courses are conducted in English, unless otherwise specified. Some courses will have the word "FRENCH" marked in red beside the scheduled date(s) indicating the language of instruction.

What does GTR stand for?

GTR stands for Guaranteed to Run; if you see a course with this status, it means this event is confirmed to run. View our GTR page to see our full list of Guaranteed to Run courses.

Does ExitCertified deliver group training?

Yes, we provide training for groups, individuals and private on sites. View our group training page for more information.

Does ExitCertified deliver group training?

Yes, we provide training for groups, individuals, and private on sites. View our group training page for more information.

Very organized and provides a ton of information both in class and for self study. I highly recommend ExitCertified for AWS training.

Great learning center. My first online class experience -- went better than I anticipated.

Good course that covers all aspects of DB2. Gives very good starting point to enable anyone to work efficiently within DB2 environment.

Great training that provided pro-tips a plenty that were immediately put to use.

Good job coping with the "new" training, of having video learning in a group setting.

0 options available

There are currently no scheduled dates for this course. If you are interested in this course, request a course date with the links above. We can also contact you when the course is scheduled in your area.

Contact Us 1-800-803-3948
Contact Us Live Chat
FAQ Get immediate answers to our most frequently asked qestions. View FAQs arrow_forward