Cloudera Data Science Workbench Training prepares learners to complete data science and machine learning projects using Cloudera Data Science Workbench (CDSW).
Through narrated demonstrations and hands-on exercises, learners achieve proficiency in CDSW and develop the skills required to:
- Navigate CDSW’s options and interfaces with confidence
- Create projects in CDSW and collaborate securely with other users and teams
- Develop and run reproducible Python and R code
- Customize projects by installing packages and setting environment variables
- Connect to a secure (Kerberized) Cloudera or Hortonworks cluster
- Work with large-scale data using Apache Spark 2 with PySpark and sparklyr
- Perform end-to-end machine learning workflows in CDSW using Python or R (read, inspect, transform, visualize, and model data)
- Measure, track, and compare machine learning models using CDSW’s Experiments capability
- Deploy models as REST API endpoints serving predictions using CDSW’s Models capability
- Collaborate on projects using CDSW together with Git
Who Can Benefit
This course is designed for learners at organizations using CDSW under an enterprise or trial license. Learners must have access to a CDSW environment on a Cloudera or Hortonworks cluster running Apache Spark 2. Some experience with data science using Python or R is helpful but not required. No prior knowledge of Spark or other Hadoop ecosystem tools is required.