
Data Analytics and Machine Learning Solutions with Azure Databricks

Course Code: DP-3011-14
Duration: 2 days
Available Formats: Classroom

This course teaches learners how to use Azure Databricks for data analytics and machine learning through real-world use cases. Learners get hands-on experience with key concepts, tools, and techniques for analyzing data, building data pipelines, training ML models, and managing machine learning workflows in a cloud-native environment.

Note: This course combines the following Microsoft courses:

  • DP-3011: Implementing a Data Analytics Solution with Azure Databricks
  • DP-3014: Implementing a Machine Learning solution with Azure Databricks

Skills Gained

  • Understand core concepts and workloads in Azure Databricks
  • Explore and analyze data using DataFrames and Apache Spark
  • Learn how to manage data using Delta Lake for ACID transactions and versioning
  • Build and deploy data pipelines with Delta Live Tables and Azure Databricks Workflows
  • Train, optimize, and manage machine learning models, including deep learning models, using MLflow, Hyperopt, AutoML, and distributed training techniques
  • Evaluate and fine-tune machine learning models for optimal performance in production environments

Prerequisites

  • Experience with Python and SQL is required
  • Basic knowledge of data engineering concepts
  • Familiarity with machine learning fundamentals
  • Familiarity with at least one machine learning framework, such as scikit-learn, PyTorch, or TensorFlow

Course Details

Explore Azure Databricks

  • Get started with Azure Databricks
  • Identify Azure Databricks workloads
  • Understand key concepts of Azure Databricks
  • Explore data governance using Unity Catalog and Microsoft Purview
  • Hands-on exercise: Explore Azure Databricks features and tools

Perform data analysis with Azure Databricks

  • Ingest data into Azure Databricks from various sources
  • Use data exploration tools in Azure Databricks
  • Perform data analysis using DataFrame APIs
  • Clean and preprocess data for analysis
  • Hands-on exercise: Explore data using Azure Databricks

Use Apache Spark in Azure Databricks

  • Introduction to Apache Spark and its capabilities
  • Create and configure a Spark cluster in Azure Databricks
  • Use Spark in notebooks for data analysis
  • Load and process data files with Spark
  • Visualize data using Spark DataFrames and plots
  • Hands-on exercise: Use Apache Spark in Azure Databricks

Manage data with Delta Lake

  • Introduction to Delta Lake and its benefits
  • Manage ACID transactions in Delta Lake
  • Implement schema enforcement in Delta Lake
  • Use data versioning and time travel features in Delta Lake
  • Ensure data integrity with Delta Lake
  • Hands-on exercise: Work with Delta Lake in Azure Databricks
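
As a rough preview of the ideas in this module, the sketch below models schema enforcement, all-or-nothing commits, and time travel with a toy in-memory table. It is plain Python, not Delta Lake's API: the `VersionedTable` class and its `append` and `read` methods are hypothetical names invented for illustration only.

```python
class VersionedTable:
    """Toy in-memory table that keeps one snapshot per commit (illustration only)."""

    def __init__(self, columns):
        self.columns = set(columns)
        self._versions = [[]]  # version 0 is the empty table

    def append(self, rows):
        # Schema enforcement: reject the whole batch if any row has unexpected columns.
        for row in rows:
            if set(row) != self.columns:
                raise ValueError(f"schema mismatch: {sorted(row)}")
        # All-or-nothing commit: a new version appears only if every row passed.
        snapshot = self._versions[-1] + [dict(r) for r in rows]
        self._versions.append(snapshot)
        return len(self._versions) - 1  # the new version number

    def read(self, version=None):
        # "Time travel": read the latest snapshot, or an earlier one by version number.
        index = len(self._versions) - 1 if version is None else version
        return [dict(r) for r in self._versions[index]]


table = VersionedTable(["id", "amount"])
v1 = table.append([{"id": 1, "amount": 10.0}])
v2 = table.append([{"id": 2, "amount": 5.5}])

try:
    table.append([{"id": 3, "price": 1.0}])  # wrong column name, so nothing is committed
    rejected = False
except ValueError:
    rejected = True

latest = table.read()
as_of_v1 = table.read(version=1)
```

Here `latest` holds both rows while `as_of_v1` holds only the first, and the rejected batch leaves the table at version 2. The course works with the real Delta Lake features on Azure Databricks.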

Build data pipelines with Delta Live Tables

  • Explore Delta Live Tables and its features
  • Ingest and integrate data into Delta Live Tables pipelines
  • Process real-time data with Delta Live Tables
  • Build scalable and automated data pipelines
  • Hands-on exercise: Create a data pipeline with Delta Live Tables

Deploy workloads with Azure Databricks Workflows

  • Introduction to Azure Databricks Workflows
  • Understand the key components and benefits of Azure Databricks Workflows
  • Deploy workloads using Azure Databricks Workflows
  • Automate repetitive tasks and processes
  • Hands-on exercise: Create and deploy a workflow in Azure Databricks

Train a machine learning model in Azure Databricks

  • Understand the principles and components of machine learning
  • Machine learning in Azure Databricks and its integration with Spark
  • Prepare and preprocess data for machine learning
  • Train and evaluate a machine learning model in Azure Databricks
  • Hands-on exercise: Train a machine learning model

Use MLflow in Azure Databricks

  • Overview of MLflow and its capabilities
  • Run and manage machine learning experiments with MLflow
  • Register and serve machine learning models with MLflow
  • Track model performance and manage experiment metadata
  • Hands-on exercise: Use MLflow for model management in Azure Databricks

Tune hyperparameters in Azure Databricks

  • Introduction to hyperparameter optimization with Hyperopt
  • Optimize hyperparameters for machine learning models
  • Review Hyperopt trials and results
  • Scale Hyperopt trials for better performance
  • Hands-on exercise: Optimize hyperparameters using Hyperopt
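
To give a sense of what a hyperparameter "trial" is, the sketch below runs a plain random search in standard-library Python. It is a stand-in for the general idea and does not use Hyperopt or its search algorithms. The `objective` function, the parameter names, and their ranges are all hypothetical, and a real objective would train a model and return its validation loss.

```python
import random


def objective(params):
    # Hypothetical loss surface standing in for a model's validation loss.
    lr = params["learning_rate"]
    depth = params["max_depth"]
    return (lr - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(objective, n_trials, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    trials = []
    for _ in range(n_trials):
        # Each trial samples one candidate configuration from the search space.
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "max_depth": rng.randint(2, 12),
        }
        trials.append({"params": params, "loss": objective(params)})
    # The best trial is the one with the lowest loss.
    best = min(trials, key=lambda t: t["loss"])
    return best, trials


best, trials = random_search(objective, n_trials=50)
print(best["params"], best["loss"])
```

Reviewing `trials` afterwards shows which regions of the search space performed well. The trials are independent of one another, which is what makes it possible to run them in parallel.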

Use AutoML in Azure Databricks

  • Introduction to AutoML and its use cases
  • Use the Azure Databricks AutoML user interface for model training
  • Run AutoML experiments with code
  • Evaluate AutoML models and select the best one
  • Hands-on exercise: Use AutoML for model training in Azure Databricks

Train deep learning models in Azure Databricks

  • Understand deep learning concepts and frameworks
  • Train deep learning models using PyTorch in Azure Databricks
  • Distribute PyTorch training using TorchDistributor
  • Use GPUs for efficient deep learning model training
  • Hands-on exercise: Train deep learning models on Azure Databricks

Manage machine learning in production with Azure Databricks

  • Automate data transformations in machine learning workflows
  • Explore model development best practices
  • Learn deployment strategies for machine learning models
  • Implement model versioning and lifecycle management
  • Hands-on exercise: Manage a machine learning model in production