Machine Learning Essentials with Python is a foundation-level, three-day, hands-on course that teaches core skills and concepts in modern machine learning practice. It is geared for attendees who are experienced with Python but new to machine learning, and who need introductory-level coverage of these topics rather than a deep dive into the math and statistics behind machine learning. Students will learn basic algorithms from scratch. For each machine learning concept, students will first learn about and discuss its foundations, applicability, and limitations, and then explore its implementation and use by reviewing and working with specific use cases.
Skills Gained
- Getting Started & Optional Python Quick Refresher
- Statistics and Probability Refresher and Python Practice
- Probability Density Function; Probability Mass Function; Naive Bayes
- Predictive Models
- Machine Learning with Python
- Recommender Systems
- KNN and PCA
- Reinforcement Learning
- Dealing with Real-World Data
- Experimental Design / ML in the Real World
- Time Permitting: Deep Learning and Neural Networks
Who Can Benefit
This course is geared for attendees with solid Python skills who wish to learn and use basic machine learning algorithms and concepts.
Prerequisites
Students should have incoming skills equivalent to the following:
- Basic Python Skills. Attendees without a Python background may treat the labs as follow-along exercises or team up with others to complete them. (NOTE: This course is also offered in R or Scala – please inquire for details.)
- Good foundational mathematics skills in Linear Algebra and Probability, to start learning about and using basic machine learning algorithms and concepts
- Basic Linux skills, including familiarity with command-line commands such as ls, cd, cp, and su
Course Agenda
Getting Started
- Installation: Getting Started and Overview
- Linux jump start: Installing and Using Anaconda & Course Materials (or reference the default container)
- Python Refresher
- Introducing the pandas, NumPy, and scikit-learn Libraries
Statistics and Probability Refresher and Python Practice
- Types of Data
- Mean, Median, Mode
- Using mean, median, and mode in Python
- Variation and Standard Deviation
Probability Density Function; Probability Mass Function; Naive Bayes
- Common Data Distributions
- Percentiles and Moments
- A Crash Course in matplotlib
- Advanced Visualization with Seaborn
- Covariance and Correlation
- Conditional Probability
- Naive Bayes: Concepts
- Bayes’ Theorem
- Naive Bayes
- Spam Classifier with Naive Bayes
Predictive Models
- Linear Regression
- Polynomial Regression
- Multiple Regression, and Predicting Car Prices
- Logistic Regression
- LDA: Linear Discriminant Analysis
Machine Learning with Python
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting
- Understanding a Confusion Matrix
- Measuring Classifiers (Precision, Recall, F1, AUC, ROC)
- K-Means Clustering
- K-Means: Clustering People Based on Age and Income
- Measuring Entropy
- Linux: Installing Graphviz
- Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to Classify People with scikit-learn
Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Similar Movies
- Better Accuracy for Similar Movies
- Recommending Movies to People
- Improving Your Recommendations
KNN and PCA
- K-Nearest-Neighbors: Concepts
- Using KNN to Predict a Rating for a Movie
- Dimensionality Reduction; Principal Component Analysis (PCA)
- PCA with the Iris Data Set
Reinforcement Learning
- Reinforcement Learning with Q-Learning and Gym
Dealing with Real-World Data
- Bias / Variance Tradeoff
- K-Fold Cross-Validation
- Data Cleaning and Normalization
- Cleaning Web Log Data
- Normalizing Numerical Data
- Detecting Outliers
- Feature Engineering and the Curse of Dimensionality
- Imputation Techniques for Missing Data
- Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE
- Binning, Transforming, Encoding, Scaling, and Shuffling
Experimental Design / ML in the Real World
- Deploying Models to Real-Time Systems
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas
Capstone Project
- Group Project & Presentation or Review
Optional: Time Permitting
Deep Learning and Neural Networks
- Deep Learning Prerequisites
- The History of Artificial Neural Networks
- Deep Learning in the TensorFlow Playground
- Deep Learning Details
- Introducing TensorFlow
- Using TensorFlow
- Introducing Keras
- Using Keras to Predict Political Affiliations
- Convolutional Neural Networks (CNNs)
- Using CNNs for Handwriting Recognition
- Recurrent Neural Networks (RNNs)
- Using an RNN for Sentiment Analysis
- Transfer Learning
- Tuning Neural Networks: Learning Rate and Batch Size Hyperparameters
- Deep Learning Regularization with Dropout and Early Stopping
- The Ethics of Deep Learning
- Learning More about Deep Learning