When does class start/end?
Classes begin promptly at 9:00 am, and typically end at 5:00 pm.
Business success in the information age is predicated on the ability of organizations to convert raw data coming from various sources into high-grade business information. To stay competitive, organizations have started adopting new approaches to data processing and analysis. For example, data scientists are turning to Apache Spark for processing massive amounts of data using Spark’s distributed compute capability along with its built-in machine learning library, or switching from proprietary and costly solutions to the free R programming language.
Data Scientists, Software Developers, IT Architects, and Technical Managers
Participants should have general knowledge of statistics and programming.
Chapter 1 - Data Science Algorithms and Analytical Methods
Chapter 2 - Getting Started with R
Chapter 3 - Text Mining
Chapter 4 - Introduction to Functional Programming
Chapter 5 - What is NoSQL?
Chapter 6 - MapReduce Overview
Chapter 7 - Hadoop Overview
Chapter 8 - Hadoop Distributed File System Overview
Chapter 9 - MapReduce with Hadoop
Chapter 10 - Apache Pig Scripting Platform
Chapter 11 - Apache Pig Relational and Eval Operators
Chapter 12 - Hive
Chapter 13 - Hive Command-line Interface
Chapter 14 - Hive Data Definition Language
Chapter 15 - Apache Sqoop
Chapter 16 - Introduction to Apache Spark
Chapter 17 - The Spark Shell
Chapter 18 - Spark RDDs
Chapter 19 - Parallel Data Processing with Spark
Chapter 20 - Shared Variables in Spark
Chapter 21 - Introduction to Spark SQL
Chapter 23 - The Spark Machine Learning Library
Chapter 24 - Machine Learning with BigML