Mastering R for Data Scientists

Course Details
Code: TTC01S5-R
Tuition (USD): $2,595.00 • Classroom (5 days)

R is a functional programming environment for business analysts and data scientists. Many non-programmers find it easy to work with, since it naturally extends a skill set common among advanced Excel users. It is well suited to statistical, numerical, or probability-based problems on real data once the analyst has pushed Excel past its limits. Geared toward data scientists and engineers who may have a light technical background, Mastering R for Data Scientists is a hands-on course that explores common scenarios encountered in analysis and presents practical solutions to those challenges. Throughout the course, special attention is paid to data science theory, including AI grouping theory. The course also discusses using R with AI libraries such as MADlib. Students who want a shorter, more basic introduction to R might consider our 3-day Introduction to R for Data Scientists.

  • Introduction to the R Environment
  • Going from Excel to R
  • Simple math with R
  • How and when to use and apply vectors
  • Manipulating text
  • Formatting dates; time manipulation and operations
  • How to work with multiple dimensions
  • Using R with MADlib / AI libraries
  • Techniques in Data Visualization
  • Overview of Hadoop and related technologies, and where R plays a role
  • Rule Systems in the Enterprise; ESBs, working with Drools & more

Who Can Benefit

While there are no specific technical prerequisites, students should have prior exposure to statistics and probability, as well as good hands-on working knowledge of Excel. We will collaborate with you to design the best solution for your needs, whether that means customizing the material or devising a different educational path to help your team prepare for this training.

Prerequisites

Take Instead: We offer other courses that provide different levels of knowledge or focus:

  • TTCR01-BA35 R for Business Analysts (3 to 5 days)
  • TTCR01-DS3 Introduction to R for Data Scientists (3 days)
  • TTCR01-DS5 Mastering R for Data Scientists (5 days) (superset of the 3-day course, with additional topics and labs)

Course Details

Session: From Excel to R

  • Common problems with Excel
  • The R Environment
  • Hello, R

Session: R Basics

  • Simple Math with R
  • Working with Vectors
  • Functions
  • Comments and Code Structure
  • Using Packages

Session: Vectors

  • Vector Properties
  • Creating, Combining, and Iterating
  • Passing and Returning Vectors in Functions
  • Logical Vectors

Session: Reading and Writing

  • Text Manipulation
  • Factors

Session: Dates

  • Working with Dates
  • Date Formats and formatting
  • Time Manipulation and Operations

Session: Multiple Dimensions

  • Adding a second dimension
  • Indices and named rows and columns in a Matrix
  • Matrix calculation
  • n-Dimensional Arrays
  • Data Frames
  • Lists

Session: R in Data Science

  • AI Grouping Theory
  • K-means
  • Linear Regression
  • Logistic Regression
  • Elastic Net

Session: R with MADlib

  • Importing and Exporting static Data (CSV, Excel)
  • Using Libraries with CRAN
  • K-means with MADlib
  • Regression with MADlib
  • Other libraries

Session: Data Visualization

  • Powerful Data through Visualization: Communicating the Message
  • Techniques in Data Visualization
  • Data Visualization Tools
  • Examples

Session: R with Hadoop

  • Overview of Hadoop
  • Overview of Distributed Databases
  • Overview of Pig
  • Overview of Mahout
  • Exploiting Hadoop clusters with R
  • Hadoop, Mahout, and R

Session: Business Rule Systems

  • Rule Systems in the Enterprise
  • Enterprise Service Buses
  • Drools
  • Using R with Drools