7904 Reviews (rated 4.5 out of 5 stars)

Big Data Analytics With Hadoop

Course Code: ES-BIGDATA-HADOOP
Duration: 2 days
Available Formats: Classroom

Apache Hadoop is a popular framework for processing Big Data. Hadoop provides rich and deep analytics capabilities, and it is making inroads into the traditional BI analytics world. This course will introduce an analyst to the core components of the Hadoop ecosystem and its analytics

Skills Gained

  • Understanding the Hadoop ecosystem
  • Data storage using HDFS
  • Data warehousing and querying using Hive

Who Can Benefit

  • Business Analysts, Developers

Prerequisites

  • programming background with databases / SQL
  • basic knowledge of Linux

Course Details

Hadoop ecosystem

Hadoop overview

  • distributions
  • high level architecture
  • hardware / software
  • Labs : first look at Hadoop

HDFS Overview

  • concepts (horizontal scaling, replication, data locality)
  • architecture (NameNode, DataNode)
  • Demo : Interacting with HDFS
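
To give a feel for the replication concept listed above, here is a minimal Python sketch of how a file's size translates into HDFS blocks and raw disk usage. The 128 MB block size and replication factor of 3 are assumed defaults (both are configurable per cluster), and `hdfs_footprint` is a hypothetical helper written for illustration, not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumption: commonly used default block size
REPLICATION = 3      # assumption: commonly used default replication factor


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (logical blocks, stored block replicas, raw MB consumed)."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Each block is copied to `replication` different DataNodes.
    replicas = blocks * replication
    # A partial last block only occupies its actual size on disk.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


blocks, replicas, raw_mb = hdfs_footprint(1000)
print(f"{blocks} blocks, {replicas} replicas, {raw_mb} MB raw storage")
```

Under these assumptions, a 1000 MB file needs three times its logical size in raw cluster storage, which is the price paid for fault tolerance.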

YARN Overview

  • YARN as the cluster "operating system" (resource management and scheduling)
  • Demo : Running applications on YARN

Hive

  • Hive concepts & architecture
  • SQL support in Hive
  • Data warehousing in Hive
  • data types
  • table creation and queries
  • partitions
  • joins
  • modern data formats
  • text analytics
  • Hive performance
  • labs (multiple)
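
As a taste of the partitions topic above: Hive typically stores each partition of a table as a `key=value` subdirectory, so a query that filters on a partition column only needs to read the matching directories. The Python sketch below imitates that idea on plain path strings. The table directory `/warehouse/sales` and the helpers `partition_path` and `prune` are hypothetical names used for illustration only; this is not how Hive itself is implemented.

```python
def partition_path(table_dir, **partition_values):
    """Build a Hive-style partition directory, e.g. .../year=2023/month=1."""
    parts = [f"{key}={value}" for key, value in partition_values.items()]
    return "/".join([table_dir.rstrip("/")] + parts)


def prune(paths, **wanted):
    """Keep only the paths whose key=value segments match every wanted value."""
    kept = []
    for path in paths:
        # Parse the key=value segments of the path into a dict.
        segments = dict(s.split("=", 1) for s in path.split("/") if "=" in s)
        if all(segments.get(k) == str(v) for k, v in wanted.items()):
            kept.append(path)
    return kept


# Hypothetical table location with one partition per (year, month).
all_partitions = [
    partition_path("/warehouse/sales", year=y, month=m)
    for y in (2022, 2023)
    for m in (1, 2)
]

# Roughly what "WHERE year = 2023" lets the engine skip reading.
print(prune(all_partitions, year=2023))
```

The fewer directories that survive pruning, the less data a query has to scan, which is one reason partitioning matters for Hive performance.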