Oracle Big Data Fundamentals (Training On Demand)

Course Details
Code: D100484GC20
Tuition (USD): $3,400.00 $3,230.00 • Self Paced (5 days)
This course is available in other formats
Instructor-Led Classroom & Virtual
Oracle Big Data Fundamentals Ed 2 (D86898GC20)

In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data Solution to acquire, process, integrate and analyze big data. In this course, you will be introduced to Oracle Big Data Cloud Service.

Learn To:

  • Define Big Data.
  • Describe Oracle's Integrated Big Data Solution and its components.
  • Define the Hadoop Ecosystem and Cloudera's Distribution Including Apache Hadoop (CDH).
  • Use the Hadoop Distributed File System (HDFS) to store, distribute, and replicate data across the nodes in the Hadoop cluster.
  • Acquire big data using the HDFS Command Line Interface, Flume, and Oracle NoSQL Database.
  • Use MapReduce and YARN for distributed processing of the data stored in the Hadoop cluster.
  • Process big data using MapReduce, YARN, Hive, Pig, Oracle XQuery for Hadoop, Solr, and Spark.
  • Integrate big data and warehouse data using Sqoop, Oracle Big Data Connectors, Copy to BDA, Oracle Big Data SQL, Oracle Data Integrator, and Oracle GoldenGate.
  • Analyze big data using Oracle Big Data SQL, Oracle Advanced Analytics technologies, and Oracle Big Data Discovery.
  • Use and manage Oracle Big Data Appliance.
  • Secure your data.
  • Understand Oracle Big Data Cloud Service: Key Features & Benefits.

Benefits To You

Increase your Big Data technology portfolio by learning to use a wide range of big data acquisition, processing, integration, and analysis techniques. In addition, you learn about Oracle’s engineered systems for Big Data, which provide a variety of data integration and analysis capabilities. Analysis options include Oracle Big Data SQL, Oracle Data Mining, Oracle R Enterprise, and Oracle Big Data Discovery.

Benefit from a hands-on, case-study approach while learning about Oracle’s Integrated Big Data Solution.

Skills Gained

  • Review Oracle’s Big Data Management Architecture and Engineered Systems
  • Define Big Data
  • Identify Big Data Use Cases
  • Define the Hadoop ecosystem and its components
  • Examine MapReduce programs and balance MapReduce jobs
  • Use Oracle NoSQL Database
  • Use Oracle XQuery for Hadoop
  • Understand Oracle Big Data Cloud Service: Key Features & Benefits
  • Install, use, and administer the Oracle Big Data Appliance
  • Provide data security and enable resource management
  • Use the Oracle BigDataLite Virtual Machine

Who Can Benefit

  • Administrator
  • Database Administrator
  • Developer

Topics

  • Introduction
  • Big Data and the Oracle Information Management System
  • Using Oracle Big Data Lite Virtual Machine
  • Introduction to the Big Data Ecosystem
  • Introduction to the Hadoop Distributed File System (HDFS)
  • Acquire Data using CLI, Fuse-DFS, and Flume
  • Using and Administering Oracle NoSQL Database
  • Introduction to MapReduce
  • Using YARN to Manage Resources
  • Overview of Apache Hive and Apache Pig
  • Overview of Cloudera Impala, Solr, and Apache Spark
  • Using Oracle XQuery for Hadoop
  • Options for Integrating Your Big Data
  • Using Oracle Big Data SQL
  • Using Oracle Advanced Analytics
  • Introducing Oracle Big Data Discovery
  • Using the Oracle Big Data Appliance (BDA)
  • Managing the Oracle Big Data Appliance
  • Balancing MapReduce Jobs
  • Securing Your Data on the BDA
  • Introduction to Oracle Big Data Cloud Service (BDCS)

Introduction

  • Questions About You
  • Course Objectives
  • Course Road Map
  • Oracle Big Data Lite (BDLite) Virtual Machine (VM) Home Page
  • Starting the Oracle BDLite VM and accessing the Practice Files
  • Reviewing the Available Big Data Documentation, Tutorials, and Other Resources

Introducing Oracle Big Data Strategy

  • Characteristics of Big Data
  • Importance of Big Data
  • Big Data Opportunities: Some Examples
  • Big Data Challenges
  • Big Data Implementation Examples
  • Oracle Strategy for Big Data: Combining Big Data Processing Engines (Hadoop, NoSQL, RDBMS)

Using Oracle Big Data Lite Virtual Machine and Movieplex Application

  • Oracle Big Data Lite VM Used in this Course
  • Oracle Big Data Lite VM Home Page Sections
  • Reviewing the Deployment Guide
  • Downloading and installing Oracle VM VirtualBox and its Extension Pack
  • Downloading and Running 7-zip Files to Create the VirtualBox Appliance File
  • Importing the Appliance File
  • Starting the Big Data Lite VM and Starting and Stopping Services
  • Introducing the Oracle Movieplex Case Study

Introduction to the Big Data Ecosystem

  • Computer Clusters and Distributed Computing
  • Apache Hadoop
  • Types of Analysis That Use Hadoop
  • Types of Data Generated
  • Apache Hadoop Core Components: HDFS, MapReduce (MR1), and YARN (MR2)
  • Apache Hadoop Ecosystem
  • Cloudera’s Distribution Including Apache Hadoop (CDH)
  • CDH Architecture and Components

Introduction to the Hadoop Distributed File System

  • Hadoop Distributed Filesystem (HDFS) Design Principles, Characteristics, and Key Definitions
  • Sample Hadoop High Availability (HA) Cluster
  • HDFS Files and Blocks
  • Active and Standby Daemons (Services) Functions
  • DataNodes (DN) Daemons Functions
  • Writing a File to HDFS: Example
  • Interacting With Data Stored in HDFS: Hue, Hadoop Client, WebHDFS, and HttpFS

Acquire Data using CLI, Fuse, Flume, and Kafka

  • Reviewing the Command Line Interface (CLI)
  • Viewing File System Contents Using the CLI
  • FS Shell Commands
  • Loading Data Using the CLI
  • Overview of FuseDFS
  • What is Flume
  • Kafka topics
  • Additional Resources

Acquire and Access Data Using Oracle NoSQL Database

  • What is a NoSQL Database
  • RDBMS Compared to NoSQL
  • HDFS Compared to NoSQL
  • Define Oracle NoSQL Database
  • Oracle NoSQL models: Key-Value and Table
  • Acquiring and Accessing Data in a NoSQL DB
  • Accessing the CLIs (Data, Admin, SQL)
  • Accessing the KVStore

Introduction to MapReduce and YARN Processing Frameworks

  • MapReduce Framework Features, Benefits, and Jobs
  • Parallel Processing with MapReduce
  • Word Count Examples
  • Data Locality Optimization in Hadoop
  • Submitting and Monitoring a MapReduce Job
  • YARN Architecture, Features, and Daemons
  • YARN Application Workflow
  • Hadoop Basic Cluster: MapReduce 1 Versus YARN (MR 2)
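
For orientation, the word count example named in this module can be sketched in plain Python. This is a minimal single-process illustration of the map, shuffle, and reduce steps, not the course's lab code and not the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this sketch.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, used only to exercise the sketch.
    print(word_count(["big data", "Big Data big"]))
```

In a real cluster the framework distributes the map and reduce tasks across nodes and performs the shuffle itself; here all three steps run locally so that the data flow is easy to follow.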

Resource Management Using YARN

  • Job Scheduling in YARN
  • First In, First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler
  • Cloudera Manager Resource Management Features
  • Static Service Pools
  • Working with the Fair Scheduler
  • Cloudera Manager Dynamic Resource Management: Example
  • Submitting and Monitoring a MapReduce Job Using YARN
  • Using the YARN application Command

Overview of Apache Spark

  • Benefits of Using Spark
  • Spark Architecture
  • Spark Application Components: Driver, Master, Cluster Manager, and Executors
  • Running a Spark Application on YARN (yarn-cluster Mode)
  • Resilient Distributed Dataset (RDD)
  • Spark Interactive Shells: spark-shell and pyspark
  • Word Count Example by Using Interactive Scala
  • Monitoring Spark Jobs Using YARN's ResourceManager Web UI

Overview of Apache Hive

  • What is Hive
  • Use Case: Storing Clickstream Data
  • Hadoop Architecture
  • How is Data Stored in HDFS
  • Organizing and Describing Data With Hive
  • Big Data SQL on Top of Hive Data
  • Defining Tables Over HDFS
  • Hive Queries

Overview of Cloudera Impala

  • Overview of Cloudera Impala
  • Hadoop: Some Data Access/Processing Options
  • Cloudera Impala
  • Cloudera Impala: Key Features
  • Cloudera Impala: Supported Data Formats
  • Cloudera Impala: Programming Interfaces
  • How Impala Fits Into the Hadoop Ecosystem
  • How Impala Works with Hive

Overview of Solr

  • Overview of Solr
  • Apache Solr (Cloudera Search)
  • Cloudera Search: Key Capabilities
  • Cloudera Search: Features
  • Cloudera Search Tasks
  • Indexing in Cloudera Search
  • Types of Indexing
  • The solrctl Command

Integrating Your Big Data

  • Unifying Data: A Typical Requirement
  • Comparing Big Data Processing Engines
  • Introducing Data Unification Options
  • When To Use These Options

Batch Loading Options

  • Apache Sqoop
  • Oracle Loader for Hadoop
  • Oracle Copy to Hadoop

Using Oracle Data Integrator and Oracle GoldenGate for Big Data

  • ETL and Synchronization: Oracle Data Integrator
  • ODI’s Declarative Design
  • ODI Knowledge Modules (KMs): Simpler Physical Design / Shorter Implementation Time
  • Using ODI with Big Data: Heterogeneous Integration with Hadoop Environments
  • Using ODI Studio
  • ODI Studio Components: Overview
  • ODI Studio: Big Data Knowledge Modules
  • Oracle GoldenGate for Big Data

Using Oracle Big Data SQL

  • Barriers to Effective Big Data Adoption
  • Overcoming Big Data Barriers
  • Oracle Big Data SQL: The Hybrid Solution
  • Benefits: Virtualizes data access across Oracle Database, Hadoop and NoSQL stores
  • Using Oracle Big Data SQL
  • Query Performance Overview
  • Deployment Options

Using Oracle Big Data Spatial and Graph

  • Graph and Spatial Analysis: All About Relationships
  • What is Oracle Big Data Spatial and Graph (BDSG)
  • Strategy (supported platforms, etc.)
  • BDSG: Graph Analysis
  • Oracle BDSG: Spatial Analysis
  • Multimedia Analytics Framework
  • Deployment Options for Oracle BDSG
  • Additional Resources

Using Oracle Advanced Analytics

  • Oracle Advanced Analytics (OAA)
  • OAA: Oracle Data Mining
  • OAA: Oracle R Enterprise

Oracle Big Data Deployment Options

  • Introduction to the Oracle Big Data Appliance
  • Running the Oracle BDA Configuration Generation Utility
  • Oracle BDA Mammoth Software Deployment Bundle
  • Using the Oracle BDA mammoth Utility
  • BDA Hardware and Integrated and Optional Software
  • Administering and Securing the Oracle BDA
  • Introduction to the Oracle Big Data Cloud Service
  • Introduction to the Oracle Big Data Cloud Service – Compute Edition
