The Extended Learning page for this course includes the option to purchase Virtual Lab time to practice.
Learn How To
- Move data into the Hadoop ecosystem.
- Use Hive to design a data warehouse in Hadoop.
- Perform data analysis using the Hive Query Language (HiveQL).
- Join data sources.
- Perform extract, load, and transform (ELT) operations.
- Organize data in Hadoop by usage.
- Perform analysis on unstructured data using Apache Pig.
- Join massive data sets using Pig.
- Use user-defined functions (UDFs).
- Analyze big data in Hadoop using Hive and Pig.
- Use SAS programming to submit Hive and Pig programs that execute in Hadoop and store results in Hadoop or return results to SAS.
- Use SAS programming to move data between the SAS server and the Hadoop Distributed File System (HDFS).
- Construct SAS Data Integration Studio jobs that integrate with Hive and Pig processes and the HDFS.
Who Can Benefit
- Data scientists, programmers, database administrators, application developers, and ETL developers who want an in-depth technical overview of data management and extraction for big data and the Hadoop ecosystem
Prerequisites
- A basic understanding of, and some experience with, UNIX and SQL are preferred. Advanced topics such as user-defined functions require prior programming experience.