This course teaches business analysts and data scientists how to profile, integrate, cleanse, and move big data in a Hadoop environment without writing code, using an intuitive web-based interface. Learn how to:
- move data in and out of Hadoop
- interrogate and profile data for quality issues
- transform, transpose, and join data to make it fit for purpose
- cleanse and integrate data suitable for analysis and reporting
- perform the master data management activities of record clustering and survivorship in Hadoop
- load data into the SAS In-Memory Analytics Server for analytics and exploration
- execute custom SAS and HiveQL code inside the Hadoop cluster
- chain custom-built data management flows into reusable jobs.
Who Can Benefit
- Business users who interact with data, perform data discovery, query data, and ensure that data is in the proper place and format for other users
- Data analysts, data scientists, and statisticians who review the results of data discovery activities, create new tables and new data elements, change the format or structure of data tables to view them in a variety of ways, manipulate and score data elements, and load data for use by other users
- Data management specialists who apply enterprise standards to the data, ensure data quality throughout the enterprise, move data into and out of the Hadoop cluster, and optimize code running in the Hadoop cluster
Prerequisites
There are currently no prerequisites for this course.