In this course, you will learn how to perform data analysis using Spark SQL and Hive. It is one of a series of courses that prepare learners for Exam 70-775: Perform Data Engineering on Microsoft Azure HDInsight.
Learning Objectives
Data Analysis using Spark SQL
- start the course
- describe Jupyter and Apache Zeppelin
- merge DataFrames using Spark SQL
- describe Apache Parquet
- manage interactive Livy sessions
Data Analysis using Hive
- describe what interactive querying is and how it's used with Hive
- use Ambari Views
- use HiveQL
- describe how to parse files such as CSV files with Hive
- use ORC for caching
- use Hive tables
- use Zeppelin to visualize data
Practice: Using Spark Data Analysis
- use Spark SQL for data analysis