• Online, Self-Paced
Course Description

Explore the Microsoft Analytics Platform System and the use of Hive to manage data from a data warehouse perspective.

Learning Objectives

Data Warehousing with Hadoop: Microsoft Analytics Platform System and Hive

  • illustrate capabilities, features, and objectives of the Microsoft Analytics Platform System
  • specify how to manage data using PolyBase and the key benefits it provides
  • identify the role of parallel data warehousing architecture in the Microsoft Analytics Platform System
  • recall the various data exploration architectures that can be implemented using HDInsight and the Microsoft Analytics Platform System
  • describe the role of Hive as a data warehouse system for Hadoop
  • describe the architectural composition of Hive in HDInsight
  • set up the development environment for Hive using the Azure HDInsight Tools extension for VS Code
  • connect and submit queries to HDInsight clusters using VS Code
  • specify the various clauses that can be used in Hive Query Language to manage objects and query data
  • work with Azure PowerShell and Beeline to execute Hive Query Language queries
  • create a database and tables, and load data into Hive tables from Azure Blob Storage and SQL Server
  • work with partitioned tables and manage Hive data formats
  • demonstrate how to install Hue and manage Hive queries from the Hue interface
  • demonstrate the approaches involved in retrieving Hive data and creating visualizations in Power BI
  • work with Hive as an ETL tool
  • compare HBase and Hive from the data modeling perspective
  • create a Hive table and load data from an external SQL Server

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Specialty Area details are available in the interactive National Cybersecurity Workforce Framework.