Discover how to work with Spark and its in-memory data management capabilities. This course also covers how to manage and troubleshoot HDInsight clusters using Ambari and the Azure CLI.
Data Warehousing with Hadoop: Spark, HDInsight and Cluster Management
- specify the essential capabilities of Spark and its key architectural components
- list the data structures used in Spark, along with the RDD and lineage concepts
- set up Spark clusters using PowerShell and an Azure Resource Manager template
- describe the relationship between Spark SQL and Hive
- specify the essential concepts of Spark SQL and DataFrame
- demonstrate how to customize HDInsight clusters using bootstrap
- install Hadoop applications on Azure HDInsight
- illustrate the use of Ambari as a tool to manage clusters
- manage Hadoop clusters in HDInsight using the Azure CLI
- specify the approach to troubleshooting and tuning HDInsight clusters
- monitor Hadoop clusters in HDInsight to collect metrics for analysis
- set up Spark clusters and manage them using the Ambari GUI
If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.