• Online, Self-Paced
Course Description

There are important decisions you must make to ensure the network, disks, and hosts are configured correctly when deploying a Hadoop cluster. This course walks you through the steps to install Hadoop in pseudo-distributed mode and to set up some of the common open source software used to create a Hadoop ecosystem. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.
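In pseudo-distributed mode, every Hadoop daemon runs on a single host, each in its own JVM. As a hedged sketch of what that configuration involves (the property names below are standard Hadoop settings; the hostname, port, and file layout are illustrative placeholders, not values prescribed by the course):

```xml
<!-- core-site.xml: point the default filesystem at a local HDFS instance -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>

<!-- hdfs-site.xml: a single node can hold only one replica of each block -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

In a fully distributed cluster these same files are what the configuration management tooling covered below distributes to every node, with `fs.defaultFS` pointing at the NameNode host instead of `localhost`.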

Learning Objectives

Configuration Management Tools

  • start the course
  • describe configuration management tools
  • simulate a configuration management tool

Create Configuration Items

  • build an image for a baseline server
  • build an image for a data server
  • build an image for a master server

Set Up a CM Environment

  • provision an admin server

Deploy a Hadoop Cluster

  • describe the layout and structure of the Hadoop cluster
  • provision a Hadoop cluster
  • distribute configuration files and admin scripts
  • use init scripts to start and stop a Hadoop cluster
  • configure a Hadoop cluster
  • configure logging for the Hadoop cluster
  • build images for required servers in the Hadoop cluster
  • configure a MySQL database
  • build the Hadoop clients
  • configure Hive daemons
  • test the functionality of Flume, Sqoop, HDFS, and MapReduce
  • test the functionality of Hive and Pig
  • configure HCatalog daemons
  • configure Oozie
  • configure Hue and Hue users
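The MySQL and Hive objectives above typically come together in `hive-site.xml`, where the Hive metastore is pointed at the MySQL database. A minimal sketch, assuming a MySQL metastore (the hostname, database name, and credentials are placeholders; the property names are standard Hive settings):

```xml
<!-- hive-site.xml: connect the Hive metastore to MySQL (placeholder host/db/credentials) -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive_password</value>
  </property>
</configuration>
```

Using a shared MySQL metastore, rather than the default embedded Derby database, is what lets multiple Hive clients and daemons work against the same table definitions.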

Practice: Configure the Admin Server

  • install Hadoop onto the admin server

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.

Feedback

If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.