• Online, Self-Paced
Course Description

Clusters are used to store and analyze large volumes of data in a distributed computing environment. This course outlines the best practices to follow when implementing clusters in Hadoop.

Learning Objectives

Clusters

  • configure SSH and Java on an Ubuntu server in preparation for Hadoop
  • set up Hadoop on a single node
  • set up Hadoop on four nodes
  • describe the different cluster configurations, including single-rack deployments, three-rack deployments, and large-scale deployments
  • add a new node to an existing Hadoop cluster
  • format HDFS and configure common options
  • run an example MapReduce job to perform a word count
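
For orientation, the word count named in the last objective can be sketched in plain Python. This is a minimal local simulation of the map, shuffle, and reduce steps, not Hadoop's API and not the course's own example; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a word-count mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines)))
```

On a real cluster the input would be read from HDFS and the map and reduce steps would run in parallel across nodes; the sketch only shows how the data flows between the steps.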

Practice: Clusters in Hadoop

  • start a Hadoop cluster and run a MapReduce job

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Specialty Area details are available in the interactive National Cybersecurity Workforce Framework.