Clusters are used to store and analyze large volumes of data in a distributed computing environment. This course outlines the best practices to follow when implementing clusters in Hadoop.
- start the course
- configure an Ubuntu server with SSH and Java in preparation for Hadoop
- set up Hadoop on a single node
- set up Hadoop on four nodes
- describe the different cluster configurations, including single-rack deployments, three-rack deployments, and large-scale deployments
- add a new node to an existing Hadoop cluster
- format HDFS and configure common options
- run an example MapReduce job to perform a word count
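The word-count job named in the final objective is the canonical MapReduce example: mappers emit a (word, 1) pair for every word they see, and reducers sum the counts for each word. A minimal, Hadoop-free sketch of that same map/shuffle/reduce logic in plain Python (the function names here are illustrative, not part of any Hadoop API):

```python
from collections import defaultdict

def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Shuffle + reduce step: group pairs by word and sum the counts."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Example input standing in for lines of a file stored in HDFS.
lines = ["the quick brown fox", "the lazy dog", "the fox"]
print(reduce_phase(map_phase(lines)))
```

In a real Hadoop cluster, the map and reduce steps run on different nodes and the framework performs the shuffle between them; this sketch only shows the data flow on a single machine.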
Practice: Clusters in Hadoop
- start a Hadoop cluster and run a MapReduce job
The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.
If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.