Clusters are used to store and analyze large volumes of data in a distributed computing environment. This course outlines the best practices to follow when implementing Hadoop clusters.
start the course
configure an Ubuntu server with SSH and Java in preparation for Hadoop
set up Hadoop on a single node
set up Hadoop on four nodes
describe the different cluster configurations, including single-rack deployments, three-rack deployments, and large-scale deployments
add a new node to an existing Hadoop cluster
format HDFS and configure common options
run an example MapReduce job to perform a word count
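The word-count objective above follows the classic MapReduce pattern: mappers emit (word, 1) pairs, the framework shuffles and groups them by key, and reducers sum the counts. A minimal local simulation of those three phases in Python is sketched below (illustrative only; a real Hadoop job implements Mapper and Reducer classes against the org.apache.hadoop.mapreduce API and is submitted with hadoop jar):

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group intermediate pairs by key, as the framework does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reducer: sum the counts emitted for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["the"])  # → 3
print(counts["fox"])  # → 2
```

On a cluster, each phase runs in parallel across nodes and HDFS blocks, but the data flow is the same as in this single-process sketch.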
Practice: Clusters in Hadoop
start a Hadoop cluster and run a MapReduce job
The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework.