Hadoop Clusters from Skillsoft

Online, Self-Paced

Clusters are used to store and analyze large volumes of data in a distributed computing environment. This course outlines the best practices to follow when implementing clusters in Hadoop.

Learning Objectives

Clusters

start the course
configure SSH and Java on an Ubuntu server for Hadoop
set up Hadoop on a single node
set up Hadoop on four nodes
describe the different cluster configurations, including single-rack deployments, three-rack deployments, and large-scale deployments
add a new node to an existing Hadoop cluster
format HDFS and configure common options
run an example MapReduce job to perform a word count
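
For orientation, the word count named in the last objective can be sketched in plain Python. This is only an illustration of the map, shuffle, and reduce steps under simple assumptions (whitespace tokenization, lowercased words); it is not the Hadoop API or the course's own code, and the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for each whitespace-separated token (assumed tokenization)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Sum the per-word counts into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

On a real cluster the map and reduce steps would run as distributed tasks over data stored in HDFS; here everything runs in a single process for clarity.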

Practice: Clusters in Hadoop