• Online, Self-Paced
Course Description

Discover how to implement data lakes for real-time data management. Explore data ingestion, data processing, and data life-cycle management using AWS and other open-source ecosystem products.

Learning Objectives

Data Lake: Architectures & Data Management Principles

  • implement Lambda and Kappa architectures to manage real-time big data
  • identify the benefits of adopting the Zaloni data lake reference architecture
  • describe data ingestion approaches and compare the benefits of the Avro and Parquet file formats
  • demonstrate how to ingest data using Sqoop
  • describe the data processing strategies provided by MapReduce v2, Hive, Pig, and YARN for processing data within data lakes
  • recognize how to derive value from data lakes and describe the benefits of critical roles
  • describe the steps involved in the data life cycle and the significance of archival policies
  • implement an archival policy that transitions data between S3 and Glacier according to adopted retention policies
  • ingest data using Sqoop and implement an archival policy that transitions data from S3 to Glacier
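As a concrete illustration of the Sqoop-based ingestion the objectives above describe, the following command sketches a relational-to-HDFS import. The JDBC connection string, database, table name, user, and target directory are all hypothetical placeholders, not values from this course:

```
# Sketch: import the "orders" table from a MySQL database into HDFS
# as Parquet files, using 4 parallel map tasks. All names are examples.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders \
  --as-parquetfile \
  --target-dir /data/lake/raw/orders \
  --num-mappers 4
```

Writing the output as Parquet (rather than plain text or Avro) is one of the format trade-offs the course compares: Parquet's columnar layout favors analytical scans, while Avro's row orientation favors record-at-a-time ingestion.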
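The S3-to-Glacier archival objective can be expressed as an S3 lifecycle configuration. The sketch below builds one in Python; the bucket name, key prefix, and day thresholds are illustrative assumptions, and the boto3 call that would apply it is shown commented out because it requires AWS credentials:

```python
# Minimal sketch of an S3 lifecycle (archival) policy: objects under the
# "raw/" prefix transition to Glacier after 90 days and expire after 365.
# Prefix, rule ID, and day counts are example values, not course-mandated.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-raw-data",
            "Filter": {"Prefix": "raw/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

# Applying the policy would use boto3 (hypothetical bucket name):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-data-lake-bucket",
#     LifecycleConfiguration=lifecycle_config,
# )

rule = lifecycle_config["Rules"][0]
print(rule["Transitions"][0]["StorageClass"])  # → GLACIER
```

The same structure can also be saved as JSON and applied with `aws s3api put-bucket-lifecycle-configuration` from the CLI.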

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.