Apache Beam, Cloud Dataflow, and Cloud Dataprep can be used to create data pipelines. In this course, you will learn how Apache Beam and its SDKs, Cloud Dataflow, and Cloud Dataprep assist in pipeline management.
Expressing Data Pipelines
start the course
define Apache Beam concepts and SDKs
describe the Python SDK and its connection with data pipelines
describe the Java SDK and its connection with data pipelines
initialize Cloud Dataprep
demonstrate how to ingest data into a pipeline
create recipes in a Cloud Dataprep pipeline
work with the import/export process and demonstrate how to run Dataflow jobs in Cloud Dataprep
Big Data Processing
describe MapReduce and the benefits of Cloud Dataflow over MapReduce
outline serverless architecture and some of the GCP products supporting data analytics
Practice: Create and Manage Pipelines
describe the use of Apache Beam, Cloud Dataflow, and Cloud Dataprep in GCP to create and manage pipelines
The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework.