Automating Data Pipelines & Workflow

Learn how to automate your entire pipeline using an automation tool. In this course, students will learn how to programmatically author, schedule, and monitor their workflows. Students will also learn how to create an environment to containerize, replicate, and deploy a pipeline.

8 hours of instruction

OBJECTIVES

  1. Navigate through the growing landscape of automation tools for DataOps and MLOps
  2. Acquire foundational knowledge of Airflow components
  3. Set up a simple Airflow pipeline
  4. Distinguish between the various Airflow setup options on cloud platforms
  5. Create and test a data pipeline using Airflow with Docker on a compute instance

PREREQUISITES

Introduction to MLOps Theory

SYLLABUS & TOPICS COVERED

  1. Exploring Dev/Data/MLOps and Apache Airflow concepts
    • Identify steps of the data science life cycle for automation
    • Describe DevOps, DataOps and MLOps
    • Name open source data pipeline automation tools
    • Describe Apache Airflow and its use cases
  2. Learning components of Airflow and automating a simple workflow locally
    • Explain DAGs and operators in Apache Airflow
    • Describe the main components of Apache Airflow
    • Set up an environment for Airflow
    • Create the first DAG
    • Further explore Airflow UI and access the task logs
  3. Working with different types of Executors
    • Run a DAG with SequentialExecutor
    • Run a DAG with LocalExecutor
    • Run a DAG with CeleryExecutor
    • Monitor clusters for CeleryExecutor using Flower
  4. Running Airflow in Docker
    • Describe Docker and its components
    • Explore Docker image and container
    • Work with docker-compose
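
As a small preview of the DAG concept named in the syllabus, the sketch below orders a pipeline's tasks so that each one runs only after the tasks it depends on. It is an illustration using Python's standard library only, not Airflow code and not course material; the task names (extract, transform, load, report) and the run_order helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(dag):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for task in run_order(pipeline):
        print("running", task)
```

A scheduler such as Airflow builds on the same idea: tasks and their dependencies form a directed acyclic graph, and a task becomes eligible to run once its upstream tasks have finished.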

SOFTWARE REQUIREMENTS

Apache Airflow, Docker, Terminal, VS Code

About Instructor

OpenTeams