Airflow CLI

Airflow CLI cheatsheet — trigger DAGs, backfill runs, pause/unpause, check task state. airflow dags trigger, airflow tasks run — every command with examples.


What it is

The Airflow command-line interface (CLI) lets you manage your Airflow environment from a terminal: inspect and trigger DAGs, run and clear tasks, manage connections and variables, and monitor task execution.

Installation

Linux/macOS

Install using pip:

pip install apache-airflow

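If you need the extra packages for a specific database backend (e.g., PostgreSQL), quote the requirement so that shells such as zsh do not treat the square brackets as a glob pattern:

pip install 'apache-airflow[postgres]'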

Windows

Airflow is primarily developed and tested on Unix-like systems. While it can run on Windows, it’s not officially supported and may lead to unexpected issues. It’s highly recommended to use a Linux environment (e.g., WSL, Docker) for Airflow development and production.

If you must install on Windows:

  1. Install Python.
  2. Install pip.
  3. Open Command Prompt or PowerShell as Administrator.
  4. Install Airflow:
    pip install apache-airflow
    
  5. You may encounter issues with certain dependencies; consult Airflow documentation for platform-specific troubleshooting.

Core Concepts

  • DAG (Directed Acyclic Graph): A collection of tasks with defined dependencies, organized in a way that reflects their relationships and execution order.
  • Task: A single unit of work within a DAG.
  • Task Instance: A specific run of a task for a specific DAG run.
  • DAG Run: A specific execution of a DAG for a given logical date.
  • Operator: A template for a type of task (e.g., BashOperator, PythonOperator, PostgresOperator).
  • Connection: A set of credentials and parameters Airflow uses to connect to external services.
  • Variable: A key-value pair stored in the metadata database, used for configuration or other runtime settings.
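To make the DAG and Task definitions above concrete, here is a minimal sketch in plain Python. It is not Airflow code: it models three hypothetical tasks (extract, transform, load) as a dependency mapping and uses the standard library's graphlib to derive an order that respects those dependencies. Airflow's scheduler does far more than this; the sketch only shows why the graph has to be acyclic.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return one valid run order, or raise ValueError if deps contain a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


print(execution_order(dependencies))
```

A mapping in which two tasks depend on each other raises ValueError, which is the plain-Python counterpart of a DAG definition that cannot be scheduled.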

Commands / Usage

DAG Management

List DAGs

airflow dags list

List all DAGs available in your Airflow environment.

airflow dags list --subdir /path/to/your/dags

List DAGs from a specific directory.

Show DAG Information

airflow dags show my_dag_id

Print the structure of a DAG (its tasks and dependencies) in Graphviz DOT format.

airflow dags show my_dag_id --save my_dag.png

Render the DAG to an image file. The output format is taken from the file extension, and Graphviz must be installed for rendering to work.

Enable/Disable DAGs

airflow dags pause my_dag_id

Pause a DAG so that the scheduler stops creating new runs for it.

airflow dags unpause my_dag_id

Unpause a DAG.

Trigger DAGs

airflow dags trigger my_dag_id

Trigger a new DAG run for my_dag_id.

airflow dags trigger my_dag_id -c '{"key": "value"}'

Trigger a DAG run with configuration JSON.

airflow dags trigger my_dag_id --run-id manual__2023-10-27T10:00:00+00:00

Trigger a DAG run with a specific run ID.
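When the configuration JSON is built by a script, quoting it by hand gets error-prone. The sketch below assembles the same trigger command as an argument list without executing anything. The helper name build_trigger_command is hypothetical; only the --conf and --run-id flags shown above are used.

```python
import json
import shlex


def build_trigger_command(dag_id, conf=None, run_id=None):
    """Build the argv list for `airflow dags trigger` (nothing is executed)."""
    argv = ["airflow", "dags", "trigger", dag_id]
    if conf is not None:
        # json.dumps guarantees the value passed to --conf is valid JSON.
        argv += ["--conf", json.dumps(conf)]
    if run_id is not None:
        argv += ["--run-id", run_id]
    return argv


argv = build_trigger_command(
    "my_dag_id", conf={"key": "value"}, run_id="custom_run_20231027"
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

On a host where Airflow is installed, the list can be passed directly to subprocess.run(argv, check=True), which avoids shell quoting altogether.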

Delete DAGs

airflow dags delete my_dag_id --yes

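Delete a DAG’s records from the metadata database, including its DAG runs and task instances. The DAG file itself is not removed, so the DAG reappears on the next parse unless you also delete the file. Use --yes to skip confirmation.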

Task Management

List Tasks in a DAG

airflow tasks list my_dag_id

List all tasks within a specific DAG.

airflow tasks list my_dag_id --tree

Display tasks in a tree format, showing dependencies.

Show Task Information

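airflow tasks state my_dag_id my_task_id manual__2023-10-27T10:00:00+00:00

Show the state of a task instance in a specific DAG run, identified here by run ID. The command needs a logical date or (in recent Airflow 2.x releases) a run ID; it does not default to the latest run.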

Run Tasks

airflow tasks run my_dag_id my_task_id 2023-10-27

Run a specific task instance for a given DAG ID and logical date.
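The logical date argument can be a full ISO 8601 timestamp as well as a plain date. A small sketch for producing an unambiguous UTC value from a timezone-aware datetime follows. The helper name to_cli_date and the UTC+02:00 example are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def to_cli_date(moment):
    """Format an aware datetime as a UTC ISO 8601 string for CLI date arguments."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid ambiguity")
    return moment.astimezone(timezone.utc).isoformat()


# Hypothetical example: 12:00 in a UTC+02:00 zone is 10:00 UTC.
local_noon = datetime(2023, 10, 27, 12, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_cli_date(local_noon))  # 2023-10-27T10:00:00+00:00
```

Passing the offset explicitly keeps the command independent of whatever default timezone the Airflow installation is configured with.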

airflow tasks run my_dag_id my_task_id 2023-10-27 --local

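Run the task instance with the LocalExecutor instead of the configured executor. The run is still recorded in the metadata database; use airflow tasks test when you want a run that leaves no state behind.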

airflow tasks test my_dag_id my_task_id 2023-10-27

Test a task instance locally. The task runs without dependency checks, its state is not recorded in the metadata database, and its log is printed to the terminal.

Clear Task States

airflow tasks clear my_dag_id --task-regex 'my_task_.*' -s 2023-10-26 -e 2023-10-27

Clear task states for tasks matching a regex within a date range.

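airflow tasks clear my_dag_id -s 2023-10-27T10:00:00+00:00 -e 2023-10-27T10:00:00+00:00

Clear the tasks of a single DAG run by setting both ends of the range to that run’s logical date. In Airflow 2.x, tasks clear selects runs by date range; it has no run ID option.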

airflow tasks clear my_dag_id --yes

Clear all task instances for the specified DAG. Use --yes to skip confirmation.

Task Instance States

airflow tasks state my_dag_id my_task_id 2023-10-27

Get the state of a specific task instance for a given logical date.
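For scripts that need to act on a task's state, the command above can be wrapped in a few lines of Python. This is a sketch under one stated assumption: the state is taken from the last non-empty line of standard output, since earlier lines may contain log messages. The function name get_task_state and the injectable run parameter are hypothetical conveniences so the logic can be exercised without Airflow installed.

```python
import subprocess


def get_task_state(dag_id, task_id, logical_date, run=subprocess.run):
    """Return the state printed by `airflow tasks state`.

    `run` defaults to subprocess.run and can be replaced by a stand-in.
    """
    result = run(
        ["airflow", "tasks", "state", dag_id, task_id, logical_date],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    # Assumption: the state is the last non-empty line; earlier lines may be logs.
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("airflow tasks state produced no output")
    return lines[-1]
```

A caller would use it as get_task_state("my_dag_id", "my_task_id", "2023-10-27") and compare the result with values such as "success" or "failed".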

List DAG Runs and Task Instances

airflow dags list-runs -d my_dag_id --start-date 2023-10-26 --end-date 2023-10-27

List DAG runs within a date range.

airflow tasks states-for-dag-run my_dag_id 2023-10-27

List every task instance in a DAG run together with its state, which makes it easy to spot the ones that are still running.

Connections

List Connections

airflow connections list

List all connections configured in Airflow.

airflow connections list --conn-id my_postgres_conn

Show only the connection with the given ID. The list command has no filter by connection type; to narrow by type, filter the output yourself.

Add Connection

airflow connections add 'my_postgres_conn' --conn-type 'postgres' --conn-host 'localhost' --conn-login 'user' --conn-password 'password' --conn-schema 'mydb' --conn-port 5432

Add a new connection with specified details.
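The same connection can also be given as a single URI through the --conn-uri flag of airflow connections add. Credentials with special characters must be percent-encoded first. Here is a minimal sketch: the helper name build_conn_uri and the sample password are hypothetical.

```python
from urllib.parse import quote


def build_conn_uri(conn_type, login, password, host, port, schema):
    """Build a connection URI, percent-encoding parts that may hold special characters."""
    return (
        f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(schema, safe='')}"
    )


# Hypothetical credentials; '@' and '/' in the password must be encoded.
uri = build_conn_uri("postgres", "user", "p@ss/word", "localhost", 5432, "mydb")
print(uri)  # postgres://user:p%40ss%2Fword@localhost:5432/mydb
```

The resulting string would be passed as airflow connections add 'my_postgres_conn' --conn-uri '<uri>'.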

Delete Connection

airflow connections delete 'my_postgres_conn'

Delete a connection by its connection ID.

Get Connection

airflow connections get 'my_postgres_conn'

Retrieve the details of a specific connection.

Variables

List Variables

airflow variables list

List all variables stored in Airflow.

Set Variable

airflow variables set my_variable_key 'my_variable_value'

Set or update a variable.

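airflow variables set --json my_json_variable '{"key1": "value1", "key2": 123}'

Store a JSON value under a single key. The --json flag does not create several variables; to set many at once, use airflow variables import.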

Get Variable

airflow variables get my_variable_key

Retrieve the value of a specific variable.

airflow variables get --json my_variable_key

Retrieve a variable and deserialize its value as JSON.

Delete Variable

airflow variables delete my_variable_key

Delete a variable.

Import/Export Variables

airflow variables import /path/to/variables.json

Import variables from a JSON file.

airflow variables export /path/to/variables.json

Export all variables to a JSON file.
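An import file can be generated from a script. The sketch below assumes the file is a single JSON object that maps variable names to values, which is the shape the export command writes. The variable names and the helper write_variables_file are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical variables; a non-string value is kept as nested JSON in the file.
variables = {
    "my_variable_key": "my_variable_value",
    "my_json_variable": {"key1": "value1", "key2": 123},
}


def write_variables_file(values, path):
    """Write a mapping of variable names to values as one JSON object."""
    path = Path(path)
    path.write_text(json.dumps(values, indent=2, sort_keys=True), encoding="utf-8")
    return path


# A fresh temporary directory avoids overwriting an existing file.
out_path = write_variables_file(variables, Path(tempfile.mkdtemp()) / "variables.json")
print(out_path.read_text(encoding="utf-8"))
```

The resulting path is what you would hand to airflow variables import.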

Core Airflow Operations

Initialize Database

airflow db migrate

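Initialize or upgrade the Airflow metadata database. Run it after installing or upgrading Airflow. The migrate subcommand was added in Airflow 2.7; older releases use airflow db init and airflow db upgrade instead.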

Start Webserver

airflow webserver -p 8080

Start the Airflow web UI on port 8080.

Start Scheduler

airflow scheduler

Start the Airflow scheduler process, which monitors DAGs and triggers task runs.

Check Database Connection

airflow db check

Verify that Airflow can connect to its metadata database.

Version

airflow version

Display the installed Airflow version.

Common Patterns

Running Airflow Locally for Development

  1. Initialize the database:
    airflow db migrate
    
  2. Create an admin user (if needed):
    airflow users create \
        --username admin \
        --firstname Admin \
        --lastname User \
        --role Admin \
        --email admin@example.com \
        --password admin
    
  3. Start the webserver:
    airflow webserver -p 8080
    
  4. Start the scheduler in a separate terminal:
    airflow scheduler
    

Testing a Task Locally

airflow tasks test my_dag_id my_task_id 2023-10-27

This is invaluable for debugging individual task logic without needing to trigger a full DAG run or rely on the scheduler.

Clearing Task Instances for Reruns

To rerun a specific task and all its downstream tasks for a particular date:

airflow tasks clear my_dag_id --task-regex 'my_failing_task' --downstream -s 2023-10-27 -e 2023-10-27 --yes

This clears the state of my_failing_task and, because of --downstream, every task that depends on it for the logical date 2023-10-27, so the scheduler can run them again. Without --downstream, only the tasks matching the regex are cleared.

Triggering a DAG with Specific Configuration

airflow dags trigger my_dag_id --run-id custom_run_20231027 --conf '{"key1": "value1", "date": "2023-10-27"}'

This triggers a DAG run with a custom run ID and passes a JSON configuration object that can be accessed within the DAG tasks using dag_run.conf.
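On the task side, reading that configuration could look like the sketch below. It assumes the function is used as the callable of a PythonOperator (wiring not shown) and therefore receives the task context, including dag_run, as keyword arguments. The function name, the default value, and the stand-in object are hypothetical.

```python
from types import SimpleNamespace


def read_run_config(**context):
    """Return the settings a task needs from dag_run.conf, with fallbacks."""
    dag_run = context.get("dag_run")
    # conf may be empty or None when the run was not triggered with --conf.
    conf = (dag_run.conf if dag_run is not None else None) or {}
    return {
        "key1": conf.get("key1", "default"),
        "date": conf.get("date"),
    }


# Stand-in for the DagRun object Airflow would pass in the task context.
fake_dag_run = SimpleNamespace(conf={"key1": "value1", "date": "2023-10-27"})
print(read_run_config(dag_run=fake_dag_run))
```

Falling back to defaults keeps scheduled runs, which carry no custom configuration, from failing on a missing key.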

Managing Secrets with Connections

Instead of hardcoding credentials in DAGs, use Airflow connections:

airflow connections add 'my_s3_conn' \
    --conn-type 'aws' \
    --conn-extra '{"aws_access_key_id": "AKIA...", "aws_secret_access_key": "SECRET..."}'

Then, reference this connection in your S3-related operators.

Gotchas

  • airflow db migrate is essential: Always run airflow db migrate after installing or upgrading Airflow to ensure your metadata database schema is up-to-date.
  • Scheduler vs. Webserver: You need both the airflow scheduler and airflow webserver running for a functional Airflow environment. They are separate processes.
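  • Logical Date vs. Execution Date: "Logical date" is the newer name for what older releases call the execution date; it identifies a DAG run. airflow tasks run requires one, and airflow tasks clear takes a date range. Dates without a UTC offset are interpreted in the configured default timezone, so pass an explicit offset when it matters.
  • airflow tasks test vs. airflow tasks run: test is for isolated debugging; it skips dependency checks and records nothing in the metadata database. run executes a task instance for an existing DAG run and records its state in the database.
  • airflow tasks clear behavior: Clearing resets task instances so that the scheduler runs them again. Do not assume a task that is running when cleared will finish undisturbed: recent Airflow 2.x releases move it to a restarting state and terminate the running process before rescheduling it. Check the behavior on your version before clearing in-flight tasks.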
  • Task Instance States: Understand the different states (queued, running, success, failed, skipped, up_for_retry, up_for_reschedule, deferred). The CLI can help you inspect and manage these.
  • Configuration: Many Airflow settings are controlled via airflow.cfg or environment variables. The CLI commands operate within the context of this configuration.
  • Permissions: Ensure the user running the CLI commands has the necessary permissions to interact with the Airflow metadata database and potentially the underlying execution environment (e.g., file system access for DAG files).
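  • Version Compatibility: The CLI ships with the apache-airflow package, and commands and flags change between releases (for example, airflow db init became airflow db migrate). Check airflow <command> --help on your installation before relying on a flag.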
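The Configuration point above mentions environment variables. Airflow maps an option in airflow.cfg to a variable named AIRFLOW__{SECTION}__{KEY}, with double underscores as separators. A small sketch for deriving the name follows; the helper airflow_env_name is hypothetical, and load_examples in the core section serves as the example option.

```python
import os


def airflow_env_name(section, key):
    """Return the environment variable that overrides [section] key in airflow.cfg."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_name("core", "load_examples")
# Build an environment for a child process rather than mutating the current one.
child_env = {**os.environ, name: "False"}
print(name)  # AIRFLOW__CORE__LOAD_EXAMPLES
```

Passing child_env as the env argument of subprocess.run applies the override to a single CLI invocation only.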