Airflow started as an open-source project at Airbnb. Over the years, it has turned into one of the most popular platforms to create, manage and monitor workflows.

A workflow can be as small as sending an email alert or notification based on some trigger, or as large as a complex machine-learning pipeline with many moving parts. In other words, a workflow is a combination of several tasks that can be executed sequentially or in parallel.

Though Airflow is a highly extensible platform with many third-party provider packages available, the core Airflow setup comprises a Webserver, a Scheduler, and a CLI. Together, these components form the backbone of a minimal Airflow installation.

In this post, we will look at the step-by-step process to install Airflow on Ubuntu.

1 – Requirements to Install Airflow

Airflow requires Python as a dependency. Therefore, the first step is to check the Python installation on the server where you wish to set up Airflow.

You can do this by logging in to your server and executing the command python --version or python3 --version.
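For example:

python3 --version
Python 3.8.10

The exact version in the output will, of course, vary from system to system.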

If the output shows Python 3.6 or above, you are all set to start installing Airflow.

In case you have an older Python version, install a more recent one first.
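On Ubuntu, a minimal sketch for installing a recent Python 3 along with pip (assuming the apt package manager):

sudo apt update
sudo apt install -y python3 python3-pip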

2 – Airflow Installation

The Airflow installation needs a home directory. To set one, we can export the AIRFLOW_HOME environment variable using the below command:

export AIRFLOW_HOME=~/airflow

This step is optional; if AIRFLOW_HOME is not set, Airflow defaults to ~/airflow. But we will set it explicitly nonetheless for the sake of example.
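Note that export only applies to the current shell session. To make the setting persistent across logins, you could append it to your shell profile (a minimal sketch, assuming bash):

echo 'export AIRFLOW_HOME=~/airflow' >> ~/.bashrc
source ~/.bashrc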

Next, we can install Airflow using the below command:

pip3 install apache-airflow==2.1.0

Here, 2.1.0 is the Airflow version we wish to install.
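Note that the official Airflow documentation recommends installing with a constraints file, so that transitive dependencies resolve to versions tested with that release. A sketch, assuming Python 3.8 (adjust the Python version in the URL to match your own):

pip3 install "apache-airflow==2.1.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.0/constraints-3.8.txt"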

Useful Tip

In case we face an issue with pip while executing the above command, we can upgrade pip itself using the below command:

python3 -m pip install -U pip

Once the pip upgrade is successful, we can try installing apache-airflow once again.
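To confirm that the upgrade took effect, we can check the pip version (the exact output will vary):

python3 -m pip --version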

3 – Initializing the Airflow DB

The next step is to initialize the Airflow database:

airflow db init

By default, Airflow uses a SQLite database, and this command initializes the necessary tables. We can also connect other databases such as PostgreSQL and MySQL, but more on that in a later post. After the command finishes successfully, you will also see a bunch of files created in $AIRFLOW_HOME (here, /home/saurabh/airflow):

├── airflow.cfg
├── airflow.db
├── logs
│   └── scheduler
│       ├── 2021-06-21
│       └── latest -> /home/saurabh/airflow/logs/scheduler/2021-06-21
└── webserver_config.py

Here, the airflow.cfg file contains the configuration properties and various settings for Airflow. The airflow.db file is the SQLite database itself. There is also a logs directory and a webserver_config.py file.
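To confirm that Airflow can actually reach the database, we can run:

airflow db check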

4 – Setup Admin User

In order to access the Airflow web UI, we have to create an admin user using the below command:

airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email spiderman@superhero.org

You will be prompted for a password and confirmation.
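To verify that the user was created, we can list all existing users:

airflow users list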

5 – Start Webserver

The next step is to start the webserver. Below is the command for the same:

airflow webserver --port 8080 -D

The -D option starts the webserver as a daemon. Once the webserver starts, we can access it at http://localhost:8080 and log in using the user created in Step 4.
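Since the webserver runs as a daemon, there is no foreground process to interrupt with Ctrl+C. It writes its process ID to $AIRFLOW_HOME/airflow-webserver.pid, so one way to stop it later is:

kill $(cat $AIRFLOW_HOME/airflow-webserver.pid)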

[Screenshot: the Airflow web UI home page after logging in]

In the above screen, we see a warning that the scheduler is not running. There is also a list of sample DAGs that ship with the default installation.

6 – Start Scheduler

The scheduler is the component that actually manages and runs the various jobs. To start the scheduler, we can execute the below command:

airflow scheduler

Once the scheduler starts successfully, refreshing the web application in the browser makes the scheduler warning vanish. In other words, the webserver is now connected to a running scheduler instance and is ready to execute jobs.
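Like the webserver, the scheduler can also be run as a daemon, so that it keeps running after the terminal session ends:

airflow scheduler -D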


With this, we have completed the step-by-step guide to installing Airflow with all the core components. In the next post, we will create our first DAG and execute it. The official site for Airflow can be found at https://airflow.apache.org/.
