Installing Airflow

This topic includes instructions for using package managers to download and install Apache Airflow from the EEP repository.

For instructions to set up the EEP repository, see Step 11: Install Ecosystem Components Manually.

Installation on a Server Node or Edge Node

The Airflow client/server architecture requires you to install three packages on the server node or edge node:
  • mapr-airflow
  • mapr-airflow-webserver
  • mapr-airflow-scheduler

Both mapr-airflow-webserver and mapr-airflow-scheduler depend on mapr-airflow, so the package manager installs mapr-airflow automatically when you install either of them. Run the following commands as root or with sudo on an HPE Ezmeral Data Fabric cluster.

  1. On a node where you want to install Airflow, install mapr-airflow, mapr-airflow-webserver, and mapr-airflow-scheduler:
    • On Ubuntu:
      apt-get install mapr-airflow mapr-airflow-webserver mapr-airflow-scheduler
    • On RHEL/CentOS:
      yum install mapr-airflow mapr-airflow-webserver mapr-airflow-scheduler
    • On SLES:
      zypper install mapr-airflow mapr-airflow-webserver mapr-airflow-scheduler
    Note: On Oracle Enterprise Linux (OEL), you must perform the installation as the root user.
  2. Run configure.sh with the -R option:
    /opt/mapr/server/configure.sh -R
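
If you script the installation across nodes that run different operating systems, step 1 can be sketched as a small helper that maps a distribution ID to the matching install command. This is only an illustrative sketch: the function name airflow_install_cmd is hypothetical, and the mapping assumes the ID values that /etc/os-release typically reports (ubuntu, rhel, centos, ol, sles) and that OEL uses yum like RHEL. The helper only prints the command; it does not run it.

```shell
#!/bin/sh
# Sketch only: airflow_install_cmd is a hypothetical helper, not part of the product.
# The package names come from step 1 above.
PACKAGES="mapr-airflow mapr-airflow-webserver mapr-airflow-scheduler"

airflow_install_cmd() {
    # $1: distribution ID (assumed to match the ID field of /etc/os-release)
    case "$1" in
        ubuntu)
            echo "apt-get install $PACKAGES" ;;
        rhel|centos|ol)
            # Assumption: Oracle Enterprise Linux (ID "ol") uses yum like RHEL
            echo "yum install $PACKAGES" ;;
        sles)
            echo "zypper install $PACKAGES" ;;
        *)
            echo "unsupported distribution: $1" >&2
            return 1 ;;
    esac
}

# Example (commented out): print the command for the current node.
# . /etc/os-release && airflow_install_cmd "$ID"
```

Review the printed command, run it as root or with sudo, and then run configure.sh -R as described in step 2.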

Installation on a Client Node

You can also install Airflow on a client node. The installation steps are the same as for a server node or edge node. After installation on a client node, however, you must start and stop all Airflow services manually. For example:

To manage the webserver:
/opt/mapr/airflow/airflow-<version>/bin/airflow.sh [start|stop] webserver
To manage the scheduler:
/opt/mapr/airflow/airflow-<version>/bin/airflow.sh [start|stop] scheduler
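
Because the services on a client node are managed by hand, it can help to wrap the commands above in a helper that validates its arguments before building the command line. The following is a minimal sketch under stated assumptions: airflow_service_cmd and the AIRFLOW_VERSION variable are hypothetical names introduced for illustration, and you must set AIRFLOW_VERSION to the version directory that actually exists under /opt/mapr/airflow on your node. The helper only prints the command.

```shell
#!/bin/sh
# Sketch only: airflow_service_cmd and AIRFLOW_VERSION are hypothetical names.
# AIRFLOW_VERSION stands in for the <version> part of the installation path.

airflow_service_cmd() {
    action="$1"    # start or stop
    service="$2"   # webserver or scheduler

    case "$action" in
        start|stop) ;;
        *) echo "action must be start or stop" >&2; return 1 ;;
    esac
    case "$service" in
        webserver|scheduler) ;;
        *) echo "service must be webserver or scheduler" >&2; return 1 ;;
    esac
    if [ -z "${AIRFLOW_VERSION:-}" ]; then
        echo "AIRFLOW_VERSION is not set" >&2
        return 1
    fi

    # Print the full command line so it can be reviewed before it is run
    echo "/opt/mapr/airflow/airflow-${AIRFLOW_VERSION}/bin/airflow.sh $action $service"
}

# Example (commented out): stop both services, scheduler first.
# for svc in scheduler webserver; do
#     cmd=$(airflow_service_cmd stop "$svc") && $cmd
# done
```

Printing the command instead of running it directly keeps the sketch safe to try on any node; the commented loop shows one way to execute the result.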