Interface to Spark

Airflow provides an interface to Spark through provider packages, which supply the hooks and operators listed below.

For example DAGs that use the Spark provider, see <airflow_home>/build/env/lib/python3.9/site-packages/airflow/providers/ezmeral/spark/example_dags/.
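If the Python environment differs from the path above, the example directory can also be located from the package name. The following is a minimal sketch that assumes the examples are importable as airflow.providers.ezmeral.spark.example_dags (inferred from the path above); the helper names find_package_dir and list_example_dags are hypothetical and not part of the provider.

```python
import importlib.util
from pathlib import Path
from typing import List, Optional


def find_package_dir(package: str) -> Optional[Path]:
    """Return the directory of an importable package, or None if it is missing."""
    try:
        # For dotted names, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(package)
    except ImportError:
        # A parent package (for example, airflow itself) is not installed.
        return None
    if spec is None or not spec.submodule_search_locations:
        # Not found, or the name refers to a plain module rather than a package.
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def list_example_dags(
    package: str = "airflow.providers.ezmeral.spark.example_dags",  # assumed package name
) -> List[str]:
    """List the Python files in the example package, sorted by name."""
    directory = find_package_dir(package)
    if directory is None:
        return []
    return sorted(p.name for p in directory.glob("*.py") if p.name != "__init__.py")
```

Calling list_example_dags() from the interpreter that runs Airflow returns the example file names, or an empty list if the provider is not installed in that environment.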

Hooks

EzSparkSubmitHook
Python path: airflow.providers.ezmeral.spark.hooks.ezspark_submit
Description: Launches Spark applications. This hook wraps the spark-submit binary.
EzSparkSqlHook
Python path: airflow.providers.ezmeral.spark.hooks.ezspark_sql
Description: Enables interaction with binary tables through spark-sql. This hook is a wrapper around the spark-sql binary.
EzSparkJDBCHook
Python path: airflow.providers.ezmeral.spark.hooks.ezspark_jdbc
Description: Enables data transfers between JDBC databases and Apache Spark.
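To illustrate what "a wrapper around the spark-submit binary" means in practice, the sketch below assembles a spark-submit command line from Python arguments. It is not the EzSparkSubmitHook implementation, and the function name and parameters are hypothetical; only the spark-submit flags (--master, --name, --conf) are standard.

```python
import shlex
from typing import Dict, List, Optional


def build_spark_submit_command(
    application: str,
    master: str = "yarn",
    name: Optional[str] = None,
    conf: Optional[Dict[str, str]] = None,
    application_args: Optional[List[str]] = None,
) -> List[str]:
    """Assemble the argument list for a spark-submit invocation (illustrative only)."""
    command = ["spark-submit", "--master", master]
    if name:
        command += ["--name", name]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in (conf or {}).items():
        command += ["--conf", f"{key}={value}"]
    # The application file comes after the options, followed by its own arguments.
    command.append(application)
    command += list(application_args or [])
    return command


if __name__ == "__main__":
    cmd = build_spark_submit_command(
        "/tmp/app.py",  # hypothetical application path
        master="local[2]",
        conf={"spark.executor.memory": "2g"},
    )
    # Show the shell-quoted command instead of running it.
    print(shlex.join(cmd))
```

A hook of this kind would typically pass the resulting list to a subprocess call and monitor the output; the sketch stops at building the command so that it runs without a Spark installation.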

Operators

EzSparkSubmitOperator
Python path: airflow.providers.ezmeral.spark.operators.ezspark_submit
Description: Runs a Spark job. This operator wraps the spark-submit binary.
EzSparkSqlOperator
Python path: airflow.providers.ezmeral.spark.operators.ezspark_sql
Description: Executes a Spark SQL query.
EzSparkJDBCOperator
Python path: airflow.providers.ezmeral.spark.operators.ezspark_jdbc
Description: Extends the SparkSubmitOperator to enable data transfers between JDBC databases and Apache Spark.
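As a rough picture of what executing a query through the spark-sql binary involves, the sketch below builds a spark-sql command line using the CLI's -e (inline query) and -f (query file) options. The function name and the file-suffix check are illustrative assumptions, not confirmed behavior of EzSparkSqlOperator.

```python
from typing import List


def build_spark_sql_command(sql: str, master: str = "yarn") -> List[str]:
    """Assemble the argument list for a spark-sql invocation (illustrative only)."""
    command = ["spark-sql", "--master", master]
    statement = sql.strip()
    if statement.endswith((".sql", ".hql")):
        # Assumption: a value ending in .sql or .hql names a query file.
        command += ["-f", statement]
    else:
        # Otherwise the value is passed as an inline query string.
        command += ["-e", statement]
    return command
```

For example, build_spark_sql_command("SELECT 1") selects the -e form, while build_spark_sql_command("/tmp/report.sql") selects the -f form; the file path here is hypothetical.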