Hive UDFs

About this task

To use Hive UDFs in Impala, obtain the JAR files that contain the Hive UDFs, and use the Impala UDF deployment process to create new UDFs with new names.

Data types of arguments must match the function signature exactly when reusing Hive Java code for built-in functions.

To use a Hive UDF with Impala, complete the following steps:

Procedure

  1. Get a copy of the Hive JAR file with the UDFs that you want to use with Impala.
  2. Issue the following command to see a list of classes inside the JAR file: jar -tf <jar_filename>
  3. Copy the JAR file to a MapR filesystem location that Impala can read.
  4. From the impala-shell, create a database to use with the UDF.
  5. To identify the database that you want to query using the UDF, issue a USE statement for that database through the impala-shell, or specify the SQL function name as db_name.function_name.
  6. Issue a CREATE FUNCTION statement for each UDF that you want to use with Impala. The CREATE FUNCTION statement should contain a LOCATION clause with the full MapR filesystem path to the JAR file and a SYMBOL clause with the fully qualified name of the class. Use dots as separators in the class name, and do not include the .class extension.
  7. Issue a query and call the function. Pass arguments whose data types match the types declared in the CREATE FUNCTION statement.
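
Example

As a rough illustration, steps 4 through 7 might look like the following in the impala-shell. This is a minimal sketch, not a definitive procedure: the database name udf_db, the function name my_lower, and the JAR path are hypothetical placeholders, and the class org.apache.hadoop.hive.ql.udf.UDFLower is assumed to be present in the JAR file that you copied in step 3. Use the output of jar -tf from step 2 to confirm the class name in your own JAR.

```sql
-- Step 4: create a database to hold the function (udf_db is a hypothetical name).
CREATE DATABASE IF NOT EXISTS udf_db;

-- Step 5: make it the current database so the function name needs no db_name prefix.
USE udf_db;

-- Step 6: register the Hive class as an Impala function under a new name.
-- The LOCATION path is a placeholder; substitute the path you copied the JAR to in step 3.
-- The SYMBOL value is the fully qualified class name, dot-separated, without .class.
CREATE FUNCTION my_lower(STRING) RETURNS STRING
  LOCATION '/user/impala/udfs/hive-exec.jar'
  SYMBOL='org.apache.hadoop.hive.ql.udf.UDFLower';

-- Step 7: call the function with an argument of the declared type (STRING).
SELECT my_lower('Some String');

-- The same call from any other database, using the qualified function name.
SELECT udf_db.my_lower('Some String');
```

Because the function is declared with a STRING argument, a value of another type would need an explicit CAST to STRING before being passed to it.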