Monday, October 5, 2015

Create Hive custom User-Defined Functions (UDFs)

Hive scripts use an SQL-like language that integrates queries into the MapReduce programming model. With a large set of built-in operators and functions, plus support for user-created custom UDFs, Hive finds a wide range of applications in big data analytics.

This blog post summarizes the steps to create custom UDFs. The official Apache Hive documentation has more detailed information.

A UDF is written as a Java class and packaged as a jar.
Processing takes place inside the evaluate() function.
Once the coding is finished, export the project as a jar file and upload it to S3.
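
The original code listing did not survive on this page, so here is a minimal sketch of what such a skeleton might look like. It assumes the classic simple-UDF API (org.apache.hadoop.hive.ql.exec.UDF, which requires hive-exec and hadoop-common on the build path). The class name ToUpper and the upper-casing logic are hypothetical placeholders, not the original post's code.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example UDF: upper-cases a string column.
// Replace the body of evaluate() with your own processing.
public class ToUpper extends UDF {

    // Hive calls evaluate() once per row.
    public Text evaluate(Text input) {
        // Pass SQL NULL through unchanged.
        if (input == null) {
            return null;
        }
        return new Text(input.toString().toUpperCase());
    }
}
```

Hive looks up evaluate() by reflection, so overloads taking other argument types can be added to the same class if the function should accept more than one signature.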

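Hive scripts can then access the UDF by adding the jar to the session with ADD JAR and registering the class under a function name with CREATE TEMPORARY FUNCTION.

The declared function can then be used in SQL-like statements, such as a SELECT query, just like a built-in function.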

