Parsing HTTP User-Agents with Teradata
Parsing User-Agents with Teradata User-Installed-Files (UIF) and Python has been discussed and demonstrated here.
A slight setback with the UIF approach was that we needed to install an additional Python package (ua-parser) to support the parsing, i.e., the default Python installation on the Teradata Node did not include capabilities for parsing User-Agents. Getting additional Python packages installed onto Teradata Node/s in a “real” Production environment is not something that can be rapidly accomplished. This is due to the fact that DBAs, Teradata Engineers and other implementation procedures would need to be involved, which might make the process fairly lengthy.
This article describes the process for using Teradata Java-User-Defined-Functions (JUDF) for implementing User-Agent parsing capabilities.
Pre-requisites
If you’re familiar with Java Development using Maven, then you can go ahead and set yourself up with the dependencies. The pom.xml can be found at https://github.com/ua-parser/uap-java.
For this article, a Java IDE will not be used.
Project Folder
Create a folder to store the UDF components. I’m using C:\Apps\Judf
Install and Configure Java Development Kit (JDK)
I’m running JDK version 1.8.0_231
on Windows 10 x64.
JDK for my setup is installed into target directory:
C:\Program Files\Java\jdk1.8.0_231
After the installation is complete, add the following to your System Environment Variables:

Include the following to your System Path Environment variable:
%JAVA_HOME%\bin
Download ua-parser Jar and Dependencies
Download the following jars and save them to your project folder.
uap-java-1.4.0.jar:
https://repo1.maven.org/maven2/com/github/ua-parser/uap-java/1.4.0/uap-java-1.4.0.jar
gson-2.8.6.jar:
https://repo1.maven.org/maven2/com/google/code/gson/gson/2.8.6/gson-2.8.6.jar
snakeyaml-1.20.jar
https://repo1.maven.org/maven2/org/yaml/snakeyaml/1.20/snakeyaml-1.20.jar
Create Teradata Database for the JUDF components
Create the following database to hold our UDF jars/function:
CREATE DATABASE MYPROJ FROM DBC AS PERM=200E6;
Grant the following access to the user that will be installing the UDF:
GRANT EXECUTE FUNCTION ON MYPROJ TO <username>;
GRANT CREATE FUNCTION ON MYPROJ TO <username>;
GRANT EXECUTE PROCEDURE ON SQLJ TO <username>;
GRANT CREATE EXTERNAL PROCEDURE ON MYPROJ TO <username>;
JAVA UDF Source and Packaging
Code the UDF Class and Method
Create a file in your project folder named parseUA.java
and paste in the following code:
import ua_parser.Parser;
import ua_parser.Client;import java.io.IOException;import com.google.gson.Gson;public class parseUA {public static String uaJson (String raw_ua) throws IOException {
Parser newParser = new Parser();
Client agent = newParser.parse(raw_ua.trim());
Gson gson = new Gson();
String returnValue = gson.toJson(agent);if (raw_ua.trim().isEmpty()) {
returnValue = "";
}
return returnValue;
}
}
Compile Java Source Code
The parseUA.java
can be compiled to generate an output .class
file using the following:
PS C:\Apps\Judf> javac -classpath `
";.\gson-2.8.6.jar;.\snakeyaml-1.20.jar;.\uap-java-1.4.0.jar" `
parseUA.java
Package Java UDF Class into .JAR
Generate a jar containing the compiled Java Class (parseUA.class
) by running the following:
PS C:\Apps\Judf> jar.exe -cf ua-udf.jar parseUA.class
Deploy JAR Files to Teradata
To deploy the JAR files into our target Teradata database (MYPROJ
), we will need to use procedure SQLJ.INSTALL_JAR
.
Install JARs
The following will install the JARs required for our UDF into database “MYPROJ
”:
DATABASE MYPROJ;
call sqlj.install_jar('cj!gson-2.8.6.jar','gson-2.8.6',0);
call sqlj.install_jar('cj!snakeyaml-1.20.jar','snakeyaml-1.20',0);
call sqlj.install_jar('cj!uap-java-1.4.0.jar','uap-java-1.4.0',0);
call sqlj.install_jar('cj!ua-udf.jar','ua-udf',0);
Use BTEQ to run the above, and ensure that BTEQ is invoked from the path where your project resides.
Output of a sample run looks like this:
C:\Apps\Judf>bteqEnter your logon or BTEQ command:
.logon localtd/<your_username>;
Password:<your_password>DATABASE MYPROJ;*** New default database accepted.
*** Total elapsed time was 1 second.
call sqlj.install_jar('cj!gson-2.8.6.jar','gson-2.8.6',0);*** Procedure has been executed.
call sqlj.install_jar('cj!snakeyaml-1.20.jar','snakeyaml-1.20',0);*** Procedure has been executed.call sqlj.install_jar('cj!uap-java-1.4.0.jar','uap-java-1.4.0',0);*** Procedure has been executed.
call sqlj.install_jar('cj!ua-udf.jar','ua-udf',0);*** Procedure has been executed.
Set JAR Associations
Since the JARs we’ve just deployed are all associated (i.e. our JAVA source code uses classes from all of the installed JARs), we need to ensure that the required classes can be loaded when the UDF is invoked. This is accomplished by executing procedure SQLJ.ALTER_JAVA_PATH
.
Execute this procedure as follows (via BTEQ):
DATABASE MYPROJ;CALL SQLJ.ALTER_JAVA_PATH('ua-udf'
,'(*,SNAKEYAML-1.20)(*,GSON-2.8.6)(*,UAP-JAVA-1.4.0)');
The parameters for the second statement above can be explained as follows:
'ua-udf'
This is the ID of the jar file we created (ua-udf.jar) that contains the compiled parseUA.class
'(*,SNAKEYAML-1.20)(*,GSON-2.8.6)(*,UAP-JAVA-1.4.0)')
This line lists the JARs and classes that will be searched when the UDF is invoked. For example, the value of (*,SNAKEYAML-1.20)
interprets to, search all classes (*) in JAR file snakeyaml-1.20.jar.
After executing the statements, you can check if the associations have been made using the following SQL:
select distinct b.databasename,
a.JarName,
a.PathJarName
from DBC.JAR_JAR_USAGE a , DBC.DBASE b ,DBC.TVM cwhere a.DatabaseId =b.DatabaseId
and b.DatabaseId=c.DatabaseId
and c.TableKind='D'
and b.DatabaseNameI='MYPROJ'
;
If you’re at a site where access to dbc dictionary tables is restricted, then you can use the JAR-specific metadata objects in SQLJ
:
DATABASE MYPROJ;sel distinct database,a.jarname,b.pathjarname
from SQLJ.JARS a
join SQLJ.JAR_JAR_USAGE b
on a.databaseid=b.databaseid
and a.jarname=b.jarname
;

Install Java UDF to Target Database
Our function can be installed via BTEQ using the following definition:
DATABASE MYPROJ;REPLACE FUNCTION MYPROJ.UAJSON
(raw_ua VARCHAR(1024) CHARACTER SET UNICODE)
RETURNS VARCHAR(2000) CHARACTER SET UNICODE
SPECIFIC uajson
LANGUAGE JAVA
NO SQL
NO EXTERNAL DATA
PARAMETER STYLE JAVA
RETURNS NULL ON NULL INPUT
EXTERNAL NAME 'ua-udf:parseUA.uaJson'
;
Test UDF
The UDF takes an input parameter, User-Agent as a string, and outputs a parse-able JSON version.
To test, the following User-Agent can be used:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
Run the UDF as follows using this Agent string:
SEL
MYPROJ.UAJSON('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36');
The output is a single JSON field, which looks like the following after neatly formatting using a JSON formatter:
{
"userAgent": {
"family": "Chrome",
"major": "48",
"minor": "0",
"patch": "2564"
},
"os": {
"family": "Mac OS X",
"major": "10",
"minor": "10",
"patch": "5"
},
"device": {
"family": "Other"
}
}
To extract components/fields from the new JSON version of the User-Agent, Teradata JSON functions can be used, for example, to extract the User-Agent “family
”, you could run the following:
SEL
cast(
MYPROJ.UAJSON(
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
)
AS JSON
).JsonExtractValue('$.userAgent.family')(VARCHAR(50))
AS User_Agent_Family

Uninstall/Remove Jars & UDF
To remove all objects associated with the installation, run the following:
DATABASE MYPROJ;DROP FUNCTION MYPROJ.UAJSON;call sqlj.alter_java_path('ua-udf','')call sqlj.remove_jar('ua-udf',0)
call sqlj.remove_jar('gson-2.8.6',0);
call sqlj.remove_jar('snakeyaml-1.20',0);
call sqlj.remove_jar('uap-java-1.4.0',0);
Closing Points
Hopefully this article has helped in demonstrating the flexibility that Java UDFs offer.
One important point to note is, unlike UDFs written in C, UDFs written in Java can only run in protected mode. If you plan on implementing UDFs using JAVA, performance metrics should be taken into consideration and assessed during testing.
If you are interested in developing Java UDFs using Eclipse, then you may want to check out the Teradata Plugin for Eclipse.