Teradata User Installed File (UIF)
Teradata’s SCRIPT Table Operator allows us to execute a custom user script/command, which can be installed onto the database as a User-Installed-File (UIF).
At a high level, the Operator works by sending data from sql “select” statements to the script for processing. The script then reads and processes data from stdin and returns its output to stdout (answerset).
In order to demonstrate how it all works, I’ll go through the process of downloading a demo Teradata Virtual Machine instance which, will be used to do the discovery and run some sample code.
VMPlayer/VMWare Workstation
If you do not have VMPlayer/Workstation installed, head over to https://www.vmware.com/au/products/workstation-player.html and download the free copy (for non-commercial use). You’ll need the software to run the Teradata Node.
Teradata Express for VMWare Player
Download a demo version of “Teradata Express for VMWare Player” from https://downloads.teradata.com/download/database/teradata-express-for-vmware-player.
As of now, the latest version is 16.20. You’ll need to signup/register before being able to access the link. Registration is free.

Download the .7z archive, which contains the VMWare image of the SUSE Linux OS with Teradata pre-installed.
Save the file to a local folder and use 7zip to extract the contents. You should have the following files in the extracted folder:
PS S:\Virtual_Machines\Teradata\TDExpress16.20_Sles11>Name
--------------------------------------------------
TDExpress16.20_Sles11-disk1.vmdk
TDExpress16.20_Sles11-disk2.vmdk
TDExpress16.20_Sles11-disk3.vmdk
TDExpress16.20_Sles11.vmx
Double click the file ending .vmx
file , which contains the config for the Teradata VM.
Secure Network Access to the Teradata VM
Before launching the VM, we’ll take some precautions and ensure that we secure network connectivity to the guest by setting up a host/guest only network.
This can be accomplished by selecting the VM from the list:

and then:
- Player -> Manage -> Virtual Machine Settings
- “Hardware”
- “Network Adaptor”
- “Host-only: A private network shared with the host””
- Click on OK to save changes
Start Teradata VM and Test Connectivity
Launch the VM by clicking “Play Virtual Machine”.

Once the logon dialog appears, login with username: root
and password:root

After successfully logging in, you should be at the VM’s workspace screen.

Some basic checks before proceeding. We’ll check if we can “ping” the guest VM from the host.
Open up a “Terminal” within the guest, and check the IP address by issuing an ifconfig
.

The above shows that the VM has been assigned an IP address of 192.168.96.129 (your IP may be different).
Ping the the above IP address from the host. I’m running a Windows 10 OS, so from Powershell, a ping returns the following:

This test indicates that the guest/host network active. For the examples that follow, you can either choose to connect to the guest from your host using the IP address identified by the ifconfig
above, or via commands issued directly from within the VM.
To check if Teradata has launched at startup, from a Terminal on the VM, enter the command pdestate
:

Teradata should be up and running, it can take a few minutes for Teradata PDE/RDBMS to launch at startup.
With Teradata up and running, let’s run a test query via bteq
to check connectivity to the RDMS. A simple “select date
” query will do.
TDExpress1620_Sles11:~ # bteq
Teradata BTEQ 16.20.00.04 for LINUX. PID: 25476
Copyright 1984–2018, Teradata Corporation. ALL RIGHTS RESERVED.
Enter your logon or BTEQ command:Password:dbc*** Logon successfully completed.
*** Teradata Database Release is 16.20.32.01
*** Teradata Database Version is 16.20.32.01
*** Transaction Semantics are BTET.
*** Session Character Set Name is ‘ASCII’.
*** Total elapsed time was 1 second.BTEQ — Enter your SQL request or BTEQ command:select date*** Query completed. One row found. One column returned.** Total elapsed time was 1 second.
TERADATA SCRIPT Table Operator
The official documentation for the Teradata version we’re running (16.20) confirms that UIFs are supported. If you are planning on testing with an alternate version, check the documentation to ensure UIFs are supported.
Below are some basic examples of running some basic commands.
Example 1: Determine Linux User Executing Script
The following command lists the Linux user that executes the script on the Teradata Node.
SELECT DISTINCT *SCRIPT_COMMAND('whoami')RETURNS ('stdout VARCHAR(50)');*** Query completed. One row found. One column returned.
*** Total elapsed time was 3 seconds.stdout
--------------------------------------------------
tdatuser
The result shows that the Linux user is tdatuser. This is consistent with the documentation:
"The text included in the SCRIPT_COMMAND clause runs in its own shell inside a temporary working directory.........installed script files run in protected mode (outside of the Teradata Database). The scripts are run as Tdatuser."
Example 2: List Python Version Running SMP/TPA Node
The example below runs code to display the version of python installed on the Teradata SMP/TPA Node.
SELECT DISTINCT *FROM SCRIPT (SCRIPT_COMMAND('echo "import sys
print(sys.version)" | /usr/bin/python'RETURNS (‘stdout VARCHAR(100)’);*** Query completed. One row found. One column returned.
*** Total elapsed time was 4 seconds.stdout
--------------------------------------------------
(2, 6, 9, 'final', 0)
The version of python currently installed is, 2.6.9, which is fairly outdated.
TERADATA SCRIPT Operator and Python
Python is by far one of the most appealing choices when it comes to choosing a coding language to meet your solution needs. With the Teradata Script Operator, we have the option of writing our own Python script (among other types) to process data. The script can be installed onto the database and used within our sql.
The best way of demonstrating how it all works is best illustrated with a worked solution, which is described next.
Creating a Teradata SHA256 Python UIF
This demo will show the process involved for creating, testing and installing a python UIF which processes emails stored in a database table, and outputs the raw email along with its associated sha256 hash.
Create Teradata Database Sandpit
Start by creating a sandpit database to store the UIFs and other objects that we’ll use during the process. The following will create a 2GB database named “myproj
” as a child of “dbc
”.
CREATE DATABASE MYPROJ FROM DBC AS PERM=2E9;
The access rights required are as follows:
--This grants all rights on the sandpit. In a Prod environment, --grants would not be this openGRANT ALL ON MYPROJ TO DBC;--Required for UIF Maintenance/InstallationGRANT CREATE EXTERNAL PROCEDURE ON MYPROJ TO DBC;
GRANT EXECUTE FUNCTION on TD_SYSFNLIB.SCRIPT TO DBC;
Create Sample Data
The following table will hold the raw emails addresses.
CREATE SET TABLE MYPROJ.EMAILS,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNALEMAIL_ADDR VARCHAR(255) CHARACTER SET LATIN NOT CASESPECIFICPRIMARY INDEX ( EMAIL_ADDR );
Populate the table with some fake emails.
INSERT INTO MYPROJ.EMAILS VALUES ('myemail@test.com') ;
INSERT INTO MYPROJ.EMAILS VALUES ('anotheremail@test.com') ;
Write and Test Python Script
The sample Python code below generates sha256 hashes for supplied input strings (in this case sample email addresses used above) and outputs the raw email along with its sha256 hash.
import hashlibdef sha256_hash(raw_data):
data = []
for rec in raw_data:
rec = rec.strip()
if rec:
data.append("\t".join([rec,hashlib.sha256(rec).hexdigest()]))
return dataif __name__ == "__main__":
emails = ["myemail@test.com", "anotheremail@test.com"]
print("\n".join(sha256_hash(emails)))
The code generates the following hashes:
myemail@test.com f1614ba6bcc183e5836483c0c48d1bf932ab1fe338d0ef097d18555a3a2614bdanotheremail@test.com c23cd2934c7912140abf66f3f2d6e0f7cb7c17de14393a0daa5d9fe618a1714d
To ensure the hashes are correct, choose one email (myemail@test.com
) and test its sha256 hash at https://asecuritysite.com/encryption/md5
:

The sha256 hash matches the output of the Python script, which indicates the script works as expected.
Prepare Python Script for Installation as Teradata UIF
The below shows the modified version of the script, which would need to be installed as a UIF onto the database.
#!/usr/bin/python
import sys
import hashlib
for rec in sys.stdin:
rec = rec.strip().split("\t")
if rec:
print("\t".join([rec[0],hashlib.sha256(rec[0]).hexdigest()]))
The script sets the environment to “/usr/bin/python
”, reads rows (with tab delimited fields) from sysin, and then returns the output to stdout.
We can now start working on getting the script installed onto the database
Save the script locally. I’ve saved to directory/root/Desktop/uif_scripts/sha256.py
within the guest.
Installing the Python Script as a UIF
Open a bash terminal on the guest and change directory to the location where the UIF-ready script was saved to. Use bteq
to install the script as UIF to database myproj
:
.logon dbc;
<enter password>
DATABASE MYPROJ;
SET SESSION SEARCHUIFDBPATH = MYPROJ;
CALL SYSUIF.INSTALL_FILE('sha256','sha256.py','cz!sha256.py');
HELP DATABASE MYPROJ;
The output of the run should look similar to the following:
TDExpress1620_Sles11:/ # cd /root/Desktop/uif_scriptsTDExpress1620_Sles11:~/Desktop/uif_scripts # bteqTeradata BTEQ 16.20.00.04 for LINUX. PID: 12730Copyright 1984-2018, Teradata Corporation. ALL RIGHTS RESERVED.
Enter your logon or BTEQ command:
.logon dbc
Password:
*** Logon successfully completed.
*** Teradata Database Release is 16.20.32.01
*** Teradata Database Version is 16.20.32.01
*** Transaction Semantics are BTET.
*** Session Character Set Name is 'ASCII'.
*** Total elapsed time was 1 second.BTEQ -- Enter your SQL request or BTEQ command: DATABASE MYPROJ;*** New default database accepted.
*** Total elapsed time was 1 second.BTEQ -- Enter your SQL request or BTEQ command: SET SESSION SEARCHUIFDBPATH = MYPROJ;*** Session UIF Search Database Path has been set.
*** Total elapsed time was 1 second.BTEQ -- Enter your SQL request or BTEQ command: CALL SYSUIF.INSTALL_FILE('sha256','sha256.py','cz!sha256.py');*** Procedure has been executed.
*** Total elapsed time was 1 second.BTEQ -- Enter your SQL request or BTEQ command: HELP DATABASE MYPROJ;*** Help information returned. 3 rows.
*** Total elapsed time was 1 second.Table/View/Macro name Kind Comment
------------------------------ ---- -------
EMAILS T ?
sha256 Z ?
Running the UIF: MYPROJ.sha256
Before we can run the UIF, we need to let Teradata know the database location of the script. This is achieved by running the following for the session. This parameter is valid for the session duration and will be unset once your session is no longer active:
SET SESSION SEARCHUIFDBPATH = MYPROJ;
The UIF can now be run against our test table (myproj.emails
) using the following code.
SEL
SCR_OPER.RAW_EMAIL
,SCR_OPER.HASHED_EMAIL
FROM SCRIPT (ON (sel email_addr FROM myproj.emails)
SCRIPT_COMMAND('./MYPROJ/sha256.py')
RETURNS ('RAW_EMAIL VARCHAR(55)' ,' HASHED_EMAIL VARCHAR(100)')
) AS SCR_OPER
The following output is returned:
RAW_EMAIL HASHED_EMAIL
anotheremail@test.com c23cd2934c7912140abf66f3f2d6e0f7cb7c17de14393a0daa5d9fe618a1714d
myemail@test.com f1614ba6bcc183e5836483c0c48d1bf932ab1fe338d0ef097d18555a3a2614bd
Conclusion and Extended Topics
Hopefully this article has given a decent insight into UIFs and the flexibility they offer.
There are a some other topics that will be covered as an extension to this article. They were not included in this article as they deserve a dedicated discussion.
These topics include:
- Extending the capabilities of UIFs to harness advanced features of other python packages not included in the default version of python
- Options and workarounds for having to explicitly set UIF database search path (
SEARCHUIFDBPATH
)