Teradata User-Installed-Files (UIF) and the Database Search Path

In a previous article, I gave an overview of UIF features and the process required to setup, install and test a basic python UIF (MYPROJ.sha256) for generating a sha256 hash.

This article aims to expand on the initial learnings to cover off on:

  • Methods of invoking UIFs without having to specifically declare the UIF database search path for each session, i.e. SET SESSION SEARCHUIFDBPATH = <databasename>;

Using the User’ STARTUP String

The following “Create User” statement for creating userid SAMPLE_USER includes a STARTUP parameter, which sets the UIF database search path for each session login.

CREATE USER SAMPLE_USER FROM DBC
AS PASSWORD = ********
,PERM=0
STARTUP = 'SET SESSION SEARCHUIFDBPATH = MYPROJ;'
;

The user SAMPLE_USER can also opt to modify this parameter for themselves using:

MODIFY SAMPLE_USER AS 
STARTUP = ‘SET SESSION SEARCHUIFDBPATH = MYPROJ;’

NOTE:
There are LIMITATIONS to this approach, and they are made clear in the
Teradata Documentation:

“Purpose of the STARTUP String”

“The length of a startup string is limited to 255 characters…….”

“The startup string is processed when:

  • You log on to Teradata Database through BTEQ
  • You establish a JDBC connection using the Teradata JDBC Driver and have specified the connection parameter RUNSTARTUP=ON
  • Only BTEQ, the Teradata JDBC Driver, and CLIv2 support startup strings.”

Custom Tweak, Use this at Your Own Risk

I’ll explain the thought process that led to conclusion/result using the UIF script (MYPROJ.sha256) that was created in my previous article. By all means, if you are reading this, and are able to provide more insight, feel free to do so by leaving a note.

Question 1: If the UIF/script runs on the node, which directory on the Linux filesystem holds our Script?

The Teradata documentation tells us that in order to troubleshoot a UIF script execution, we can examine the script log file on the node/s, located at /var/opt/teradata/tdtemp/uiflib, and that this directory also holds our UIFs.

Let’s logon to the Node and find out some details about the child objects of this path.

From a Terminal, we run run the following to get a recursive list of all files and sub-directories:

find /var/opt/teradata/tdtemp/uiflib

This returns:

/var/opt/teradata/tdtemp/uiflib
/var/opt/teradata/tdtemp/uiflib/tdbs_1032
/var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py
/var/opt/teradata/tdtemp/uiflib/uifrev.log
/var/opt/teradata/tdtemp/uiflib/scriptlog

So, from the above, directory /var/opt/teradata/tdtemp/uiflib/ is the parent for UIF related objects. If we take a closer look, we see that the sha256 UIF has been stored as sha256.py at:

/var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py

Let’s make sure that this file is the sha256 python code we installed onto the database as a UIF by listing its contents via cat:

cat /var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py

This returns:

#!/usr/bin/pythonimport sys
import hashlib
for rec in sys.stdin:
rec = rec.strip().split("\t")
if rec:
print("\t".join([rec[0],hashlib.sha256(rec[0]).hexdigest()]))

This is definitely the code for the sha256 UIF we installed onto the database. A show file will confirm this:

Image for post
Image for post

So, the answer to the subject question, our UIF has been stored on the node at:

/var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py

Question 2: Now that we know the absolute path of our UIF Script on the Node, can we use this to forgo having to set the SEARCHUIFDBPATH

What we’re trying to prove is that if we run the UIF from bteq or sql assistant and use the absolute path instead of the relative path, then our UIF should execute without having to issue aSET SESSION SEARCHUIFDBPATH = <databasename>;

Let’s test this theory.

The existing code that we used in the previous article to run our UIF is as follows:

SET SESSION SEARCHUIFDBPATH = MYPROJ
;
SEL
SCR_OPER.RAW_EMAIL
,SCR_OPER.HASHED_EMAIL
FROM SCRIPT (ON (sel email_addr FROM myproj.emails)
SCRIPT_COMMAND('./MYPROJ/sha256.py')
RETURNS ('RAW_EMAIL VARCHAR(55)' ,' HASHED_EMAIL VARCHAR(100)')
) AS SCR_OPER
;

We need to change the relative path in this code from:

SCRIPT_COMMAND('./MYPROJ/sha256.py')

to

SCRIPT_COMMAND('/var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py')

So, our new code becomes:

SEL 
SCR_OPER.RAW_EMAIL
,SCR_OPER.HASHED_EMAIL
FROM SCRIPT (ON (sel email_addr FROM myproj.emails)
SCRIPT_COMMAND('/var/opt/teradata/tdtemp/uiflib/tdbs_1032/sha256.py')
RETURNS ('RAW_EMAIL VARCHAR(55)' ,' HASHED_EMAIL VARCHAR(100)')
) AS SCR_OPER
;

Now we can try running this code WITHOUT issuing a:

SET SESSION SEARCHUIFDBPATH = MYPROJ;

Log off any active sessions and log back in again. Now run the new code and check the output:

Image for post
Image for post

Wow! It runs successfully without the need to set the SEARCHUIFDBPATH.

Question 3: Ok, this works, BUT, how on earth was the tdbs_1032 subdirectory derived in the absolute path to the UIF, i.e
/var/opt/teradata/tdtemp/uiflib/tdbs_1032

I did try digging around in the doco to find some info on how this subdirectory is created but ran out of luck.

So, I took a “brute force” style thinking approach.

I started out by assuming that tdbs_1032 maps in some way to the database where the UIF was installed to (MYPROJ).

tdbs_ appears to be a naming convention, most likely derived from the term “Teradata Database System”.

1032 is cryptic, but, my theory is that this number has been generated using the database dictionary tables field/fields. I’m going to work through the assumption that the 1032 is derived in some way from dbc.dbase.databaseid for the database where the UIF has been installed (MYPROJ).

By running the following query:

SELECT UIF.DATABASEID,
DB.DATABASENAMEI,
UIF.UIFFileName
FROM DBC.UIF_INFO UIF
INNER JOIN DBC.DBASE DB
ON UIF.DatabaseId = DB.DatabaseId
WHERE DATABASENAME='MYPROJ'
AND UIFFILENAME='sha256.py'
;

we get the following output:

Image for post
Image for post
DatabaseId  DatabaseNameI UIFFileName
00–00–08–04 MYPROJ sha256.py

By looking at the column definition in the DDL for dbc.dbase.databaseid, we can see that it’s defined as:

DatabaseId BYTE(4) NOT NULL

So, the value of databaseid from the previous query is 00–00–08–04, which is BYTE data represented in HEX, i.e. '00000804'xb .

I need a quick way of converting this HEX value to decimal/integer using different “Endianess” .

Head over to https://www.scadacore.com/tools/programming-calculators/online-hex-converter/ and paste in the HEX value of our DatabaseId 00–00–08–04 and click on “AnalyzeData”.

You should see the following output:

Image for post
Image for post

By looking at the conversion above, it’s becoming a bit more obvious.

The DatabaseIdis being converted to a 32 bit integer, BUT, before this happens, the byte orders have changed.

The integer 2052 value highlighted in the two leftmost boxes was derived using Big-Endian format of the DatabaseId

The ABDC refers to the order of the bytes.

The DatabaseIdin Big-Endian is represented as (hex).

A    B    C    D
00 00 08 04

This converts to integer value of 2052.

If the DatabaseIdbytes are treated as having Mid-Endian Byte order, then we have:

B    A    D    C
00 00 04 08

This hex value converts to integer value of 1032, which is the same value as seen in the directory path to the UIF script, i.e.

/var/opt/teradata/tdtemp/uiflib/tdbs_1032                         

Key UIF Details

sel 
DATABASENAME
,UIFFileName
--Convert Hex Bytes to Chars and LEFT PAD with 00xb
,CAST(LPAD(FROM_BYTES('00'xb||UIF.databaseid, 'BASE16') ,8,'0') AS CHAR(8)) as DatabaseID_Bytes2HexChar
--Extract each bytes so we can re-order them
,SUBSTR(DatabaseID_Bytes2HexChar, 1 for 2) as A
,SUBSTR(DatabaseID_Bytes2HexChar, 3 for 2) as B
,SUBSTR(DatabaseID_Bytes2HexChar, 5 for 2) as C
,SUBSTR(DatabaseID_Bytes2HexChar, 7 for 2) as D
--Big Endian Byte order
,(A||B||C||D) AS BIG_ENDIAN_HEX_CHARS
--Order Bytes to Mid-Endian format
,(B||A||D||C) AS MID_ENDIAN_HEX_CHARS
--Convert the Hex Mid-Endian representation Integer
,TO_NUMBER(MID_ENDIAN_HEX_CHARS, 'xxxxxxxx' ) AS MID_ENDIAN_INT
--Our UIF Absolute Path
,'/var/opt/teradata/tdtemp/uiflib/'||'tdbs_'
||trim(MID_ENDIAN_INT)||'/'
||trim(UIFFileName)
AS UIF_Absol_PATH

FROM DBC.DBASE DB
INNER JOIN DBC.UIF_Info UIF
ON DB.DatabaseId=UIF.DatabaseId
WHERE DATABASENAMEI='MYPROJ'
and UIFFileName='sha256.py'
Image for post
Image for post

Take-Home Points

If you are using BTEQ, CLIv2 or Teradata' JDBC Driver, then:

  • You modify, or set the User’s STARTUP at creation, to include a statement for setting the database UIF search path
  • For JDBC connections, in addition to setting the user’ STARTUP string, you will also need to set the driver option RUNSTARTUP=ON

If you are using ODBC/.Net or any other other connection type including the ones listed above, you can use the “Custom Tweak” described in this article.

At a high level, the “Custom Tweak” method uses the absolute path to the UIF, which is derived as follows:

  • Edit the SQL mentioned in “Key UIF Details” to filter for the required UIF in question
  • The query will return several columns with one of them being UIF_Absol_PATH
  • Use the value returned for UIF_Absol_PATH in the UIF’s SCRIPT_COMMAND to forgo having to set the SEARCHUIFDBPATH

Written by

Primarily a Learner/Coder with interests in Python, Cloud Technologies, Security and Automation. Pandas munching on Bamboo sticks give me the “Giggles” :))

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store