Setup Rabix Composer Local Executor using WSL2
--
In the space of Data Science/Analytics, we rely heavily on automation to drive insights from several data sources. Docker, Python, Node.js, JSON and YAML are among the wide range of technologies used as part of that automation.
I recently came across Rabix and Common Workflow Language (CWL) in the space of biomedical research. As a developer, and not having come from a biomedical background, I wanted to explore opportunities for using Rabix/CWL.
After looking into Cancer Genomics Cloud and signing up, I started to get a feel for Rabix/CWL. I decided to set up a local Rabix/CWL playground in my Windows WSL2 environment. This would also be a prelude to some personal discovery work into CWL & Apache Airflow.
This article aims to cover the following:
- Installation of Rabix Composer on Windows 10, with WSL2 installed/enabled
- Configuring the Rabix Local Executor which would allow workflows to be run/tested locally
- Creation/Execution of sample pipelines
WSL2 References & Prerequisites
It’s important to note that Linux distributions enabled with WSL2 are Windows hosted Virtual Machines which run using a Linux Kernel. As such, we’re able to run Linux apps natively, including Rabix.
My attempts to get the Windows version of Rabix working with a local executor were unsuccessful. Tinkering with different Java versions might get you over the line if you decide to experiment with this approach.
Requirements
- To access the Rabix Composer GUI from the WSL2 VM, we need an X server running on the Windows host to render the display. For this article, the Home Edition of MobaXterm is used as the X server
- A WSL2-enabled Linux distribution is required. For this setup, Debian 10 (Buster) from the Microsoft Store was used. The release details for the OS are shown below.
$ sudo apt install lsb-release
$ lsb_release -a
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
- Docker Desktop needs to be installed, with WSL2 integration enabled for the respective Linux distribution
Install/Configure XServer (MobaXterm)
- Download MobaXterm
- From the Windows host, install MobaXterm
- Launch the program and select Settings --> Configuration. Choose the options shown below on the X11 tab
Note: If you are on a shared network, setting “X11 remote access” to “full” may not be appropriate, in which case select “restricted”
- We need to know the X server's IP address. It is required to tell the WSL2 Linux distro where (DISPLAY=) the Rabix display is to be sent
- To find the X server IP, go to Tools --> List running processes
- Make a note of the IP address, as it will be required later on
Installation of Rabix and Dependencies
- Open a prompt within our WSL2 Linux distro. Debian is used for this setup:
- Install dependencies for X support
$ sudo apt update
$ sudo apt-get install libgtk2.0-0 fuse mesa-utils \
wget libdbus-glib-1-2 kdialog desktop-file-utils xdg-utils \
software-properties-common gnupg libnss3
Install JDK8 (this version is listed as a requirement at the official Rabix site).
- Download the tar.gz from https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html corresponding to your machine’s architecture.
- The example below shows the installation steps for the Linux x64 platform using jdk-8u261-linux-x64.tar.gz
$ sudo mkdir -p /usr/lib/jvm
$ sudo tar zxvf jdk-8u261-linux-x64.tar.gz -C /usr/lib/jvm
$ rm jdk-8u261-linux-x64.tar.gz
$ sudo update-alternatives \
--install "/usr/bin/java" "java" \
"/usr/lib/jvm/jdk1.8.0_261/bin/java" 1
$ sudo update-alternatives \
--set java /usr/lib/jvm/jdk1.8.0_261/bin/java
Configure JAVA_HOME and Global Profile
- Add the following entries
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
export PATH=$JAVA_HOME/bin:$PATH
to the end of the global profile file (or $HOME/.bashrc for an individual profile)
$ sudo vi /etc/profile
..
..
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
export PATH=$JAVA_HOME/bin:$PATH
- Activate the new configuration by exiting and logging back into WSL2 distro prompt, or by running:
$ source /etc/profile
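To confirm that the `java` now on the PATH really is the JDK 8 registered above, the version string can be checked from a script. The following is only a sketch under stated assumptions: `java_major_version` is a hypothetical helper (not part of Rabix or the JDK) that parses the first line of `java -version`, which Java writes to stderr.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major Java version from a version line
# such as:  java version "1.8.0_261"   or   openjdk version "11.0.2"
java_major_version() {
    local line="$1" ver
    # Keep only the quoted version number.
    ver=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;  # legacy scheme: 1.8.0_261 -> 8
        "")  return 1 ;;                                    # no version string found
        *)   printf '%s\n' "${ver%%[._-]*}" ;;              # modern scheme: 11.0.2 -> 11
    esac
}

# Only query Java when it is actually installed.
if command -v java >/dev/null 2>&1; then
    # `java -version` prints to stderr, hence the 2>&1.
    first_line=$(java -version 2>&1 | head -n 1)
    if [ "$(java_major_version "$first_line")" = "8" ]; then
        echo "Java 8 is active"
    else
        echo "Unexpected Java version: $first_line"
    fi
fi
```

If the check reports a different version, re-running the `update-alternatives --set java` step and opening a fresh WSL2 prompt is a reasonable first thing to try.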
Download Rabix
The latest release of Rabix can be found at the following link: https://github.com/rabix/composer/releases.
- At the time of writing, the latest release (1.0.2) can be downloaded to an appropriate subfolder via:
$ mkdir $HOME/rabix
$ cd $HOME/rabix
$ wget https://github.com/rabix/composer/releases/download/1.0.2/rabix-composer.1.0.2.AppImage -O rabix-composer.1.0.2.AppImage
- Add execute privileges
$ sudo chmod 755 rabix-composer.1.0.2.AppImage
Add Display Server to ~/.bashrc
- Using the address of the X server we noted earlier, add the following to the bottom of your ~/.bashrc file, replacing the IP address with your X server's IP
$ vi ~/.bashrc
export DISPLAY=192.168.1.111:0.0
export LIBGL_ALWAYS_INDIRECT=1
- The LIBGL_ALWAYS_INDIRECT=1 variable requests indirect OpenGL rendering, so that drawing commands are sent to the X server on the Windows host, which can render them in hardware if supported. If you experience display issues in subsequent steps, try changing the "OpenGL acceleration" setting on the MobaXterm X11 Configuration tab from Hardware to Software
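The host address can change between reboots, so a hard-coded DISPLAY value may go stale. One commonly used alternative is to derive it inside the distro. The sketch below assumes default WSL2 networking, where the auto-generated /etc/resolv.conf lists the Windows host as its nameserver; that address may differ from the one MobaXterm reports, so treat it as something to try rather than a guaranteed fix. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first nameserver address from a
# resolv.conf-style file (defaults to /etc/resolv.conf).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Build a DISPLAY value such as 172.20.0.1:0.0 from that address.
display_from_resolv() {
    local ip
    ip=$(first_nameserver "$1")
    [ -n "$ip" ] && printf '%s:0.0\n' "$ip"
}

# Example lines for ~/.bashrc (commented out so the sketch has no side effects):
# export DISPLAY="$(display_from_resolv)"
# export LIBGL_ALWAYS_INDIRECT=1
```

If the derived address does not work, falling back to the IP noted from MobaXterm is the safer option.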
Launch/Configure Rabix
- Start Rabix
$ cd $HOME/rabix
$ sudo ./rabix-composer.1.0.2.AppImage
- Rabix frontend should appear in a Window on your host
- Check Rabix is configured to use the bundled executor, and also configure an output folder for logs
Output folder: $HOME/rabix/Executions
Sample Workflows
Ensure Docker is running before attempting to run the following samples.
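A quick way to check this from the WSL2 prompt is to ask the Docker CLI for daemon information. The snippet below is a minimal sketch; `docker_ready` is a hypothetical helper name, and it relies only on `docker info` returning a non-zero exit status when the daemon cannot be reached.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeeds only if the docker CLI exists
# and the daemon answers `docker info`.
docker_ready() {
    command -v docker >/dev/null 2>&1 || { echo "docker CLI not found" >&2; return 1; }
    docker info >/dev/null 2>&1 || { echo "Docker daemon is not reachable" >&2; return 1; }
    echo "Docker is running"
}

# Usage: docker_ready && echo "OK to run the samples"
```

If the CLI is missing inside the distro, the WSL2 integration setting in Docker Desktop is the first place to look.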
1. Running a Command from within a Docker Container
We are going to build a simple workflow which pulls an Alpine Docker image and spawns a container, from which we run echo <message>. We'll be able to pass in the message parameter value during testing from Rabix.
- Start by adding a local workspace from within Rabix: My Projects --> Open a Project --> Select Folder
- Add workspace location $HOME/rabix
- You should now be back at the home screen, and in the left-hand pane the rabix workspace should be visible under local files
- Right click on the rabix folder in the left-hand pane and choose New CommandLineTool
- For the App name, enter Alpine Docker Image
- Save to $HOME/rabix/alpine-docker.cwl
- Click on the Code tab and paste the following
cwlVersion: v1.0
class: CommandLineTool
label: Alpine Docker Image
baseCommand:
  - echo
inputs:
  - id: message
    type: string
    inputBinding:
      position: 1
outputs:
  - id: std_out
    type: stdout
requirements:
  - class: DockerRequirement
    dockerPull: alpine
stdout: output.txt
- To test the workflow, click on the Test tab and, for the message parameter, enter the value Hello World
- Click Run to execute the workflow
- If all goes well, you should see output similar to the following, indicating that the workflow completed successfully
- Detailed logs related to the workflow execution are stored at $HOME/rabix/Executions
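For intuition, what the executor does for this tool is roughly a single `docker run` with stdout redirected to a file. The sketch below is only an approximation: the real executor also stages a working directory and mounts volumes, and `echo_in_alpine` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Rough equivalent of alpine-docker.cwl: echo a message inside an Alpine
# container and capture stdout in a file (default: output.txt).
echo_in_alpine() {
    local message="$1" outfile="${2:-output.txt}"
    # --rm removes the container once the command has finished.
    docker run --rm alpine echo "$message" > "$outfile"
}

# Usage (requires Docker): echo_in_alpine "Hello World" && cat output.txt
```

Running the helper by hand can be a useful sanity check when a workflow fails and it is unclear whether Docker or the CWL definition is at fault.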
You can use any of the following methods to view the log output:
- A WSL2 terminal, using standard Linux commands such as cat
- Adding the $HOME/rabix/Executions folder to the Rabix workspace, which will allow access to the logs as plain text files from Rabix
- Using Windows Explorer to browse the Linux distro file system
For example, to view the output written to file output.txt for the above job using Windows Explorer, replace the placeholders and browse to the following location:
\\wsl$\<linux distro name>\home\<Linux distro username>\rabix\Executions\local\<workflow name>\<YYYY-MM-DD-HH-MM-SS>\app-<YYYY-MM-DD-HH-MM-SS.SSS>\root
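From the WSL2 terminal, the same file can be located without clicking through the timestamped folders. The following sketch assumes the folder layout described above and relies on the zero-padded timestamps in the directory names sorting chronologically; it is best pointed at a single workflow's folder, since paths are compared as plain strings. `latest_output` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path of the newest file named $2
# (default output.txt) below the executions folder $1.
latest_output() {
    local root="${1:-$HOME/rabix/Executions}" name="${2:-output.txt}"
    # Timestamped directory names sort chronologically, so the last
    # path in byte order belongs to the most recent run.
    find "$root" -type f -name "$name" 2>/dev/null | LC_ALL=C sort | tail -n 1
}

# Usage: cat "$(latest_output)"
```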
2. Workflow with Embedded Python Script
The following workflow contains an inline/embedded Python script, samplepy.py:
- The script is created/populated and executed during the execution of the workflow inside a python:3 Docker container
- The Python script sets a variable var="value of var" and prints its value to stdout
- stdout is mapped to the file export.txt
Proceeding as per the prior workflow:
- Right click on the rabix folder in the left-hand pane and choose New CommandLineTool
- For the App name, enter Run an Embedded Python script
- Save as $HOME/rabix/pyscript.cwl
- Copy the below and paste into the Code area in Rabix
cwlVersion: v1.0
class: CommandLineTool
label: Run an embedded Python script
hints:
  DockerRequirement:
    dockerPull: python:3
baseCommand: python
inputs:
  script:
    type: File
    inputBinding:
      position: 1
    default:
      class: File
      basename: "samplepy.py"
      contents: |-
        var = "value of var"
        print(var)
outputs:
  results:
    type: stdout
stdout: export.txt
- Run the workflow
- Check the output
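As a mental model, this tool stages the embedded script as a file and then runs it inside the container. A hand-rolled approximation is sketched below; the /work mount point and the `run_embedded_script` name are assumptions made for illustration, and the real executor chooses its own paths.

```shell
#!/usr/bin/env bash
# Rough equivalent of pyscript.cwl: write the embedded script to a working
# directory, run it in a python:3 container, and capture stdout in export.txt.
# $1 must be an absolute path so that Docker can bind-mount it.
run_embedded_script() {
    local workdir="$1"
    # Stage the script, mirroring the File default in the CWL document.
    printf '%s\n' 'var = "value of var"' 'print(var)' > "$workdir/samplepy.py"
    # Mount the directory at /work (an assumed path) and run the script there.
    docker run --rm -v "$workdir":/work -w /work python:3 python samplepy.py \
        > "$workdir/export.txt"
}

# Usage (requires Docker): run_embedded_script "$HOME/rabix/scratch"
```

With Docker available, export.txt should then hold whatever the script printed, which is what the Rabix run maps to its results output.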
Where to From Here?
Rabix and CWL have provided an additional perspective in the area of solution design/implementation. The way in which CWL is implemented promotes containerised, parameter driven components.
Having gained a basic knowledge of CWL workflows and established a local test environment, some possible areas I aim to cover in future are:
- Setting up Rabix executor with a TES server
- Apache Airflow scheduling of workflows, with a focus on containerising processes
- Setting up Rabix to use Cloud apps/projects, such as the publicly available apps offered at the Cancer Genomics Cloud developer portal