Installation

Install the d2s client#

Install the d2s client and cwlref-runner with pipx on Linux and MacOS:

pipx install d2s cwlref-runner

We recommend using pipx if you only want to run d2s. You can also install with pip or pip3, depending on your preference.

Requirements (see below for installation instructions):

See the Install pipx on Windows section below to install d2s on Windows using Chocolatey and pipx. CWL workflow execution on Windows with the CWL reference runner requires WSL 2 and Docker Desktop.

Enable autocompletion#

Optional

Enabling command-line autocompletion in the terminal makes the d2s client easier to use.

  • ZSH: add the autocompletion line to the ~/.zshrc file.
echo 'eval "$(_D2S_COMPLETE=source_zsh d2s)"' >> ~/.zshrc

Set your terminal to use ZSH by default:

chsh -s /bin/zsh

An oh-my-zsh theme can easily be chosen for a personalized experience. See zsh-theme-biradate to install a simple theme and configure your terminal in a few minutes.

  • Bash: add the autocompletion line to the ~/.bashrc file.
echo 'eval "$(_D2S_COMPLETE=source d2s)"' >> ~/.bashrc

Bash autocompletion has not been tested yet.
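Because the echo commands above append with >>, running them more than once leaves duplicate lines in the startup file. A minimal sketch of an idempotent variant follows; add_line_once is a hypothetical helper name, not part of d2s.

```shell
# Append a line to a shell startup file only if it is not already there.
# add_line_once is a hypothetical helper, not part of d2s.
add_line_once() {
  rc_file="$1"
  rc_line="$2"
  touch "$rc_file"
  # -F: fixed string, -x: match the whole line, -q: quiet
  grep -Fxq -- "$rc_line" "$rc_file" || printf '%s\n' "$rc_line" >> "$rc_file"
}

# Example for ZSH (same line as in the instructions above):
# add_line_once ~/.zshrc 'eval "$(_D2S_COMPLETE=source_zsh d2s)"'
```

Calling the helper twice with the same arguments leaves a single copy of the line in the file.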

Windows support#

Support of the d2s tool on Windows is a work in progress.

  • Most workflow orchestrators do not support Windows because workflows are based on Linux containers (see CWL workflows and Nextflow).
  • Windows can run Docker, but not natively like Linux, making it more prone to errors.

CWL Workflow execution on Windows with the CWL reference runner works great with WSL 2 and Docker Desktop.

We recommend using the Windows PowerShell terminal (which is easier to use than the basic terminal).

Try the client#

d2s

Open a new terminal to activate autocompletion.

Press Tab after a d2s command in the terminal to see all the available options. Completion adapts to the command and dynamically retrieves your datasets and workflows.
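If the d2s command is not found, it can help to check which of the expected tools are missing from the PATH. The sketch below is only an illustration: missing_commands is a hypothetical helper, and the list of tool names in the example (including cwl-runner as the executable installed by cwlref-runner) is an assumption to adapt to your setup.

```shell
# Print the name of every listed command that is NOT found on the PATH.
# missing_commands is a hypothetical helper, not part of d2s.
missing_commands() {
  for cmd_name in "$@"; do
    command -v "$cmd_name" >/dev/null 2>&1 || printf '%s\n' "$cmd_name"
  done
}

# Example: check the tools this page installs (names are assumptions)
# missing_commands d2s cwl-runner docker docker-compose
```

An empty output means every listed command was found.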


Download the GraphDB triplestore#

For licensing reasons, the free edition of the GraphDB RDF triplestore must be downloaded manually 📥

  • Go to https://ontotext.com/products/graphdb/ and provide your information to receive an email with the GraphDB download link

  • Download the latest free version of GraphDB as a stand-alone server (.zip file)

  • The d2s client will ask you to provide the path to the GraphDB distribution .zip file when initializing the workspace.

    • By default the d2s client will try to get the file from your home directory (e.g. /home/my-user)

      # Copy the GraphDB distribution file to your home folder
      cp graphdb-free-*-dist.zip ~/
Update GraphDB version

Change the GraphDB version used by d2s to the one you downloaded by changing GRAPHDB_VERSION in the file .env in your project folder.
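The version can be read from the name of the downloaded file instead of being typed by hand. The following is a minimal sketch under two assumptions: the file follows the graphdb-free-*-dist.zip naming shown above, and .env stores the setting as a plain GRAPHDB_VERSION=value line. Both function names are hypothetical, and 9.3.0 is only an example version.

```shell
# Extract the version from a file name such as graphdb-free-9.3.0-dist.zip
# (9.3.0 is only an example version).
graphdb_version_from_zip() {
  zip_name=$(basename "$1")
  zip_name=${zip_name#graphdb-free-}
  printf '%s\n' "${zip_name%-dist.zip}"
}

# Set GRAPHDB_VERSION in an env file (assumes plain KEY=value lines).
set_graphdb_version() {
  env_file="$1"
  new_version="$2"
  tmp_file=$(mktemp)
  # Keep every line except an existing GRAPHDB_VERSION entry
  grep -v '^GRAPHDB_VERSION=' "$env_file" > "$tmp_file" || true
  printf 'GRAPHDB_VERSION=%s\n' "$new_version" >> "$tmp_file"
  # Write back in place so the file keeps its permissions
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Example, run from the project folder:
# set_graphdb_version .env "$(graphdb_version_from_zip ~/graphdb-free-*-dist.zip)"
```

Other lines in .env are kept, and the GRAPHDB_VERSION entry ends up at the bottom of the file.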


Install pipx#

If you just want to run d2s, we recommend pipx, as it installs the tool in an isolated environment. It can be compared to apt, brew or npx.

Consider running pip install --upgrade pip to update your pip installation.

Instructions use pip3 to make sure pipx is installed with Python 3, but feel free to use your own pip installation.
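Before installing pipx, it can be useful to confirm which Python interpreter will be used. The sketch below assumes Python 3.6 is the minimum version, which is the version the instructions on this page refer to; python_is_recent is a hypothetical helper name.

```shell
# Succeed if the given interpreter is Python 3.6 or newer.
# python_is_recent is a hypothetical helper; 3.6 is assumed to be the minimum.
python_is_recent() {
  "$1" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)' 2>/dev/null
}

# Example:
# python_is_recent python3 && echo "python3 is recent enough"
```

The function also fails when the interpreter does not exist, so it doubles as a presence check.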

Install pipx on Ubuntu#

# Install Python3.6 and pip if necessary
sudo apt-get install python3 python3-venv python3-dev python3-distutils
wget https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py
# Install pipx
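pip3 install --user pipx
# Add pipx apps to path (run through python3 in case pipx is not on the PATH yet)
python3 -m pipx ensurepath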

Install pipx on MacOS#

Install python3 and pip3 if not installed.

brew install python3
pip3 install pipx
# Add pipx apps to path
pipx ensurepath

Install pipx on CentOS#

# Install python3 and pip3
sudo yum install python36
sudo yum install python36-devel
sudo easy_install-3.6 pip
pip3 install --user pipx
pipx ensurepath

Install pipx on Windows#

We use the Chocolatey package manager for Windows in PowerShell. To install Chocolatey:

  • Open PowerShell as administrator to install Chocolatey and its packages.
  • Check and fix system restrictions:
Get-ExecutionPolicy
# If it returns Restricted:
Set-ExecutionPolicy Bypass -Scope Process
# or Set-ExecutionPolicy AllSigned
  • Install Chocolatey on PowerShell:
Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

See the official Chocolatey documentation.

Chocolatey can also be installed using a non-administrative shell. See the documentation.

Open PowerShell as administrator and use Chocolatey to install Python 3.8 and pip:

choco install python pip

A reboot of your system is required to complete the installation.

pip does not need to be run as administrator (only choco install does).

We recommend using pipx if you are not developing on the d2s Python CLI:

pip install pipx
pipx ensurepath

Upgrade d2s version#

Upgrade d2s to the latest release:

pipx upgrade d2s

Uninstall#

pipx uninstall d2s cwlref-runner

If you face issues because d2s or cwl-runner is already installed, make sure they are properly uninstalled from pip:

sudo pip uninstall d2s cwlref-runner cwltool
sudo pip3 uninstall d2s cwlref-runner cwltool

If you get a No module named pip error, it might be due to mismatched pip and pipx versions. When installing pip and pipx, take care that they use python3.6. The following commands help uninstall pipx completely:

rm -rf ~/.local/pipx
pip uninstall pipx
pip3 uninstall pipx
# The module is named pip for every Python version (there is no pip3 module)
python3.6 -m pip uninstall pipx

Install Docker#

On Ubuntu#

Install Docker and docker-compose.

sudo apt update
sudo apt install docker.io
sudo systemctl enable --now docker
sudo usermod -aG docker ${USER}
# Install docker-compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

Running sudo groupadd docker may be required before usermod if the group has not been created yet. Log out and back in for the new group membership to take effect.
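To confirm that the usermod step worked, the group membership can be checked from the terminal. This is a minimal sketch; user_in_group is a hypothetical helper name.

```shell
# Succeed if user $1 belongs to group $2.
# user_in_group is a hypothetical helper name.
user_in_group() {
  id -nG "$1" | tr ' ' '\n' | grep -Fxq -- "$2"
}

# Example:
# user_in_group "$USER" docker && echo "docker group OK"
```

The check reads the system's group database, so it reports the new membership even before you log out and back in.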

On MacOS#

Use the Docker installer for MacOS (.dmg file) to install Docker and docker-compose.

If you have a DockerHub account you can use the DockerHub installer instead.

You can change Docker settings by clicking on the Docker icon 🐳 in the top bar, then click Preferences...

The volumes /Users and /tmp should be shared by default. It is recommended to create the d2s project folder in a subfolder of /Users.

On Windows#

We use Chocolatey to install Docker. See the pipx installation section to install Chocolatey.

Use docker-desktop for an easier installation if you have Windows Pro or Enterprise and a DockerHub account:

choco install docker-desktop

If you have Windows Home or don't want to create a DockerHub account, use docker-toolbox:

choco install docker-toolbox -ia /COMPONENTS="kitematic,virtualbox,dockercompose" -ia /TASKS="desktopicon,modifypath,upgradevm"

Open the Docker Quickstart Terminal to start Docker.

Additional components that you might need to install (VM and GUI):

choco install virtualbox docker-kitematic

Activate Virtualization#

Virtualization and Hyper-V must be activated. In the Task Manager, open the Performance tab to check whether Virtualization is enabled.

Check the documentation to enable it.

  • The docker-desktop installer offers to enable the virtualization features automatically after the Docker installation, if they are not already enabled.

  • Note that Docker Hyper-V is not available for Windows 10 Home edition (you will need Pro or Enterprise edition)

  • You might need to access the BIOS to enable VT-x virtualization

  • Share drive

By default, docker-desktop and docker-toolbox share your C:/Users volume. Docker can only access folders and files in the shared drives, so make sure you run d2s init somewhere under your user directory.

On docker-desktop, you can change this in Docker config > Settings > Shared Drives > Share Drive C.

On docker-toolbox, you need to change the VirtualBox settings.

Fix known issues#

  • DNS issue: Docker build can't access the internet. E.g.: getting wget: unable to resolve host address

    • If Docker can't access the internet when building, you might want to change the DNS server (for example, to Google's public DNS).

    • On Linux:

      sudo nano /etc/resolv.conf
      # Add the following line:
      nameserver 8.8.8.8
    • On Windows: go to Docker Settings > Network > DNS Server > Fixed: 8.8.8.8

  • Firewall issue on Windows: a firewall commonly blocks Docker when it tries to connect to the internet

    • This could be due to local services: try deactivating your firewall and/or antivirus
    • If you are running it on your office network, you might face issues related to the office network firewall. Try at home and contact your IT department if needed.
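To confirm that the Linux DNS change described above is in place, the configured resolvers can be listed from the file. This is a minimal sketch: list_nameservers is a hypothetical helper, and it only reads a resolv.conf file. It does not test name resolution inside a container.

```shell
# Print the IP address of each nameserver entry in a resolv.conf-style file.
# list_nameservers is a hypothetical helper name.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "$1"
}

# Example:
# list_nameservers /etc/resolv.conf
```

Comment lines and other directives such as search are ignored, so only the resolver addresses are printed.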

For more details on how to run Docker see the Docker guide.


Install Rabix Benten for VSCode#

Optional

Rabix Benten is a plugin that provides completion, error and warning messages for CWL files in Visual Studio Code.

Install the package using pipx:

pipx install benten --python python3.7

Then add the CWL (Rabix/Benten) extension to Visual Studio Code.

Install Rabix Composer GUI#

Optional

Rabix Composer is a nice way to visualize CWL workflows.

Download the installation file for your operating system and run it.

Open the d2s-core folder in Rabix Composer.

Note that Rabix Composer rewrites the formatting of your CWL files and adds x/y coordinates to steps.

Try Apache Airflow#

Apache Airflow lets you run and monitor workflows. CWL compatibility requires installing the cwl-airflow package.

Install Apache Airflow and the cwl-airflow pip package to get started.

pip install apache-airflow
pip install cwl-airflow --find-links https://michael-kotliar.github.io/cwl-airflow-wheels/
# Initialize the Airflow database (Airflow 1.x; on Airflow 2.x use: airflow db init)
airflow initdb
cwl-airflow init

Known issue: cwl-airflow init can fail with FileNotFoundError: [Errno 2] No such file or directory: 'airflow'

Start Airflow:

airflow webserver -p 8080
airflow scheduler

Submit a workflow:

cwl-airflow submit d2s-core/cwl/workflows/csv-virtuoso.cwl datasets/cohd/config.yml
# Run demo
cwl-airflow demo --auto

To be tested.

Try Toil#

Toil is a Python workflow manager that can run CWL workflows.

Install Toil for CWL:

pipx install "toil[cwl]"

Run a workflow:

toil-cwl-runner --workdir workspace/output/tmp --outdir workspace/output d2s-core/cwl/workflows/csv-virtuoso.cwl datasets/cohd/config.yml

To be tested.

Last updated by Vincent Emonet