Recommended software

Terminal

Terminal software and terminal sessions are a fundamental part of bioinformatics and data analysis. Most bioinformatics tools are command-line tools, so being comfortable with the command line (at least to some extent) is mandatory. Terminals can be used locally or through a Secure Shell (SSH) connection to a server. SSH is a cryptographic network protocol for operating network services securely over an unsecured network.

Command examples:

Create a directory:

mkdir NewDir

Rename the directory:

mv NewDir AnotherDir

Change the directory:

cd AnotherDir

Download a file from the Internet:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Connect to the SSH gateway server:

ssh sshgw.uef.fi
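As a minimal sketch, the directory commands above can be combined into one short session. The scratch directory created with mktemp -d is an assumption added here so that the example does not touch any existing files; in everyday use you would simply work in your own project directory.

```shell
# Work inside a throwaway directory so nothing existing is modified
# (the scratch directory is an assumption of this sketch).
workdir=$(mktemp -d)
cd "$workdir"

mkdir NewDir            # create a directory
mv NewDir AnotherDir    # rename it
cd AnotherDir           # move into it
pwd                     # print the current location
```

After these commands the shell is inside AnotherDir and NewDir no longer exists, because mv renamed it rather than copying it.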

Terminal multiplexer

screen is a terminal multiplexer. It allows you to manage multiple terminal sessions within the same console. In a way, it does the same thing as GUI terminal emulators with their built-in tab system and layout management. tmux is another terminal multiplexer and can be used in a similar fashion to screen.

Terminal multiplexers also provide persistent terminal sessions. This means that you can keep your terminal session running on the server even if you disconnect from the server and shut down your computer. Once you reconnect to the server you can attach (connect) to the previously started screen session. This is not only the recommended way of working from the command line but is sometimes the only way to run long-running analysis jobs.

Command examples:

Start a new screen session:

screen

Attach (connect) to the existing screen session (if only one session exists):

screen -r

List existing (running) sessions:

screen -ls

Attach (connect) to a specific screen session (if more than one session exists):

screen -x 12345
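For long-running jobs it can be convenient to give the session a name instead of remembering its numeric ID. The following is a rough sketch of that idea; the session name "analysis" and the placeholder job (sleep 60) are hypothetical and stand in for your own analysis command.

```shell
# Hypothetical session name and placeholder job; replace with your own.
session="analysis"

if command -v screen >/dev/null 2>&1; then
    # -d -m starts the session detached, -S gives it a name
    screen -dmS "$session" sleep 60
    # List sessions; the new one is shown as <pid>.analysis
    screen -ls || true
    # To reattach by name you would run: screen -r analysis
    # (detach again with Ctrl-a d)
    # Clean up the demo session
    screen -S "$session" -X quit || true
else
    echo "screen is not installed on this machine"
fi
```

Because the session is started detached, the job keeps running on the server after you log out, and you can reattach to it later.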

Remote desktop connection

If you want to use a full-featured desktop environment instead of a simple terminal session, you can make a Remote Desktop Connection to the server. However, at the moment tuma.uef.fi is the only server provided by Bioinformatics Center that provides a graphical desktop environment. To make a remote desktop connection to the tuma.uef.fi server you have to be in the UEF intranet, so remote desktop connections have to be made from within a WVD instance.

Please note that Remote Desktop Connection and Remote Desktop are two completely different programs! Remote Desktop Connection is bundled with every Windows 10 installation, whereas Remote Desktop appears on your computer only after you install WVD.

Figure: Remote Desktop Connection to tuma.uef.fi

Figure: Remote Desktop to WVD service

Conda and Mamba

Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies. Conda easily creates, saves, loads, and switches between environments on your local computer. To start working with conda environments you first have to install conda yourself, because it is not pre-installed on any of the servers provided by Bioinformatics Center. Miniconda is a good option to start with. It provides a minimal set of conda tools upon which you can start building your analysis environments.

Conda is not very fast at resolving package dependencies, and installing new software into the current environment might take a while. This problem is solved by installing Mamba, which is a reimplementation of the Conda package manager. Mamba works exactly the same way as Conda but does its job much faster!

Command examples:

Install Miniconda on Linux (64-bit):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Install Miniconda on Windows (64-bit), from the Command Prompt:

curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe
Miniconda3-latest-Windows-x86_64.exe

Set up installation channels:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

Create a new conda environment with name "MyEnv":

conda create -n MyEnv

Activate the environment:

conda activate MyEnv

Install bwa into the current environment:

conda install bwa

Install bwa into the current environment from the bioconda channel:

conda install -c bioconda bwa

Deactivate the environment:

conda deactivate

Install mamba into the base environment:

conda install -n base -c conda-forge mamba
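Conda's ability to save and recreate environments is often used through an environment file. The following is a minimal sketch of that workflow, not a prescribed setup: the file name environment.yml is the conventional default, while the environment name and the package list are placeholders chosen for illustration.

```shell
# Write a hypothetical environment file; the name and package list are
# placeholders for this sketch.
cat > environment.yml <<'EOF'
name: MyEnv
channels:
  - conda-forge
  - bioconda
dependencies:
  - bwa
EOF

if command -v conda >/dev/null 2>&1; then
    # Create the environment described in the file
    conda env create -f environment.yml
else
    echo "conda not found; install Miniconda first"
fi
```

Keeping such a file under version control makes it easier to rebuild the same environment on another machine. An existing environment can be written out in this format with conda env export.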

Containers

Containers are a solution to the problem of how to get software to run reliably and reproducibly when moved from one computing environment to another, e.g. from your personal laptop to a server environment. Problems arise when the software environments are not identical (e.g. Windows vs Linux or Debian vs CentOS) and the same software versions are not always available (e.g. Python 2.7 vs Python 3). Installing numerous software packages into a new environment might also be challenging. The network topology, security policies and storage might differ as well, yet the software still has to run.

There are two major container technologies available: Docker and Singularity. Docker containers require root privileges to run but Singularity containers don't, so you can run Singularity containers with your own user account and privileges. However, developing and building Singularity containers still requires root privileges. Singularity is capable of running Docker containers, and it is the way to go if you want to use (pre-built) containers in your data analysis tasks.

Command examples:

Download pre-built images from Docker Hub:

singularity pull --name [SINGULAR_IMAGE_NAME] docker://[REPOSITORY]/[IMAGE]

Run singularity container:

singularity run [SINGULAR_IMAGE_NAME]

Execute a custom command (ls /) within a container:

singularity exec [SINGULAR_IMAGE_NAME] ls /

Spawn a new shell within a container and interact with the environment:

singularity shell [SINGULAR_IMAGE_NAME]
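To make the placeholders above more concrete, here is a hedged sketch with the values filled in. The image (the ubuntu:22.04 image from Docker Hub) and the local file name ubuntu.sif are assumptions picked only for illustration; any other Docker Hub image can be substituted.

```shell
# Hypothetical choices for this sketch: image name and source URI.
image="ubuntu.sif"
source_uri="docker://ubuntu:22.04"

if command -v singularity >/dev/null 2>&1; then
    # Download the Docker image and convert it to a Singularity image file
    singularity pull --name "$image" "$source_uri"
    # Run a single command inside the container
    singularity exec "$image" cat /etc/os-release
else
    echo "singularity is not installed on this machine"
fi
```

The exec step prints the operating system information of the container rather than of the host, which is a quick way to confirm that the command really runs inside the image.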

GitHub and Git

GitHub, Inc. is a provider of Internet hosting for software development and version control using Git. It offers the distributed version control and source code management functionality of Git, plus its own features. Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

If you want to version control your code and documentation, GitHub is a great option. To start working with GitHub it is advisable to first create a repository on GitHub and then clone it into your working environment. Once you have cloned the remote repository you can work with the local repository: add files, commit changes and finally push the changes to GitHub.

Command examples:

Clone a repository from GitHub:

git clone https://github.com/[USERNAME]/[REPOSITORY_NAME].git

Add files into the local repository:

git add FILENAME

Add all the files in the current directory into the local repository:

git add *

Commit all the changes into the local repository (-m = commit message):

git commit -a -m "The latest changes"

Check status of the local repository:

git status

Push local changes to the remote repository:

git push
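The clone, add, commit and push cycle can be tried out end to end without a GitHub account. In the sketch below a local bare repository stands in for the remote on GitHub; this stand-in, the scratch directory, the file name and the example identity are all assumptions made for illustration.

```shell
# A local bare repository stands in for GitHub in this sketch, so the
# workflow runs without a network connection or an account.
demo=$(mktemp -d)
git init --quiet --bare "$demo/remote.git"

# Clone the "remote" (with GitHub this would be an https://github.com/... URL)
git clone --quiet "$demo/remote.git" "$demo/work"
cd "$demo/work"

# Add a file and commit it. The identity is passed inline here; normally
# it is set once with git config --global user.name / user.email.
echo "My analysis project" > README.md
git add README.md
git -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -m "Add README"

git status --short            # prints nothing when the working tree is clean
git push --quiet origin HEAD  # publish the current branch to the remote
```

Cloning an empty repository prints a harmless warning. With a real GitHub repository only the clone URL changes; the remaining commands stay the same.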

Visual Studio Code

Visual Studio Code is a lightweight but powerful source code editor which runs on your desktop and is available for Windows, macOS and Linux. It comes with built-in support for JavaScript, TypeScript, Node.js, Git and Markdown and has a rich ecosystem of extensions for other languages (such as C++, C#, Java, Python, PHP, Go) and runtimes (such as .NET and Unity). Visual Studio Code is installed on the tuma.uef.fi server and you can install it on your own computer too.

Figure: Visual Studio Code with Markdown preview

Jupyter

Project Jupyter is a non-profit, open-source project. Jupyter Notebooks provide easy-to-use, browser-based analysis environments that support interactive data science and scientific computing across all programming languages. Jupyter Notebooks are not pre-installed on any of the servers provided by Bioinformatics Center. The easiest way to start working with Jupyter Notebooks is to install them by using conda.

Figure: Jupyter Notebook

Figure: JupyterLab

Command examples:

Install classic Jupyter Notebook into the current conda environment:

conda install -c conda-forge notebook

Run Jupyter Notebook:

jupyter notebook

Install JupyterLab into the current conda environment:

conda install -c conda-forge jupyterlab

Run JupyterLab:

jupyter-lab

Run JupyterLab by using custom port 8890:

jupyter-lab --port 8890

Windows Subsystem for Linux

Nowadays you do not always need access to a Linux server to run bioinformatics analyses from the command line; you can also run them on your personal Windows 10 computer. A good alternative is Windows Subsystem for Linux (WSL). WSL lets you run a full-featured Linux environment, including most command-line tools, utilities, and applications, directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual-boot setup.

Figure: Linux terminals on Windows (WSL)