The HPC group will provide guidance on creating and using Singularity containers but will not manage user containers. If you decide to use Singularity, it is your responsibility to build and manage your containers, including the Linux environment you use for development.
Software has grown in complexity over the years, making it difficult at times simply to run it. Containers address this problem by bundling an application together with its software dependencies, scripts, documentation, license and a minimal operating system in a self-contained image so that when it comes to running the software everything “just works”. Containerizing an application makes the software both shareable and portable, and makes its output reproducible.
Singularity: A Secure Alternative to Docker
Singularity can run Docker containers natively and is a great replacement for Docker on HPC systems. Existing Docker containers can be imported directly and run with Singularity.
Singularity is the tool we offer for running containers on our clusters, similar to Docker. Docker is poorly suited to shared systems because its daemon runs as root, so users who are allowed to run Docker containers can effectively gain root access to the host. Singularity is a secure HPC alternative that uses a completely different implementation: it doesn’t require any elevated privileges to run containers, while still allowing direct interaction with existing Docker containers. Learn about the differences between virtual machines, Docker, and Singularity.
The main motivation for using Singularity over Docker is that it was developed with HPC systems in mind and addresses these problems:
- Security: A user in the container is the same user as the one running the container, no privilege escalation
- Ease of deployment: No daemon running as root on each node; a container is simply an executable file
- Ability to run massively-parallel (MPI) applications by leveraging fast InfiniBand interconnects and GPUs with minimal performance loss
Singularity is a container platform specifically for high-performance computing.
Obtaining the Image: Using the pull Command
Some software is provided as a Singularity image with the .sif or .simg file extension. Other software is provided as a Docker image, which must be converted to a Singularity image.
For example, if the installation directions tell you to run docker pull on an image, download and convert that image with singularity pull, prefixing the image name with docker://.
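A minimal sketch of that conversion follows. The Docker Hub repository godlovedc/lolcow is an assumption (this page names only the resulting file), so substitute the image from your own installation directions. The snippet derives the expected output file name and runs the pull only where Singularity is installed.

```shell
# Assumed Docker Hub image; replace with the one named in your directions.
docker_image="godlovedc/lolcow:latest"

# Singularity names the output <image>_<tag>.sif, dropping the repository prefix.
name_and_tag="${docker_image##*/}"
sif_file="$(printf '%s' "$name_and_tag" | tr ':' '_').sif"

if command -v singularity >/dev/null 2>&1; then
    # Download the Docker layers and convert them into a single image file.
    singularity pull "docker://${docker_image}"
else
    echo "singularity not found; the pull would create ${sif_file}"
fi
```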
This will produce the file lolcow_latest.sif in the current working directory, where "latest" is the tag, which identifies a specific version of the software. Images hosted on the Singularity Cloud Library can be pulled in the same way using a library:// URI.
In some cases the build command can be used instead of pull to create the image.
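For the Cloud Library case, a hedged sketch is shown here. The library path sylabsed/examples/lolcow is an assumed example in user/collection/container form, not a path named on this page.

```shell
# Assumed Cloud Library path; replace with the real one.
library_image="sylabsed/examples/lolcow"
library_uri="library://${library_image}"

if command -v singularity >/dev/null 2>&1; then
    singularity pull "$library_uri"
else
    echo "singularity not found; would pull ${library_uri}"
fi
```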
Unlike pull, build will convert the image to the latest Singularity image format after downloading it.
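As an illustration of the build form, the sketch below converts a Docker Hub image into a named image file. Both the source image and the output name are assumptions chosen for the example. Note that build takes the target file first and the source second.

```shell
# Assumed source image and output name.
source_uri="docker://godlovedc/lolcow:latest"
target_sif="lolcow.sif"

if command -v singularity >/dev/null 2>&1; then
    # Usage: singularity build <target> <source>
    singularity build "$target_sif" "$source_uri"
else
    echo "singularity not found; would build ${target_sif} from ${source_uri}"
fi
```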
Obtaining the Image: Working from a Dockerfile
Some software is provided as a Dockerfile instead of an actual container. In this case, if you have Docker installed on your local machine (e.g., laptop) then you can create the Docker image yourself and then transfer it to one of the HPC clusters where the Singularity image can be built.
Next, save that image as a tar file. In this example, the image ID is c230486ba945.
Copy myimage.tar to one of the HPC clusters using scp and then create the Singularity image. The commands will look as follows:
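The whole Dockerfile route, from the local build to the image on the cluster, might look like the sketch below. The tag myimage and the login host in remote are hypothetical placeholders; the image ID and the tar file name come from the example above. The steps are wrapped in two functions because they run on different machines, and nothing is executed until you call them.

```shell
image_id="c230486ba945"               # image ID from the example above
tarball="myimage.tar"
sif_file="${tarball%.tar}.sif"        # myimage.sif
remote="user@cluster.example.org"     # hypothetical login node

# Run on the local machine, in the directory that holds the Dockerfile.
local_steps() {
    docker build -t myimage . &&              # "myimage" is an assumed tag
    docker images &&                          # lists the image ID
    docker save -o "$tarball" "$image_id" &&  # export the image as a tar file
    scp "$tarball" "${remote}:"               # copy it to your home directory
}

# Run on the cluster, in the directory that received the tar file.
cluster_steps() {
    singularity build "$sif_file" "docker-archive://${tarball}"
}

echo "call local_steps on your machine, then cluster_steps on the cluster (creates ${sif_file})"
```

Call local_steps on the laptop and cluster_steps after logging in to the cluster.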
Building Singularity Images
Singularity images can be built from scratch using a definition file, which is a text file that specifies the base image, the software to be installed and other information. However, you need root access to build Singularity containers from a definition file, and therefore you won’t be able to do so on the cluster. Possible options are:
- Building on a Linux system to which you have root (admin) access
- Building on a virtual machine
- Creating a Docker image instead, since Docker has a larger community with more support, and then converting it into a Singularity image
Detailed documentation about building Singularity container images is available here.
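To make the idea of a definition file concrete, the sketch below writes a small one and shows the build command as a comment. The base image ubuntu:22.04, the installed package and the file names are all assumptions chosen for illustration.

```shell
def_file="example.def"   # hypothetical file name

cat > "$def_file" <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Commands run once, at build time, inside the container
    apt-get update && apt-get install -y --no-install-recommends python3

%environment
    export LC_ALL=C

%runscript
    # Default command executed by "singularity run"
    exec python3 "$@"
EOF

# Building from a definition file needs root, so run this where you have sudo:
#   sudo singularity build example.sif example.def
echo "wrote ${def_file}"
```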
To run the default command within the Singularity image, use singularity run.
To run a specific command that is defined within the container, use singularity exec.
Use the shell command to run a shell within the container. This command is useful for searching for certain files within the container.
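The three ways of running a container can be sketched as follows. The image name is the file from the pull section; the cowsay program and the /usr/games directory are assumptions about what that image contains.

```shell
sif_file="lolcow_latest.sif"   # file name from the pull section

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    singularity run "$sif_file"                   # default command of the image
    singularity exec "$sif_file" cowsay "hello"   # a specific program inside the image
    singularity exec "$sif_file" ls /usr/games    # non-interactive way to look for files
    # singularity shell "$sif_file"               # interactive shell; type "exit" to leave
else
    echo "need singularity and ${sif_file} to run this example"
fi
```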
Available Files and Storage Space
A running container automatically bind mounts these paths:
- the directory from which the container was run
This makes it easy for software within the container to read or write files on the cluster filesystems. For instance, if your image is looking for an argument that specifies the path to your data then one can simply supply the path:
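A hedged sketch of passing a data path: the image name and the data directory are hypothetical, and the way the path is passed (here as a plain positional argument) depends on the software in your image.

```shell
sif_file="myimage.sif"     # hypothetical image
data_dir="$PWD/data"       # hypothetical data directory under the working directory

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    # The host path is valid inside the container because the working directory is bind mounted.
    singularity run "$sif_file" "$data_dir"
else
    echo "would run: singularity run ${sif_file} ${data_dir}"
fi
```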
Additionally, there are two options to create your own custom bind mounts within your containers.
- The --bind option binds directories, each given as host_path:container_path.
To bind multiple directories in a single command, separate the pairs with commas.
- The environment variable $SINGULARITY_BIND can be set instead of using the command line argument.
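Both options can be sketched as below. The image name and the host and container paths are hypothetical, and join_binds is a small helper written for this example rather than part of Singularity.

```shell
sif_file="myimage.sif"   # hypothetical image

# Join "host:container" pairs with commas, the format --bind expects.
join_binds() { ( IFS=','; printf '%s\n' "$*" ); }

binds="$(join_binds /scratch/data:/data /scratch/results:/results)"   # hypothetical paths
echo "$binds"

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    # Option 1: pass the binds on the command line.
    singularity exec --bind "$binds" "$sif_file" ls /data /results
    # Option 2: export them once and omit --bind.
    export SINGULARITY_BIND="$binds"
    singularity exec "$sif_file" ls /data /results
fi
```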
Singularity by default exposes all environment variables from the host inside the container. Use the --cleanenv argument to prevent this:
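As a sketch of the difference, the snippet below sets a host variable and looks for it inside the container with and without --cleanenv. The variable name HOST_ONLY_VAR is hypothetical, and the image is the file from the pull section.

```shell
sif_file="lolcow_latest.sif"                  # file name from the pull section
export HOST_ONLY_VAR="visible on the host"    # hypothetical variable

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    # Default: host variables are passed through, so grep finds the variable.
    singularity exec "$sif_file" env | grep HOST_ONLY_VAR
    # With --cleanenv the host environment is not passed in.
    singularity exec --cleanenv "$sif_file" env | grep HOST_ONLY_VAR \
        || echo "HOST_ONLY_VAR is not set in the container"
else
    echo "need singularity and ${sif_file} to run this example"
fi
```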
One can then define an environment variable within the container:
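One way to do this, sketched under the assumption that the image is the file from the pull section, is the SINGULARITYENV_ prefix: a host variable named SINGULARITYENV_MYVAR appears inside the container as MYVAR.

```shell
sif_file="lolcow_latest.sif"   # file name from the pull section

# The prefix is stripped inside the container, leaving MYVAR.
export SINGULARITYENV_MYVAR="Overridden"

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    singularity exec --cleanenv "$sif_file" env | grep '^MYVAR='
else
    echo "MYVAR would be set to ${SINGULARITYENV_MYVAR} inside the container"
fi
```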
With the above definition, MYVAR will have the value "Overridden". You can also modify the PATH environment variable within the container using definitions as follows:
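A sketch of the PATH definitions follows. The two directories are hypothetical; SINGULARITYENV_PREPEND_PATH and SINGULARITYENV_APPEND_PATH add to the container's PATH, while SINGULARITYENV_PATH would replace it entirely.

```shell
sif_file="lolcow_latest.sif"   # file name from the pull section

export SINGULARITYENV_PREPEND_PATH="/opt/tools/bin"   # hypothetical directory, searched first
export SINGULARITYENV_APPEND_PATH="/opt/extra/bin"    # hypothetical directory, searched last

if command -v singularity >/dev/null 2>&1 && [ -f "$sif_file" ]; then
    # Print the PATH as seen from inside the container.
    singularity exec "$sif_file" sh -c 'echo "$PATH"'
else
    echo "need singularity and ${sif_file} to run this example"
fi
```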
The Singularity image in the above can be obtained with:
To learn about binding a directory within the container to a directory on the host, please refer to the -B flag from this command: $ singularity help run
Container Design Strategies
There are different ways to design containers whose purpose is to encapsulate pipelines: the orchestration can happen inside the container, or it can happen outside the container, with the host simply calling software located inside the container.
- A 3-step pipeline using two containers that have the dependencies installed inside.
The pipeline is run on the host via a bash script. Each step is calling a tool located in one of the two containers using the “singularity exec” command.
- A 2-step pipeline is encapsulated in a container.
The ENTRYPOINT of the container (the %runscript section in a Singularity definition file) defines the steps to be executed when the container is called using “singularity run”.
- A 3-step pipeline is encapsulated in a container.
A script defining the execution steps (python / bash / snakemake / ...) is called from outside using the “singularity exec” command. If the script is mounted inside the container, it can be easily changed from outside without recreating the container.
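The first design, a host-side script that calls tools in two containers, can be sketched as below. The image names, tool names and file names are all hypothetical. The step helper prints each call and executes it only where Singularity and the image are available, and the && chain stops the pipeline at the first failing step.

```shell
# Hypothetical images for a three-step pipeline driven from the host.
align_sif="aligner.sif"
stats_sif="stats.sif"

# Run one step: the first argument is the image, the rest is the command inside it.
step() {
    image="$1"; shift
    echo "[$image] $*"
    if command -v singularity >/dev/null 2>&1 && [ -f "$image" ]; then
        singularity exec "$image" "$@"
    fi
}

step "$align_sif" index_tool reference.fa &&
step "$align_sif" align_tool reference.fa reads.fq &&
step "$stats_sif" summarize_tool aligned.out
```

Because the script lives on the host, it can be edited without rebuilding either container.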
A guide on how to use a BWA singularity container:
Demonstration (click to download and view):
User Group Sessions
Singularity (Version 2.2) is installed as an RPM on the lilac cluster and does not need any additional modules to be loaded. Unlike Docker containers, Singularity containers are each stored in a single file, for mobility and reproducibility.