AlphaFold2 / RoseTTAFold 🔗

📣

AlphaFold 2 relies on old toolchains that will no longer be supported in the near future. Please consider using AlphaFold 3 instead.

	Container	Native
AlphaFold	✅	✅
AlphaFold 2.1.1 (Multimer)		✅
RoseTTAFold	🚧 WIP	✅

Version	skylake (gpuv100)	zen3 (gpu2080, gputitanrtx, gpu3090, gpuv100, gpuhgx)
2.0.0
2.1.1
2.1.2		module load palma/2021a; module load foss/2021a; module load AlphaFold/2.1.2

AlphaFold 2 🔗

Detailed information can be found at: https://github.com/deepmind/alphafold

Genetic Databases 🔗

AlphaFold and RoseTTAFold use distinct databases, each optimized for the corresponding algorithm. The AlphaFold databases are located at:

/Applic.HPC/data/alphafold/ 
|-- bfd
|-- mgnify
|-- params
|-- pdb70
|-- pdb_mmcif
|-- pdb_seqres
|-- small_bfd
|-- uniclust30
|-- uniprot
`-- uniref90

The complete set of databases is around 5 TB in size, and downloading and unpacking it takes more than 50 h. Therefore: PLEASE DO NOT DOWNLOAD THESE DATABASES AGAIN!

Native 🔗

Interactive session 🔗

AlphaFold has been updated to version 2.1.1, which includes the multimer feature, and has been compiled for the skylake GPU nodes as well as the zen3 nodes.

Before you start, do the following steps:

Create a suitable directory for your calculations on scratch, e.g. /scratch/tmp/$USER/AlphaFold/
Create sub-directories for any additional locations you want to use (here we create a results folder as well as a folder for storing the initial FASTA file)
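
The two preparation steps above might look like the following sketch. The sub-directory names `Results` and `Fasta` and the helper name `prepare_workdir` are illustrative assumptions, not names the cluster requires.

```shell
# Create the working directory plus the sub-directories used later.
# "Results" and "Fasta" are example names (assumptions); rename as needed.
prepare_workdir() {
    local base="$1"
    mkdir -p "$base/Results" "$base/Fasta"
}
```

Called as `prepare_workdir "/scratch/tmp/$USER/AlphaFold"`, this creates the directory from the example above together with both sub-directories.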

In an interactive session on a GPU node, the AlphaFold module can be loaded as follows:

module load palma/2020b
module load fosscuda
module load AlphaFold/2.1.1

To run AlphaFold (2.1.1), create a folder in your scratch directory and copy your sequence file (e.g. in FASTA format) into it.
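
Before copying, a quick sanity check on the sequence file can save a failed run. The sketch below only assumes the standard FASTA convention that every record starts with a `>` header line; the function name `count_fasta_records` and the file name in the usage note are hypothetical.

```shell
# Print the number of records in a FASTA file (header lines start with ">").
# Returns non-zero if the file is missing or contains no header line.
count_fasta_records() {
    local fasta="$1"
    if [ ! -f "$fasta" ]; then
        echo "file not found: $fasta" >&2
        return 1
    fi
    local n
    # grep -c exits non-zero on zero matches but still prints 0
    n=$(grep -c '^>' "$fasta" || true)
    echo "$n"
    [ "$n" -gt 0 ]
}
```

For a multimer run, `count_fasta_records my_complex.fasta` should report one record per chain. Afterwards the file can be copied with `cp my_complex.fasta /scratch/tmp/$USER/AlphaFold/`.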

Submission to the batch system 🔗

For submission to the batch system, the following script can be adapted:

⚠️

Adjust the job script for your data! Don't just copy-paste it and expect it to work.

#!/bin/bash
#SBATCH --partition=gpuv100
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --gpus=1
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=60G
#SBATCH --time=1-23:59:00
#SBATCH --job-name=alphafold
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_account@uni-muenster.de

# Load the toolchain and the AlphaFold module
module load palma/2021a
module load foss
module load AlphaFold/2.1.1-CUDA-11.3.1

export ALPHAFOLD_DATA_DIR=/Applic.HPC/data/alphafold

# Replace Input_path with the path to your FASTA file.
# --model_preset defaults to monomer; multimer is selected here.
# Do not put comments after a trailing backslash: that breaks the line continuation.
alphafold \
 --fasta_paths=Input_path \
 --model_preset=multimer \
 --output_dir=/scratch/tmp/$USER/AlphaFold/Results \
 --max_template_date=2021-11-25 \
 --is_prokaryote_list=false \
 --db_preset=reduced_dbs \
 --data_dir=/Applic.HPC/data/alphafold
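
Once the script is adapted, it can be handed to Slurm with `sbatch`. The wrapper below is a minimal sketch: the function name `submit_alphafold` and the script file name in the usage note are hypothetical, and the existence check is an added convenience rather than part of the cluster setup.

```shell
# Submit a job script to Slurm after checking that the file exists.
submit_alphafold() {
    local script="$1"
    if [ ! -f "$script" ]; then
        echo "job script not found: $script" >&2
        return 1
    fi
    sbatch "$script"
}
```

For example, `submit_alphafold alphafold_job.sh` submits the script, and `squeue -u $USER` then lists your pending and running jobs.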

Container 🔗

Running Docker containers on the cluster is not allowed for security reasons. We therefore provide container images for Singularity (a containerization tool for HPC purposes):

for skylake nodes (normal, gpuv100): /Applic.HPC/container/alphafold_skylake-latest.sif
for ivybridge/sandybridge nodes (gputitanrtx, gpu2080): /Applic.HPC/container/alphafold_ivybridge-latest.sif

We created an AlphaFold module that automatically loads Singularity and sets the environment variable $ALPHAFOLD_SIFIMAGE to the correct image path.
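
The module takes care of this selection for you. Purely to illustrate the mapping listed above, a manual selection could be sketched as follows; the function name `select_alphafold_image` is hypothetical, and using Slurm's `SLURM_JOB_PARTITION` variable (see the usage note) is an assumption about how you would obtain the partition name.

```shell
# Map a partition name to the container image listed for its node type.
select_alphafold_image() {
    case "$1" in
        normal|gpuv100)
            echo "/Applic.HPC/container/alphafold_skylake-latest.sif" ;;
        gputitanrtx|gpu2080)
            echo "/Applic.HPC/container/alphafold_ivybridge-latest.sif" ;;
        *)
            echo "no AlphaFold image listed for partition: $1" >&2
            return 1 ;;
    esac
}
```

Inside a job, `select_alphafold_image "$SLURM_JOB_PARTITION"` would print the matching path. In practice, loading the module and using `$ALPHAFOLD_SIFIMAGE` is simpler.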

Starting AlphaFold 🔗

You can find an example job script showing how to run AlphaFold on PALMA below. Before you start, do the following steps:

Create a suitable directory for your calculations on scratch, e.g. /scratch/tmp/$USER/AlphaFold/
Create sub-directories for any locations you additionally want to use inside the container (here we create a results folder as well as a folder for storing the initial fasta file)
- Those directories have to be bind-mounted into the container! (The -B flag in the singularity run command)
Create a job script or use an interactive SLURM session to request resources on the cluster. You should request a minimum of 8 cores and 64 GB of memory. GPUs are supported as well.

⚠️