AlphaFold

AlphaFold 3 is a model capable of high-accuracy structure prediction for complexes containing proteins, nucleic acids, small molecules, ions, and modified residues.

Google DeepMind has changed the license structure of the AlphaFold software for the version 3 release. The software source code remains open source.

The model parameters are the result of training the AlphaFold model and are required to run the inference calculation that predicts molecular structure. They are distributed separately from the source code under their own terms of use, and PALMA users who wish to run AlphaFold 3 need to read and understand those terms of use.

Obtaining the Authorization for the Model Parameters

You can request a copy of the model parameters by filling out and submitting this form. Approval of the request is entirely at the discretion of Google DeepMind. The HPC staff of the University of Münster cannot assist with filling out the form or intervene on your behalf with Google DeepMind if the request is rejected.

📣

In order to comply with the terms of use, each AlphaFold 3 user on the HPC cluster needs to request, download, and use their own personal copy of the model parameters.

To fill in the form you will need to:

Read and understand the section titled Key things to know when using the AlphaFold 3 model parameters and output.
Enter your uni-muenster email address in the first email field.
In the second email field, enter an email address that ends in gmail.com. If you don't have a Gmail address, visit gmail.com and create one.
In the field titled “URL of public-facing website for non-commercial organization”, enter https://www.uni-muenster.de
When all the fields on the first page are filled in, click the Next button.
The second page has a single question that asks “Do you intend to provide access to the AlphaFold 3 model parameters to other researchers within your non-commercial organization? (E.g. as part of a centrally managed computing cluster)”. Answer No.
Click Next, read the text on the last page, and if you are satisfied you understand the terms of use, complete the form and submit it.

The waiting time for a reply from the form submission ranges from a few hours to a few days.

Download the model parameters

An email with a download link will be sent to the address you entered in the first email field of the request form. Clicking the link takes you to a Google Drive page where you can download the model parameters. The parameters consist of a single file approximately 1 GB in size. This file can then be stored on the HPC in your /scratch/tmp/... folder.

Set up your AlphaFold environment and run your first prediction

Before running AlphaFold 3 you need to set up an environment containing some standard folders for input, output, and temporary files. To do so, first download this script, copy it to your /scratch/tmp/user/ folder, and make it executable:

chmod +x setup_af.sh

The script takes as input the name of a new project folder that will be created, e.g.:

./setup_af.sh my_project_name

This will create a folder with the following subfolders and template files:

my_project_name
|____AF_ENV
|____TEMPLATE_slurm.sh
|____af_input
| |____TEMPLATE_model.json
|____af_output
|____af_public_databases
|____af_weights
|____tmp
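
As a quick sanity check after running the setup script, the layout above can be verified from the shell. The following is a minimal sketch; check_af_project is a hypothetical helper written for this page, not part of setup_af.sh, and it only tests for the entries listed in the tree above.

```shell
# Hypothetical helper (not part of setup_af.sh): verify that a project
# folder contains the directories and files shown in the tree above.
check_af_project() {
    af_project_dir="$1"
    af_missing=0
    # Directories expected inside the project folder
    for af_entry in af_input af_output af_public_databases af_weights tmp; do
        if [ ! -d "$af_project_dir/$af_entry" ]; then
            echo "missing directory: $af_entry" >&2
            af_missing=1
        fi
    done
    # Files expected inside the project folder
    for af_entry in AF_ENV TEMPLATE_slurm.sh af_input/TEMPLATE_model.json; do
        if [ ! -f "$af_project_dir/$af_entry" ]; then
            echo "missing file: $af_entry" >&2
            af_missing=1
        fi
    done
    return "$af_missing"
}
```

For example, check_af_project my_project_name returns a zero exit status when nothing is missing and prints each missing entry otherwise.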

The files describing the predictions to be run should be stored in the af_input folder, in JSON format. The folder already contains a TEMPLATE_model.json: adapt it to your own needs, or refer to the official AlphaFold documentation for a first working example.
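
To give an idea of what such an input file looks like, here is a minimal sketch that writes a single-chain protein job in the AlphaFold 3 JSON dialect. The file name example_model.json, the job name, and the amino-acid sequence are made-up placeholders; the generated TEMPLATE_model.json and the official input documentation remain the reference for the full set of fields.

```shell
# Hypothetical project folder name, matching the setup example above
PROJECT_DIR="${PROJECT_DIR:-my_project_name}"
mkdir -p "$PROJECT_DIR/af_input"

# Write a minimal single-chain protein job.
# The sequence below is an arbitrary placeholder, not a real target.
cat > "$PROJECT_DIR/af_input/example_model.json" <<'EOF'
{
  "name": "example_model",
  "modelSeeds": [1],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 1
}
EOF
```

Each additional chain or ligand would be a further entry in the sequences list, and more seeds can be listed in modelSeeds to obtain more predictions.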

Then edit the environment file my_project_name/AF_ENV to change two variables:

the name of the JSON file to be used for the prediction (JSON_NAME)
the path to the model parameters downloaded in the previous step (WEIGHT_DIR)

Finally, edit TEMPLATE_slurm.sh according to your needs.

📣

Please keep in mind that AlphaFold 3 runs only on GPUs with compute capability 8.0 or higher. Please check the list of GPU nodes.
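
If you are unsure whether the GPU you were allocated meets this requirement, you can query it from within a job. The sketch below assumes that nvidia-smi is available on the GPU node and that the installed driver is recent enough to support the compute_cap query field; meets_af3_requirement is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given compute capability
# (e.g. "8.6") is at least 8.0.
meets_af3_requirement() {
    awk -v cc="$1" 'BEGIN { exit !(cc + 0 >= 8.0) }'
}

# On a GPU node, report the name and compute capability of each GPU.
# Assumes a driver that supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r gpu_name gpu_cc; do
        gpu_cc=$(echo "$gpu_cc" | tr -d ' ')
        if meets_af3_requirement "$gpu_cc"; then
            echo "$gpu_name: compute capability $gpu_cc, OK"
        else
            echo "$gpu_name: compute capability $gpu_cc, too old"
        fi
    done
fi
```

On a login node without GPUs the query is skipped silently, so the check is best placed at the top of the Slurm script.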

The output is written to the af_output folder. The other directories are needed by the container to recreate its mount points.

Performance enhancement

The official AlphaFold 3 GitHub repository has useful documentation on performance. In particular, out-of-memory (OOM) errors can be handled by enabling unified memory, which is done by setting the following environment variables in the call to the container:

# Run AlphaFold job from the system-wide container.
# The three --env lines are the additions that enable unified memory.
# The arguments to run_alphafold.py are omitted here ("...").
apptainer exec \
    --containall \
    --nv \
    --mount type=bind,src=$(pwd)/,dst=$HOME \
    --mount type=bind,src=$DBASE_DIR/,dst=$HOME/af_public_databases \
    --mount type=bind,src=$WEIGHT_DIR/,dst=$HOME/af_weights \
    --mount type=bind,src=$TMP_DIR/,dst=/tmp \
    --env XLA_PYTHON_CLIENT_PREALLOCATE=false \
    --env TF_FORCE_UNIFIED_MEMORY=true \
    --env XLA_CLIENT_MEM_FRACTION=3.2 \
    $AF_CONTAINER \
    bash -c "python /app/alphafold/run_alphafold.py ..."

Downloads

Setup shell file