I successfully installed and ran HADDOCK3 locally on macOS, and it works beautifully. Thank you for developing such a versatile and powerful framework; coming from an experimental training background, I’ve found HADDOCK extremely helpful in facilitating my research.
I have a couple of questions regarding the HADDOCK3-alascan workflow and MD simulation:
Alascan workflow:
What is the recommended cutoff for identifying hotspot residues? Is there a generally accepted threshold (e.g., Δscore < -2 or -3), or should hotspots be defined statistically (e.g., as outliers beyond 1.5×IQR among interface residues within 5 Å)? I would appreciate guidance on best practices for interpreting these results. I also have a question about the number of models on which to run the alascan workflow: in the HADDOCK3 paper only the top model is analyzed and shown, while the user manual indicates that multiple refined models are generated via a specified sampling factor.
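As a toy illustration of the statistical definition mentioned above (not an official HADDOCK3 recommendation), the 1.5×IQR rule could be sketched like this; the residue IDs and Δscores below are made up, and the sign convention (more negative = stronger hotspot candidate) follows the question:

```python
# Illustrative sketch only: flag hotspot candidates as lower-tail
# 1.5*IQR outliers among hypothetical per-residue alascan delta-scores.
from statistics import quantiles

delta_scores = {  # hypothetical values, not real alascan output
    "A:45": -4.2, "A:52": -0.8, "A:60": -1.0, "A:71": -0.9,
    "A:78": -0.7, "A:83": -0.4, "B:17": -3.6, "B:23": -0.3,
    "B:30": -0.5, "B:41": -0.2, "B:55": -0.6, "B:62": -0.3,
}

q1, _, q3 = quantiles(delta_scores.values(), n=4)
iqr = q3 - q1
cutoff = q1 - 1.5 * iqr  # lower-tail outliers only
hotspots = sorted(res for res, s in delta_scores.items() if s < cutoff)
print(hotspots)  # ['A:45', 'B:17'] for this toy data
```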
OpenMM / MD performance:
I managed to install OpenMM, and it is correctly invoked within HADDOCK3 runs. However, I noticed that it appears to run on CPU only, making it significantly slower than standalone OpenMM usage on macOS, where GPU acceleration can be enabled (e.g., by specifying the OpenCL platform).
Is there currently a way to specify the OpenMM platform within HADDOCK3, or are there plans to support GPU acceleration in future releases?
Finally, I would like to express my strong interest in potential future support for cryo-EM restraints to guide docking workflows in HADDOCK3, as available in the HADDOCK 2.4/2.5 web server.
Dear user,
Thanks for your interest in using HADDOCK3 for your research.
In [alascan] module:
We do not recommend any specific cutoff for identifying hotspot residues; rather, the module gives a qualitative appreciation of the residues that are potentially important / specific for the interaction.
Indeed, in the paper we only show the result for one model. But the module can be run on multiple models. If they are clustered, the results are averaged among cluster members.
[openmm] module:
Usually, OpenMM detects by itself whether a GPU is available (tested on Linux only) and should use it by default, falling back to CPU mode if no GPU is detected. This probably means that it cannot detect the GPU on macOS by default. I will try to investigate, but I first need to find a macOS machine with a GPU.
Cryo-EM: It is planned to be added back to the code in the future, but I cannot yet announce any date.
Thanks for the clarification, that makes sense. Initially, I was hoping for something more directly comparable to HADDOCK-suite tools like Spot-On, mainly to get clearer leads.
I also tried running alascan on multiple models by first using the mdref module to generate refined structures from the top-ranked model. However, I noticed that the per-interface-residue variation is still quite high. I’m considering increasing the sampling factor (e.g., from 10 to 50) to see if the results converge better; does that sound like a reasonable approach to you?
Also, thank you for considering both OpenMM GPU support and future EM restraint incorporation; that would be fantastic to see. Really appreciate the continued development!
I think it is a great idea to run an MD refinement before:
# ... something before
# Perform MD refinement
[mdref]
sampling_factor = 10 # for each input molecule, generate 10 MD runs
# If you only have 1 input molecule, you could increase this value
# Cluster them
[clustfcc]
# Run the alanine-scanning module
[alascan]
Interesting… could that be related to my setup? I built a new haddock image containing openmm and pdbfixer, so when I run this new image (haddock-openmm), could that be why it won’t use the GPU?
(base) ~ % nano Dockerfile.openmm
# paste the lines below into the Dockerfile, then write out and exit the editor.
FROM haddock3
# Install OpenMM + pdbfixer
RUN python -m pip install --no-cache-dir openmm pdbfixer
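For reference, building and running such an image might look like the following; the image tags follow the Dockerfile above, while the mount path and config-file name are placeholders for your own setup:

```shell
# Build the extended image from the Dockerfile shown above
# (assumes a local base image tagged "haddock3" already exists).
docker build -f Dockerfile.openmm -t haddock-openmm .

# Run a HADDOCK3 job inside the container; /data and run.cfg
# are placeholders for your own working directory and config.
docker run --rm -v "$PWD":/data -w /data haddock-openmm haddock3 run.cfg
```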
I assume your haddock3 is installed locally, so basically I need to install both packages in the same virtual environment (e.g., using conda), and it should then work?