Running clustfcc across multiple Haddock runs

ppalandre · March 9, 2026, 3:51pm

Dear Haddock team,

I am trying to predict the binding site of a binder to its target (the binder is a protein engineered to bind to a specific target) by performing protein-protein docking with Haddock3 (standard Haddock2.X workflow). As I have information on which residues are involved in the docking for both the binder and the target, I performed the docking with restraint files. I performed the docking for multiple different binders which have different binding affinities to the target. All binders have the same backbone scaffold with a mutated binding interface (around 10-15 residues are different between each binder), so I used the same restraint files for all runs to enable comparisons across binders. For each binder, I performed three replicates of the docking experiment with different random seeds.

My aim is now to find out if the different binders yield similar docking poses. For this, my strategy is the following:

First, I want to verify that the randomness of docking does not majorly influence my results. For this, I want to find out for each binder whether the triplicates yield similar docking poses.
Second, I want to verify across binders whether I get similar docking poses or not.

In both cases, I would like to cluster by fraction of common contacts the docking results of multiple docking runs (e.g. clustfcc on the poses of all three replicates for one binder). However, I cannot seem to make clustfcc run without running a docking workflow in front of it.

Does this mean that my strategy is not optimal and I should use other tools to find similar docking poses across triplicates and binders? Or is there a way of running clustfcc standalone?

Thank you very much for your help!

All the best,

Pauline

amjjbonvin · March 10, 2026, 8:13am

Hi Pauline

One way of doing that is to generate an ensemble that contains all your conformations from the different runs and then run a scoring workflow, e.g.:


# ====================================================================
# Scoring example

# directory in which the scoring will be done
run_dir = "run1-score-cluster"

# execution mode
ncores = 40
mode = "local"

# ensemble of different complexes to be scored
molecules = ["my-ensemble.pdb"]

# ====================================================================
# Parameters for each stage are defined below

[topoaa]

[emscoring]

[clustfcc]

[seletopclusts]

[caprieval]

# ====================================================================

You can create the ensemble with pdb_mkensemble (part of pdb-tools).

Once the run has finished, you can look in the traceback directory to trace the models back to their original input PDB.
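
For readers who want to see what the ensemble step amounts to, here is a minimal Python sketch. It is not pdb_mkensemble itself and not part of HADDOCK3; it only illustrates the idea of concatenating several single-model PDB files into one multi-model file. The function name build_ensemble and the choice to keep only ATOM, HETATM and TER records are assumptions made for this illustration. It also returns which input file ended up as which model number, which can be handy when relating clusters back to the individual runs.

```python
# Illustrative sketch only: build_ensemble is a hypothetical helper,
# not a pdb-tools or HADDOCK3 function.

# Record types copied into the ensemble (an assumption of this sketch).
COORD_RECORDS = ("ATOM", "HETATM", "TER")


def build_ensemble(pdb_paths, out_path):
    """Write a multi-model PDB file and return {model_number: source_path}."""
    sources = {}
    with open(out_path, "w") as out:
        for number, path in enumerate(pdb_paths, start=1):
            sources[number] = str(path)
            # Open a new model; the serial number is right-justified.
            out.write(f"MODEL     {number:4d}\n")
            with open(path) as handle:
                for line in handle:
                    # Copy coordinate records, skip headers and remarks.
                    if line.startswith(COORD_RECORDS):
                        out.write(line if line.endswith("\n") else line + "\n")
            out.write("ENDMDL\n")
        out.write("END\n")
    return sources
```

It would be called with the list of models collected from the different runs, for example build_ensemble(list_of_pdb_files, "my-ensemble.pdb"), where list_of_pdb_files is whatever set of docked models you want to cluster together. If the models are gzip-compressed, they need to be decompressed first. In practice pdb_mkensemble is the tested tool for this job; the sketch is only meant to show what the resulting file looks like.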

ppalandre · March 13, 2026, 3:37pm

Hello,

Thank you very much, that worked very well!

amjjbonvin · March 13, 2026, 5:33pm

Glad to hear that!
