Ensemble runs vs. multiple runs on single conformations

I calculated normal modes and extracted three interesting models from each of the lowest frequency modes for both my receptor (protein) and ligand (DNA). I have 15 models of the protein and 15 of the DNA, meaning 225 combinations in the sampling module. I am currently running all models against each other in single runs, but I don’t know how to compare scores between these runs when they are finished.

Also, I am thinking of doing an ensemble run of all 15 x 15, but the problem is that the max sampling of 50000 would mean that each conformer only would be sampled ~222 times. I was thinking that I can do an initial ensemble run and then use --extend-run later, but don’t really understand it and haven’t made it work yet.

There is something similar to my issue here, but I am using haddock3 locally: Ensemble vs separate docking runs.

Thanks in advance

I calculated normal modes and extracted three interesting models from each of the lowest frequency modes for both my receptor (protein) and ligand (DNA). I have 15 models of the protein and 15 of the DNA, meaning 225 combinations in the sampling module. I am currently running all models against each other in single runs, but I don’t know how to compare scores between these runs when they are finished.

Provided you have the same information to drive the docking, you should be able to compare directly the HADDOCK scores.

You will find the stats in the CAPRI analysis directories (provided you added those modules to your workflow).
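If it helps, here is a minimal sketch of pooling the per-model scores from many separate runs into one ranked list. It assumes that every run sits under one parent directory, that each CAPRI analysis step wrote a tab-separated capri_ss.tsv file, and that this file has "model" and "score" columns; check these names against your own output before relying on it. The function name collect_scores is only illustrative.

```shell
# Sketch: pool the single-structure scores of all runs below one directory.
# Assumptions (verify against your own runs): the file is called capri_ss.tsv,
# it is tab-separated, and its header contains "model" and "score" columns.
collect_scores() {
    # $1: parent directory that contains all run directories
    find "$1" -name capri_ss.tsv | while IFS= read -r tsv; do
        awk -F'\t' -v src="$tsv" '
            # Header line: remember the position of every column name
            NR == 1 {
                for (i = 1; i <= NF; i++) col[$i] = i
                if (!("score" in col) || !("model" in col)) exit 1
                next
            }
            # Data lines: print score, source file and model name
            { print $(col["score"]) "\t" src "\t" $(col["model"]) }
        ' "$tsv"
    done | sort -t "$(printf '\t')" -k1,1n   # lowest (best) score first
}
```

Calling `collect_scores /path/to/all_runs | head` would then show the best-scoring models across all runs. Note that a run with several analysis steps contributes one file per step, so you may want to restrict the search to the final step of each run.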

Also, I am thinking of doing an ensemble run of all 15 x 15, but the problem is that the max sampling of 50000 would mean that each conformer only would be sampled ~222 times. I was thinking that I can do an initial ensemble run and then use --extend-run later, but don’t really understand it and haven’t made it work yet.

With such an approach, you would score all combinations directly in one run.

Note that you can raise the maximum number of models sampled (the upper limit of the sampling parameter): in your HADDOCK3 installation, open the src/haddock/modules/sampling/rigidbody/defaults.yaml file, search for the sampling parameter and increase its maximum value.

Do it at your own risk :)
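To get a feeling for how high the limit would need to go, the arithmetic can be sketched as below. It assumes the sampling is spread evenly over all receptor/ligand combinations, and the target of 1000 models per combination is just a placeholder, not a recommendation.

```shell
# Sketch of the sampling arithmetic for a 15 x 15 ensemble run.
n_protein=15          # protein conformers in the ensemble
n_dna=15              # DNA conformers in the ensemble
max_sampling=50000    # current maximum of the sampling parameter

combinations=$(( n_protein * n_dna ))
per_combination=$(( max_sampling / combinations ))   # integer division

echo "$combinations combinations, about $per_combination models each"

# Hypothetical target: how much sampling would 1000 models per combination need?
target_per_combination=1000
required_sampling=$(( combinations * target_per_combination ))
echo "sampling needed for $target_per_combination per combination: $required_sampling"
```

With the numbers from this thread, that gives 225 combinations at roughly 222 models each, and a required sampling of 225000 for the placeholder target.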

Thanks for the nice answers to both scenarios!

For now I plan to finish the 1 x 1 conformations for all 15 of both receptor and DNA, place them in the same folder, rename them and (re)score them using this command:

for file in *.pdb; do haddock3-score --keep-all --full "${file}"; mv haddock-score-client/ "${file}_haddock-score-client"; done

I was just wondering about this part:

“Provided you have the same information to drive the docking, you should be able to compare directly the HADDOCK scores.”

I used the same restraints and parameters, yes. However, if one does not have high confidence in the restraints, does it not make sense to rescore the models, as I suggested in my previous post? My understanding is that haddock3-score removes the restraint bias from the initial HADDOCK scores(?)

I was just wondering about this part:

“Provided you have the same information to drive the docking, you should be able to compare directly the HADDOCK scores.”

I used the same restraints and parameters, yes. However, if one does not have high confidence in the restraints, does it not make sense to rescore the models, as I suggested in my previous post? My understanding is that haddock3-score removes the restraint bias from the initial HADDOCK scores(?)

You can indeed do that, or, following the scoring example, run the em scoring module on the ensemble of models.

You can also add that em scoring module directly to your workflow (if you have not run everything yet) to get the “unrestrained” scores.
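As a rough sketch of the first option, the helper below writes a scoring-only workflow file for an ensemble of models. The section names ([topoaa], [emscoring], [caprieval]) and the general parameters are assumed from the scoring example shipped with haddock3, so compare them with that example before use. The function name, file names and core count are placeholders.

```shell
# Sketch: write a scoring-only haddock3 workflow for an ensemble PDB.
# Assumption: section and parameter names follow the haddock3 scoring example;
# a protein-DNA system may need extra settings that are not shown here.
write_emscoring_cfg() {
    # $1: multi-model ensemble PDB, $2: run directory name, $3: config file to write
    cat > "$3" <<EOF
# Scoring-only workflow (sketch, adapt to your setup)
run_dir = "$2"
mode = "local"
ncores = 4

molecules = ["$1"]

[topoaa]

[emscoring]

[caprieval]
EOF
}
```

For example, `write_emscoring_cfg all_models.pdb run-emscoring emscoring.cfg` followed by `haddock3 emscoring.cfg` would rescore every model in all_models.pdb, where all_models.pdb is a hypothetical ensemble file you have assembled from the docked models.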