Variability between HADDOCK runs for protein–DNA docking (wild-type vs mutant comparison)

Hello everyone,

I’m currently working on a protein–DNA docking study comparing the wild-type and mutant forms of the same protein.

For this, I’m using the HADDOCK2.4 web server. The DNA structure used in the docking is extracted from a known crystallographic complex involving the same protein (wild-type form) to ensure a realistic conformation. I kept all docking parameters (restraints, input structures, etc.) exactly the same across all runs.

However, I’m observing high variability in the results between two identical submissions, even for the wild-type protein. This makes it difficult to compare docking scores reliably and to interpret the effect of mutations.

For example, for the wild-type form (same input, same settings):

  • HADDOCK score: -141.7 (run 1) vs -137.8 (run 2)
  • Electrostatic energy: -659.5 vs -630.2
  • RMSD: 0.5 Å vs 0.6 Å
  • Z-score: -2.1 vs -1.4
    (and so on)

I understand that HADDOCK involves stochastic steps, but I’m looking for advice on:

  1. How to reduce this variability or make the results more reproducible.
  2. Best practices to interpret HADDOCK scores and differences between wild-type and mutant forms despite this variability.
  3. Recommended protocols for comparing mutant vs wild-type proteins interacting with DNA, especially when using HADDOCK2.4 (web) or HADDOCK3.

Would running multiple replicas and comparing cluster average scores with standard deviations be the most robust strategy?

I’m also open to running HADDOCK locally if that helps improve reproducibility.

Thank you in advance for any guidance, tips, or shared experience.

Best regards,

When comparing the runs, what are the standard deviations associated with the scores?

If the scores overlap within their standard deviations, they are not statistically different.
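As a minimal sketch of that overlap check (the standard deviations below are hypothetical, not values from an actual run):

```python
# Sketch: decide whether two HADDOCK cluster scores are distinguishable,
# given their means and standard deviations.

def overlap_within_sd(mean_a, sd_a, mean_b, sd_b):
    """True if the intervals mean +/- sd overlap, i.e. the scores are
    not distinguishable at this crude level."""
    return abs(mean_a - mean_b) <= sd_a + sd_b

# Run-1 vs run-2 wild-type scores from above, with hypothetical SDs
run1 = (-141.7, 5.3)
run2 = (-137.8, 4.8)

print(overlap_within_sd(*run1, *run2))  # -> True: intervals overlap, not significant
```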

You can of course also run HADDOCK2.5 locally; in that case you should make sure to run all runs on the same hardware (which is not guaranteed when using the web server).

Thank you for your feedback.

I will run several replicates to estimate the standard deviations of the scores and better assess the significance of the differences between the wild-type and mutant forms.
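Something like the following could summarise those replicates (pure Python; the replicate scores are made up for illustration):

```python
# Sketch: summarise HADDOCK scores over several replicate runs per variant.
# All replicate scores below are made-up placeholder values.
from statistics import mean, stdev

replicas = {
    "wild-type": [-141.7, -137.8, -139.5, -142.3],
    "mutant":    [-121.4, -118.9, -124.0, -120.2],
}

for variant, scores in replicas.items():
    m, s = mean(scores), stdev(scores)
    print(f"{variant}: {m:.1f} +/- {s:.1f} (n={len(scores)})")
```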

Regarding the local installation of HADDOCK, I’m interested in this option. Which version would you recommend for this type of study (protein–DNA docking with comparison of variants): HADDOCK 2.5 or HADDOCK3? And how can I obtain access to download these versions?

Thank you again for your advice.

Best regards,

Are you using the refinement interface with energy minimisation only?

If not, the results page does report the scores with their standard deviations, calculated on the top 4 models of each cluster.

Thank you for your response.

In my initial approach, I separately extracted the protein (from complex A) and the DNA (from complex B, which contains the same protein).
I then introduced mutations into the protein using PyMOL and performed docking using the standard HADDOCK 2.4 interface, in order to compare interactions between the wild-type and mutant forms.

Your question led me to consider a more rigorous alternative: using the full complex B structure, introducing mutations in PyMOL, and then submitting both the wild-type and mutant forms to the Refinement only interface in HADDOCK, using the energy minimisation only mode.

This strategy would help maintain a consistent structural context while reducing variability introduced by stochastic docking steps.
Do you think this approach is more appropriate for reliably assessing the impact of mutations on protein–DNA interactions?
Thank you again for your advice.

Best regards,

If you want a bit more refinement, and also some variability in the refined models, consider using the water refinement protocol instead of EM.
This will give you an average score with a standard deviation.
