I’m getting different results on AMD and Intel machines when running examples/docking-protein-protein/docking-protein-protein-test.cfg
Command to run:
cd examples/docking-protein-protein
haddock3 docking-protein-protein-test.cfg
AMD’s run1-test/7_caprieval/capri_ss.tsv:
model md5 caprieval_rank score irmsd fnat lrmsd ilrmsd dockq cluster-id cluster-ranking self.model-cluster-ranking
../6_emref/emref_1.pdb - 1 -110.599 9.146 0.111 17.011 15.393 0.112 - - -
../6_emref/emref_2.pdb - 2 -105.002 2.145 0.556 3.893 3.476 0.570 - - -
../6_emref/emref_3.pdb - 3 -88.291 9.267 0.111 16.617 13.212 0.115 - - -
../6_emref/emref_4.pdb - 4 -81.664 11.176 0.139 18.428 18.058 0.111 - - -
../6_emref/emref_5.pdb - 5 -74.947 11.047 0.111 18.039 17.959 0.104 - - -
Intel’s:
model md5 caprieval_rank score irmsd fnat lrmsd ilrmsd dockq cluster-id cluster-ranking self.model-cluster-ranking
../6_emref/emref_1.pdb - 1 -110.232 1.450 0.694 2.915 2.326 0.702 - - -
../6_emref/emref_2.pdb - 2 -92.448 2.680 0.500 5.157 4.130 0.490 - - -
../6_emref/emref_5.pdb - 3 -92.102 10.946 0.028 19.065 19.475 0.071 - - -
../6_emref/emref_3.pdb - 4 -74.786 10.115 0.028 16.993 15.180 0.083 - - -
../6_emref/emref_4.pdb - 5 -72.464 3.774 0.222 6.956 5.266 0.319 - - -
Other info:
1. Both use haddock3 git commit 0dad275.
2. Same CNS binary (compiled on the Intel machine and then copied to the AMD machine).
3. Both run Ubuntu 22.04.
4. CPUs:
cat /proc/cpuinfo | grep 'name'| uniq
model name : 11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz
model name : AMD EPYC 7532 32-Core Processor
5. In the 0_topoaa folder, the *.inp files are exactly the same, and the *.psf files are the same except for the one line giving the output date.
6. I’m getting slightly different _haddock.pdb files in the 0_topoaa folder (left is Intel in the comparison).
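A quick way to put a number on how "slightly different" two such files are is to compare coordinates atom by atom. The sketch below is only an illustration, not part of haddock3: it assumes both files use standard fixed-column PDB ATOM/HETATM records and that atoms can be matched by chain, residue number and atom name. The function names are hypothetical.

```python
def read_coords(pdb_text):
    """Map (chain, residue number, atom name) -> (x, y, z) for ATOM/HETATM records."""
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed PDB columns: name 13-16, chain 22, resSeq 23-26, x/y/z 31-54.
            key = (line[21], line[22:26].strip(), line[12:16].strip())
            coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def max_deviation(coords_a, coords_b):
    """Return (number of shared atoms, largest per-atom distance between the two sets)."""
    shared = coords_a.keys() & coords_b.keys()
    worst = 0.0
    for key in shared:
        dist = sum((p - q) ** 2 for p, q in zip(coords_a[key], coords_b[key])) ** 0.5
        worst = max(worst, dist)
    return len(shared), worst


def compare_files(path_a, path_b):
    """Convenience wrapper that reads two PDB files from disk and compares them."""
    with open(path_a) as fa, open(path_b) as fb:
        return max_deviation(read_coords(fa.read()), read_coords(fb.read()))
```

Calling compare_files on the Intel and AMD copies of the same _haddock.pdb file would show whether the differences are at the level of the last printed decimal or larger. No superposition is done, which is reasonable here only because both files start from the same input coordinates.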
You cannot get exactly the same results when running on different hardware.
The computations are chaotic by nature, so full reproducibility is only achievable on the same hardware.
PS: To compare two hardware setups, it is better to perform a full run and check whether the results are consistent (they won’t be exactly the same).
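To illustrate what "chaotic" means in practice, here is a toy sketch that has nothing to do with CNS itself. It assumes, as one plausible mechanism that is not confirmed in this thread, that two machines can differ in the last bits of a floating-point result, and it uses the logistic map as a stand-in for any iterative calculation that amplifies small differences. All names are hypothetical.

```python
# Floating-point addition is not associative: the order of operations
# changes the last bits of the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("same sum regardless of order?", left == right)


def steps_until_divergence(x0, eps, tol=0.1, max_steps=1000, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x) from x0 and x0+eps.

    Returns the first step at which the two trajectories differ by more
    than tol, or None if that never happens within max_steps.
    """
    a, b = x0, x0 + eps
    for step in range(1, max_steps + 1):
        a = r * a * (1.0 - a)
        b = r * b * (1.0 - b)
        if abs(a - b) > tol:
            return step
    return None


# A perturbation of 1e-15 in the starting value grows until the two
# trajectories are unrelated.
print("steps until divergence:", steps_until_divergence(0.2, 1e-15))
```

The point is only qualitative: once a tiny numerical difference appears early in a long iterative calculation, the final numbers can differ substantially even though both runs are equally valid.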
Thanks for the reply.
I ran docking-protein-protein-full.cfg on three local setups:
setup1: 11th Gen Intel(R) Core™ i7-11700 @ 2.50GHz
setup2: Intel(R) Core™ i9-10900 CPU @ 2.80GHz
setup3: AMD EPYC 7532 32-Core Processor
Here’s what I found:
setup1 and setup2 output exactly the same 08_caprieval/capri_ss.tsv, down to the last digit:
model md5 caprieval_rank score irmsd fnat lrmsd ilrmsd dockq cluster-id cluster-ranking self.model-cluster-ranking
../07_emref/emref_32.pdb - 1 -133.753 2.090 0.694 3.979 3.088 0.618 - - -
../07_emref/emref_7.pdb - 2 -123.898 1.901 0.694 3.476 2.849 0.645 - - -
../07_emref/emref_20.pdb - 3 -120.499 0.973 0.833 1.497 1.226 0.836 - - -
../07_emref/emref_44.pdb - 4 -119.955 0.917 0.889 1.527 1.416 0.862 - - -
../07_emref/emref_78.pdb - 5 -119.565 0.969 0.889 1.504 1.319 0.855 - - -
../07_emref/emref_117.pdb - 6 -117.615 10.917 0.139 18.054 18.025 0.113 - - -
../07_emref/emref_26.pdb - 7 -117.032 1.693 0.639 2.813 2.158 0.660 - - -
../07_emref/emref_8.pdb - 8 -116.344 2.552 0.500 4.728 3.784 0.507 - - -
../07_emref/emref_89.pdb - 9 -115.889 2.445 0.444 4.823 3.066 0.491 - - -
Meanwhile, setup3 gives quite different top scores and rankings:
../07_emref/emref_8.pdb - 1 -126.115 1.949 0.722 3.876 2.855 0.641 - - -
../07_emref/emref_75.pdb - 2 -120.761 1.454 0.750 2.981 2.533 0.719 - - -
../07_emref/emref_4.pdb - 3 -120.174 1.635 0.722 2.843 2.385 0.693 - - -
../07_emref/emref_1.pdb - 4 -117.396 1.371 0.806 2.257 2.011 0.762 - - -
../07_emref/emref_3.pdb - 5 -116.627 2.612 0.528 4.944 4.049 0.508 - - -
../07_emref/emref_5.pdb - 6 -114.115 3.834 0.444 7.359 5.301 0.383 - - -
../07_emref/emref_57.pdb - 7 -113.359 1.625 0.639 2.678 2.271 0.670 - - -
../07_emref/emref_6.pdb - 8 -113.011 2.385 0.500 4.399 3.596 0.524 - - -
../07_emref/emref_65.pdb - 9 -111.223 0.967 0.889 1.818 1.512 0.851 - - -
So two machines can generate identical results, as setup1 and setup2 (both Intel architectures) do, which is what I would prefer, and the difference between AMD and Intel CPUs is not insignificant. Furthermore, jobs running on a cluster may be distributed to nodes with different architectures, so output scores and structures could change from run to run, which makes reproducing results difficult.
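For comparing runs like these without reading the tables by eye, a small script can summarize each capri_ss.tsv. This is a rough sketch, not a haddock3 tool: it assumes the whitespace-separated layout shown in the tables above (header row with model, caprieval_rank, score and dockq columns), and it uses the conventional DockQ threshold of 0.23 for an "acceptable" model. The function names are hypothetical.

```python
def parse_capri(text):
    """Parse the contents of a capri_ss.tsv file into a list of dicts keyed by column name."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, fields)) for fields in body]


def summarize(rows, top=5, dockq_cutoff=0.23):
    """Summarize the top-ranked models: names in rank order, best score, acceptable count."""
    ranked = sorted(rows, key=lambda r: int(r["caprieval_rank"]))[:top]
    return {
        "top_models": [r["model"] for r in ranked],
        "best_score": min(float(r["score"]) for r in ranked),
        "n_acceptable": sum(float(r["dockq"]) >= dockq_cutoff for r in ranked),
    }


def load_summary(path, top=5):
    """Read a capri_ss.tsv file from disk and summarize its top models."""
    with open(path) as handle:
        return summarize(parse_capri(handle.read()), top=top)
```

Note that the same file name (for example emref_8.pdb) on two machines does not imply the same structure, so summaries based on quality metrics such as dockq are more informative than matching model names across setups.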
Thanks for this detailed explanation. Could you please check whether the model ../07_emref/emref_32.pdb from setup1/2 is the same as ../07_emref/emref_8.pdb from setup3?
Also, @unmerged, please remember that haddock3 in its current state is still very experimental, has not been tested or benchmarked, and is not recommended for production. Please refer to the current production version, HADDOCK2.4.
Interesting
Although the scores are different, the quality of the models is quite similar.
Did you check the cluster stats as well?