Different results on Intel/AMD machines

unmerged · July 19, 2022, 2:34am

I’m getting different results on an Intel and an AMD machine when running examples/docking-protein-protein/docking-protein-protein-test.cfg.

Command to run:

cd examples/docking-protein-protein
haddock3 docking-protein-protein-test.cfg

AMD’s run1-test/7_caprieval/capri_ss.tsv:

model   md5 caprieval_rank  score   irmsd   fnat    lrmsd   ilrmsd  dockq   cluster-id  cluster-ranking self.model-cluster-ranking
../6_emref/emref_1.pdb  -   1   -110.599    9.146   0.111   17.011  15.393  0.112   -   -   -
../6_emref/emref_2.pdb  -   2   -105.002    2.145   0.556   3.893   3.476   0.570   -   -   -
../6_emref/emref_3.pdb  -   3   -88.291 9.267   0.111   16.617  13.212  0.115   -   -   -
../6_emref/emref_4.pdb  -   4   -81.664 11.176  0.139   18.428  18.058  0.111   -   -   -
../6_emref/emref_5.pdb  -   5   -74.947 11.047  0.111   18.039  17.959  0.104   -   -   -

Intel’s:

model   md5 caprieval_rank  score   irmsd   fnat    lrmsd   ilrmsd  dockq   cluster-id  cluster-ranking self.model-cluster-ranking
../6_emref/emref_1.pdb  -   1   -110.232    1.450   0.694   2.915   2.326   0.702   -   -   -
../6_emref/emref_2.pdb  -   2   -92.448 2.680   0.500   5.157   4.130   0.490   -   -   -
../6_emref/emref_5.pdb  -   3   -92.102 10.946  0.028   19.065  19.475  0.071   -   -   -
../6_emref/emref_3.pdb  -   4   -74.786 10.115  0.028   16.993  15.180  0.083   -   -   -
../6_emref/emref_4.pdb  -   5   -72.464 3.774   0.222   6.956   5.266   0.319   -   -   -
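Two tables like these can also be compared programmatically rather than by eye. The following is only a minimal sketch: the helper names `read_capri` and `compare_capri` are made up for illustration and are not part of haddock3, and it assumes that a model file name (for example emref_1.pdb) refers to the same sampling slot on both machines.

```python
def read_capri(text):
    """Parse capri_ss.tsv content into {model: row dict}, splitting on any whitespace."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = (ln.split() for ln in lines[1:])
    return {row[0]: dict(zip(header, row)) for row in rows}


def compare_capri(text_a, text_b, columns=("caprieval_rank", "score", "dockq")):
    """Return (model, column, value_a, value_b) for every value that differs."""
    a, b = read_capri(text_a), read_capri(text_b)
    diffs = []
    for model in sorted(a.keys() & b.keys()):  # only models present in both tables
        for col in columns:
            if a[model][col] != b[model][col]:
                diffs.append((model, col, a[model][col], b[model][col]))
    return diffs


HEADER = ("model md5 caprieval_rank score irmsd fnat lrmsd ilrmsd dockq "
          "cluster-id cluster-ranking self.model-cluster-ranking\n")
# First two rows of each table, copied from the post above.
AMD = HEADER + (
    "../6_emref/emref_1.pdb - 1 -110.599 9.146 0.111 17.011 15.393 0.112 - - -\n"
    "../6_emref/emref_2.pdb - 2 -105.002 2.145 0.556 3.893 3.476 0.570 - - -\n"
)
INTEL = HEADER + (
    "../6_emref/emref_1.pdb - 1 -110.232 1.450 0.694 2.915 2.326 0.702 - - -\n"
    "../6_emref/emref_2.pdb - 2 -92.448 2.680 0.500 5.157 4.130 0.490 - - -\n"
)

for model, col, amd_value, intel_value in compare_capri(AMD, INTEL):
    print(f"{model}\t{col}\tAMD={amd_value}\tIntel={intel_value}")
```

With real files, the same function could be called as `compare_capri(open(path_a).read(), open(path_b).read())`.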

Other info:
1. Both use haddock3 git commit 0dad275
2. Same CNS binary (compiled on the Intel machine and then copied to the AMD machine)
3. Both run Ubuntu 22.04
4. CPU:

cat /proc/cpuinfo  | grep 'name'| uniq
model name      : 11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz
model name	: AMD EPYC 7532 32-Core Processor

5. In the 0_topoaa folder, the *.inp files are identical, and the *.psf files are the same except for the one line giving the output date
6. I’m getting slightly different _haddock.pdb files in the 0_topoaa folder, as follows (left is Intel)

amjjbonvin · July 19, 2022, 8:33am

You cannot get exactly the same results when running on different hardware.
This is due to the chaotic nature of the computations. Full reproducibility is only achievable on the same hardware.
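To illustrate what "chaotic" means here, the toy example below shows two generic effects. It is not CNS or HADDOCK code, and the choice of the logistic map is purely an assumption for illustration. First, floating-point addition is not associative, so the same sum evaluated in a different order can differ in the last bits. Second, in a chaotic iteration such a last-bit difference grows until the two trajectories have nothing in common.

```python
def logistic_trajectory(x0, steps):
    """Iterate the chaotic logistic map x -> 4x(1 - x), returning all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


# 1) Same three numbers, different summation order, different result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("sums equal?", left == right)

# 2) Two starting points that differ by about 1e-12 (a stand-in for a rounding difference).
a = logistic_trajectory(0.3, 100)
b = logistic_trajectory(0.3 + 1e-12, 100)
gaps = [abs(p - q) for p, q in zip(a, b)]
first_large = next((i for i, g in enumerate(gaps) if g > 0.1), None)
print("initial gap:", gaps[0])
print("first step with gap > 0.1:", first_large)
```

A long minimisation or dynamics run behaves in the same spirit: each individual trajectory is sensitive to tiny numerical differences, which is why comparing overall statistics is more meaningful than comparing individual models.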

amjjbonvin · July 19, 2022, 9:48am

PS: To compare two hardware setups, it is better to perform a full run and check whether the results are consistent (they won’t be exactly the same).

unmerged · July 19, 2022, 3:44pm

Thanks for the reply.

I ran haddock3 with docking-protein-protein-full.cfg on three local setups:

setup1: 11th Gen Intel(R) Core™ i7-11700 @ 2.50GHz
setup2: Intel(R) Core™ i9-10900 CPU @ 2.80GHz
setup3: AMD EPYC 7532 32-Core Processor

and here’s what I found:

setup1 and setup2 output exactly the same 08_caprieval/capri_ss.tsv, down to the last digit:

model   md5     caprieval_rank  score   irmsd   fnat    lrmsd   ilrmsd  dockq   cluster-id      cluster-ranking self.model-cluster-ranking
../07_emref/emref_32.pdb        -       1       -133.753        2.090   0.694   3.979   3.088   0.618   -       -       -
../07_emref/emref_7.pdb -       2       -123.898        1.901   0.694   3.476   2.849   0.645   -       -       -
../07_emref/emref_20.pdb        -       3       -120.499        0.973   0.833   1.497   1.226   0.836   -       -       -
../07_emref/emref_44.pdb        -       4       -119.955        0.917   0.889   1.527   1.416   0.862   -       -       -
../07_emref/emref_78.pdb        -       5       -119.565        0.969   0.889   1.504   1.319   0.855   -       -       -
../07_emref/emref_117.pdb       -       6       -117.615        10.917  0.139   18.054  18.025  0.113   -       -       -
../07_emref/emref_26.pdb        -       7       -117.032        1.693   0.639   2.813   2.158   0.660   -       -       -
../07_emref/emref_8.pdb -       8       -116.344        2.552   0.500   4.728   3.784   0.507   -       -       -
../07_emref/emref_89.pdb        -       9       -115.889        2.445   0.444   4.823   3.066   0.491   -       -       -

Meanwhile, setup3 gives quite different top scores and rankings:

../07_emref/emref_8.pdb -       1       -126.115        1.949   0.722   3.876   2.855   0.641   -       -       -
../07_emref/emref_75.pdb        -       2       -120.761        1.454   0.750   2.981   2.533   0.719   -       -       -
../07_emref/emref_4.pdb -       3       -120.174        1.635   0.722   2.843   2.385   0.693   -       -       -
../07_emref/emref_1.pdb -       4       -117.396        1.371   0.806   2.257   2.011   0.762   -       -       -
../07_emref/emref_3.pdb -       5       -116.627        2.612   0.528   4.944   4.049   0.508   -       -       -
../07_emref/emref_5.pdb -       6       -114.115        3.834   0.444   7.359   5.301   0.383   -       -       -
../07_emref/emref_57.pdb        -       7       -113.359        1.625   0.639   2.678   2.271   0.670   -       -       -
../07_emref/emref_6.pdb -       8       -113.011        2.385   0.500   4.399   3.596   0.524   -       -       -
../07_emref/emref_65.pdb        -       9       -111.223        0.967   0.889   1.818   1.512   0.851   -       -       -

So two machines can produce identical results, as setup1 and setup2 (both Intel) do, which is what I would prefer, while the difference between the AMD and Intel CPUs is not insignificant. Furthermore, jobs on a cluster may be distributed to nodes with different architectures, so output scores and structures could change from run to run, which makes results difficult to reproduce.
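One way to at least detect this on a cluster is to record the CPU model next to each job's output. The sketch below mirrors the `cat /proc/cpuinfo | grep 'name' | uniq` command used earlier in the thread; the function name `cpu_models` is hypothetical, and the code assumes a Linux node where /proc/cpuinfo is available.

```python
def cpu_models(cpuinfo_text):
    """Return the distinct 'model name' entries of /proc/cpuinfo content, in order."""
    models = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            name = value.strip()
            if name not in models:
                models.append(name)
    return models


# Inline sample so the sketch runs anywhere; on a node, pass
# open("/proc/cpuinfo").read() instead.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: AMD EPYC 7532 32-Core Processor\n"
    "processor\t: 1\n"
    "model name\t: AMD EPYC 7532 32-Core Processor\n"
)
print(cpu_models(SAMPLE))
```

Storing this list with each run makes it possible to tell afterwards whether two differing results came from different architectures.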

honoratorv · July 19, 2022, 4:58pm

Thanks for this detailed explanation. Could you please check whether the model
../07_emref/emref_32.pdb from setup1/2 is the same as ../07_emref/emref_8.pdb from setup3?
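One possible way to run such a check without any extra dependencies is a distance RMSD (dRMSD) over the CA atoms, which needs no superposition because it compares internal distances only. This is a sketch under stated assumptions, not a haddock3 tool: the function names are hypothetical, the files are assumed to be uncompressed fixed-column PDB files, and residues are matched on chain ID, segment ID and residue number.

```python
import math
from itertools import combinations


def ca_coords(pdb_text):
    """Map (chain, segid, residue number) -> (x, y, z) for CA atoms in ATOM records."""
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], line[72:76].strip(), line[22:27].strip())
            coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def drmsd(pdb_a, pdb_b):
    """Root-mean-square difference of all CA-CA distances shared by two models."""
    a, b = ca_coords(pdb_a), ca_coords(pdb_b)
    keys = sorted(a.keys() & b.keys())
    if len(keys) < 2:
        raise ValueError("need at least two shared CA atoms")
    squared = [
        (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
        for i, j in combinations(keys, 2)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

A call such as `drmsd(open(path_a).read(), open(path_b).read())` returns a value close to zero when both files contain the same complex, regardless of how each model is positioned in space. `math.dist` requires Python 3.8 or newer.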

Also, @unmerged, please remember that haddock3 in its current state is still very experimental, has not been tested/benchmarked, and is not recommended for production. Please refer to the current production version, HADDOCK2.4.

amjjbonvin · July 19, 2022, 6:28pm

Interesting

Although the scores are different, the quality of the models is quite similar.

Did you check the cluster stats as well?
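The point about model quality can be made concrete by binning the DockQ values of the top 9 models posted above. This is a small sketch, assuming the commonly used DockQ cutoffs (0.23 acceptable, 0.49 medium, 0.80 high); these cutoffs are not stated anywhere in this thread.

```python
# Assumed DockQ quality cutoffs, highest class first.
THRESHOLDS = (("high", 0.80), ("medium", 0.49), ("acceptable", 0.23))


def quality_counts(dockq_values):
    """Count models per quality class; each model is counted in its best class only."""
    counts = {name: 0 for name, _ in THRESHOLDS}
    counts["incorrect"] = 0
    for value in dockq_values:
        for name, cutoff in THRESHOLDS:
            if value >= cutoff:
                counts[name] += 1
                break
        else:  # no cutoff reached
            counts["incorrect"] += 1
    return counts


# DockQ column of the top 9 models from the tables posted above.
setup1_2 = [0.618, 0.645, 0.836, 0.862, 0.855, 0.113, 0.660, 0.507, 0.491]
setup3 = [0.641, 0.719, 0.693, 0.762, 0.508, 0.383, 0.670, 0.524, 0.851]

print("setup1/2:", quality_counts(setup1_2))
# setup1/2: {'high': 3, 'medium': 5, 'acceptable': 0, 'incorrect': 1}
print("setup3:  ", quality_counts(setup3))
# setup3:   {'high': 1, 'medium': 7, 'acceptable': 1, 'incorrect': 0}
```

Under these assumed cutoffs, both runs place 8 of their top 9 models at medium quality or better, even though the individual scores and rankings differ.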
