Usage of cross-link MS-derived distance constraints in HADDOCK 3.0

Dear HADDOCK developers,

I’m trying to use the HADDOCK 3 webserver (Haddock3) to dock an antibody to an antigen using experimental cross-link distance restraints.

A few points are unclear to me, and I would appreciate your input.

I started by defining active and passive residues on both molecules using the available scenario, then clicked “Refine in builder” to add the unambig.tbl file (the distance restraints from XL-MS). I assume the residue definitions I entered are retained, correct?

My first question: the rigidbody, flexref, and emref modules in the workflow all have a “distance restraints” option. Should I supply my file to all of them, or is it enough to supply it to the first rigidbody module?

My second question: in HADDOCK 2.4, the recommendation was to set “Number of rigid body docking”, “Number of structures for semi-flexible refinement”, and “Number of structures for water refinement” to 10000, 200, and 200 respectively. What are the corresponding options in HADDOCK 3, or do I need to add the “seletop” module between modules in the current workflow?

I have attached my current workflow setup below for your reference. Any thoughts would be appreciated. (ambig.tbl and unambig.tbl are generated automatically with the input molecules.)

molecules = [
    'processed-A2A–unified_chain_Ab.pdb',
    'processed-A2B–fixedchain_Ag.pdb',
]

run_dir = 'output'

[topoaa]

[topoaa.mol1]

[topoaa.mol2]

[rigidbody]
sampling = 10000
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

[clustfcc]
min_population = 10

[seletopclusts]
top_cluster = 500

[caprieval]

[flexref]
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

[emref]
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

['clustfcc.2']

['seletopclusts.2']
top_cluster = 500

['caprieval.2']

[contactmap]

[seletop]

Hi DiT,

Yes, for each of these modules, you need to supply the restraints files (ambig and unambig) of interest, as you already did.

  • Yes, to go from 10000 rigidbody docking complexes to 200, you should use the “seletop” module, with the parameter “select = 200”.
  • I would place it between the rigidbody and flexref modules, instead of doing a clustering (clustfcc - seletopclusts).
  • I would place the clustering step only after the emref module, as you already did.
  • I would not terminate the workflow with seletop.
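
As a rough illustration of the points above, the workflow from the question might be rearranged as follows. This is only a sketch: it reuses the file names, module names, and parameters already shown in this thread, leaves every other parameter at its default, and the exact set of analysis modules after the final clustering is an assumption.

```toml
# Sketch only: file names are copied as they appear in the question above.
molecules = [
    'processed-A2A–unified_chain_Ab.pdb',
    'processed-A2B–fixedchain_Ag.pdb',
]

run_dir = 'output'

[topoaa]

[rigidbody]
sampling = 10000
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

# Select the top 200 rigidbody models here instead of clustering
[seletop]
select = 200

# Restraints are supplied again at each refinement stage
[flexref]
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

[emref]
ambig_fname = 'ambig.tbl'
unambig_fname = 'unambig.tbl'

# Clustering only after emref; the workflow no longer ends with seletop
[clustfcc]

[seletopclusts]
top_cluster = 500

[caprieval]

[contactmap]
```

Compared with the original setup, the first clustfcc / seletopclusts / caprieval block is replaced by a single seletop step, and the trailing seletop is dropped.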

If you have good data (e.g. your cross-links), a sampling of 1000 / 200 / 200 should be sufficient (default settings).

And you should define the restraints for each stage (rigidbody / flexref / emref).

Do realise that the haddock3 server, while online, is still in development, with very limited computational resources behind it.

I would suggest either running haddock3 locally or using the 2.4 server.


Hi VGPReys,

Thanks for the detailed clarification, I appreciate the help!

Thanks, Professor Bonvin, I will define the restraints for each stage.

Yes, I noticed that the online version of HADDOCK 3 takes an extremely long time. I will coordinate with my bioinformatics colleagues to see if we can set up a local version, or I will use the 2.4 server for now.