Use of distance restraints for protein complex docking

Good morning,

I would like to use cross-linking data (DSSO, DSS and DMTMM) to guide the docking of a fairly large protein complex.
From the cross-links detected and the AlphaFold predictions, I hypothesize that one protein of the complex is present either as a homodimer or as a homotetramer. My difficulty is determining whether the intra-links detected are true intra-links within a single subunit or links connecting different subunits of the homodimer/tetramer. I was wondering whether HADDOCK can take this ambiguity into account and select the shortest-distance cross-links among all possible combinations. Since I assume the structure of the dimer/tetramer will strongly affect the binding interface with the other proteins of the complex, I am not sure how to approach the modelling. Do you think that first modelling the dimer/tetramer to “fit” the cross-links, and then providing the best structure of this homodimerizing protein as a single protein (removing the intra-links at this point), would help to dock the other interactors?

I hope my request for Expert access to HADDOCK will be approved soon, so I can dive into the world of docking and distance restraints!

Thanks in advance for your support and for developing such useful software,

Giulia

Hi Giulia,

You could use our DISVIS software for the first task. It may help to check this tutorial, in which a few crosslinks are used to determine the oligomeric state of the complex: "Modelling a homo-oligomeric complex from MS cross-links" (Bonvin Lab).

Another tutorial (Integrative modelling of the apo RNA-Polymerase-III complex from MS cross-linking and cryo-EM data – Bonvin Lab) shows how to filter out false positives from the crosslinking dataset.

While docking, don’t forget to enforce the appropriate symmetry!

As for the second question/step: yes, the dimer/tetramer conformation will affect the docking with the other proteins. I would indeed choose the best conformation (or a few of the best) obtained in the first step and use it as a single protein, converting it into a single chain and renumbering the amino acids.
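The "single chain plus renumbering" step can be sketched in a few lines of Python. This is only a minimal illustration working on the fixed-column ATOM/HETATM records of a PDB file; the function name `merge_chains` is made up for this example, and dedicated PDB-editing tools may well be more convenient in practice.

```python
def merge_chains(pdb_lines, new_chain="A"):
    """Relabel all ATOM/HETATM records to one chain with continuous residue numbers.

    Hypothetical helper: other records (TER, END, ...) are dropped and the
    segid columns (73-76) are left untouched.
    """
    merged = []
    new_resnum = 0
    prev_key = None
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # A residue is identified by chain ID (column 22) plus residue
        # number and insertion code (columns 23-27).
        key = (line[21], line[22:27])
        if key != prev_key:
            new_resnum += 1
            prev_key = key
        # Rewrite chain ID and residue number, blank the insertion code.
        merged.append(f"{line[:21]}{new_chain}{new_resnum:>4d} {line[27:]}")
    return merged


if __name__ == "__main__":
    import sys

    # Usage: python merge_chains.py dimer.pdb > dimer_single_chain.pdb
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(merge_chains(handle))
    print("END")
```

Note that the cross-link residue numbers for this protein then have to be translated to the new numbering before they are used as restraints in the second docking step.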

Hope it helps,
Marco

To add to Marco’s answer, you can indeed define ambiguous cross-links that allow both intra- and intermolecular distances to be considered.

E.g., if you have two chains (A and B), a distance restraint would look like:

assi (resid XX and name CA and segid A) (resid YY and name CA and (segid A or segid B)) 23.0 23.0 0.0

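Here XX and YY are the residue numbers. The three values are the target distance and the allowed deviations below and above it, so this restraint accepts any CA-CA distance up to 23 Å.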
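If you have many cross-links, writing these lines by hand gets tedious. Below is a minimal Python sketch that produces restraints in the same form as the example above; the function name, its arguments and the residue numbers in the usage example are made up for illustration, and the 23 Å default is simply the value used above, so adjust it per cross-linker.

```python
def ambiguous_crosslink_restraint(res_i, res_j, chains, anchor_chain="A",
                                  max_dist=23.0, atom="CA"):
    """Build one ambiguous restraint line in the format shown above.

    The first residue is fixed on `anchor_chain`; the second may sit on any
    of `chains`, which covers both the intra- and inter-subunit cases.
    """
    partners = " or ".join(f"segid {chain}" for chain in chains)
    return (
        f"assi (resid {res_i} and name {atom} and segid {anchor_chain}) "
        f"(resid {res_j} and name {atom} and ({partners})) "
        f"{max_dist:.1f} {max_dist:.1f} 0.0"
    )


if __name__ == "__main__":
    # Hypothetical cross-linked residue pairs of the homo-oligomer.
    crosslinks = [(12, 87), (45, 130)]
    # For a tetramer, use ["A", "B", "C", "D"] instead.
    for res_i, res_j in crosslinks:
        print(ambiguous_crosslink_restraint(res_i, res_j, chains=["A", "B"]))
```

The printed lines can be collected into a restraints file; for a tetramer the same call with four chain IDs lets the second residue fall on any subunit.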

But if you only have such ambiguous crosslinks, you will have to turn on the center-of-mass restraints option to make sure you get compact solutions.