From what I’ve read: when active residues are provided in an AIR, 50% of them are discarded for each run (by default, but this option can be turned off).
Does this mean that each generated structure is forced to have all of the active residues used for that run in the interface, or are some of them optional? I’m wondering if this is a “hard” docking, meaning that if not all of the active residues can be used, the structure is discarded. Or does it do something like use at least one residue?
By default, 50% of the AIRs are randomly deleted for each docking model (but for a given model the same AIRs are used throughout the various stages of HADDOCK). This means that a given model does not have to fulfil all the defined AIRs.
And this will of course only work provided there are at least two residues defined as active.
How does providing residues then affect the random generation of structures?
I think you are mixing a few things here. There are different steps in HADDOCK:
- the first one is a randomisation of the orientations of molecules prior to the real docking - AIRs play no role in that
- then the initial docking takes place by rigid-body energy minimisation, which is guided by the AIRs. This is where models will be generated with different subsets of the restraints if the 50% random removal is turned on, i.e. different restraints will be active for different models during this stage.
- for the subsequent flexible refinement, for each model, the same subset of AIRs as in it0 (rigid body docking) is used.
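To make the steps above concrete, here is a small toy sketch in Python. It is not HADDOCK's actual implementation: the function name `select_airs`, the residue labels, and the seeding scheme are all made up for illustration. It only shows the idea that each model gets its own random subset of the AIRs, and that this subset is then fixed for that model across the stages.

```python
import random

def select_airs(airs, model_id, fraction_removed=0.5, base_seed=917):
    """Toy model of random AIR removal: return the AIRs kept for one model.

    Hypothetical helper; the per-model seed only makes the choice
    reproducible in this sketch.
    """
    rng = random.Random(base_seed * 100003 + model_id)
    n_remove = int(len(airs) * fraction_removed)
    removed = set(rng.sample(range(len(airs)), n_remove))
    return [air for i, air in enumerate(airs) if i not in removed]

# Hypothetical AIRs, labelled by the active residue that defines each one.
airs = ["A:10", "A:25", "A:31", "B:7", "B:48", "B:52"]

# The subset is chosen once per model ...
subsets = {model_id: select_airs(airs, model_id) for model_id in range(3)}

# ... and the same subset is then used in every stage for that model.
for stage in ("it0", "it1", "water"):
    for model_id, kept in subsets.items():
        print(stage, model_id, kept)
```

In this toy version a single AIR is never removed (half of one rounds down to zero), so the random removal only has an effect once several restraints are defined.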
I’m still a little unclear on things:
If I want to provide a list of residues (ab-initio), which could be anywhere on the protein (nowhere near each other), would I get the best results doing a single run for 2 residues or can I just provide the entire list for one run? Also, would they need to be defined as passive or active?
If you want to perform an ab-initio docking run with HADDOCK, you have two options (if no information is available for either protein):
1. Use the center-of-mass restraints (expert/guru interface -> distance restraints menu).
2. Use the random patches option, which randomly selects a solvent-accessible residue on each protein and defines a patch around it. The residues in those patches become the active residues for that model (expert/guru interface -> distance restraints menu).
Option 2 is probably better if your molecules are highly anisotropic in shape. In both cases, do increase the sampling to, say, 10000 models for it0 and 400 for it1 and water.
In case you do have info for one molecule but not for the other, a better strategy is to define on the second molecule all solvent accessible residues as passive.
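As a rough illustration of that last strategy, the sketch below builds a passive residue list from per-residue relative solvent accessibilities. The accessibility values, the function name `passive_from_accessibility`, and the 40% cutoff are assumptions for the example; use the values from your own accessibility calculation and whatever threshold your protocol recommends.

```python
def passive_from_accessibility(rel_accessibility, cutoff=40.0):
    """Return residue numbers whose relative accessibility (%) is >= cutoff.

    The cutoff is an assumed value for this sketch, not a fixed rule.
    """
    return sorted(res for res, acc in rel_accessibility.items() if acc >= cutoff)

# Hypothetical relative accessibilities (%) for the second molecule,
# keyed by residue number.
rel_acc = {1: 72.5, 2: 12.0, 3: 40.0, 4: 3.1, 5: 55.8}

passive = passive_from_accessibility(rel_acc)

# Comma-separated list, convenient for pasting into a residue field.
print(",".join(str(r) for r in passive))
```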
Where (in which directory) should I save the generated AIR restraints file in CNS format?
Also, is it acceptable to change the file name in the save dialog to xyz.cns, since it automatically suggests xyz.cgi?
As long as it is saved as a plain-text ASCII file, it does not matter where and under which name you save it. But it is not a CGI script, so .txt might make more sense.
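If you ever need to produce such a file yourself, a minimal sketch could look like the following. The restraint layout mimics the `assign ... or ...` pattern typically seen in HADDOCK-generated AIR files, but treat it as an assumption and compare it against a file generated by the server; the residue numbers, segids, and file name are made up. The point is only that the result is plain ASCII text that can be written to any directory under any name.

```python
import tempfile
from pathlib import Path

def format_air(active_resid, active_segid, partner_resids, partner_segid,
               distance=2.0, d_minus=2.0, d_plus=0.0):
    """Format one ambiguous restraint as CNS-style text (assumed layout)."""
    partner = "\n     or\n".join(
        f"        (resid {r} and segid {partner_segid})" for r in partner_resids
    )
    return (
        f"assign (resid {active_resid} and segid {active_segid})\n"
        f"       (\n{partner}\n       ) {distance:.1f} {d_minus:.1f} {d_plus:.1f}\n"
    )

# Hypothetical restraint: residue 10 of segid A to residues 20 or 21 of segid B.
text = format_air(10, "A", [20, 21], "B")
print(text)

# Any directory and any file name will do, as long as the content is ASCII.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "air_restraints.txt"
    out.write_text(text, encoding="ascii")
    saved = out.read_text(encoding="ascii")
```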