Buried surface area

Hi

I’m using HADDOCK for a protein/small-molecule system (with PCS and PRE restraints and tensors).

Some of my structures in it0/it1 contain
dock_76.pdb:REMARK buried surface area: -999999
and this value ends up being used to calculate the HADDOCK score.

From “print_coorheader.cns” it seems that this happens when the summed SA of the individual components is smaller than the SA of the complex:

evaluate ($saburied = $saafree - $satot)
if ($saburied < 0) then
  evaluate ($saburied = -999999)
end if
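
The same check can be restated outside CNS as a minimal Python sketch. The function and argument names are illustrative and not part of HADDOCK; only the subtraction and the -999999 sentinel come from the CNS lines above, and the areas are made-up numbers.

```python
SENTINEL = -999999  # value the CNS snippet assigns when the difference is negative


def buried_surface_area(sa_free_components, sa_complex):
    """Return summed free SA minus complex SA, or the sentinel if negative."""
    sa_free = sum(sa_free_components)  # plays the role of $saafree
    sa_buried = sa_free - sa_complex   # $saburied = $saafree - $satot
    if sa_buried < 0:
        return SENTINEL
    return sa_buried


# Made-up areas in A^2, for illustration only
print(buried_surface_area([8000.0, 400.0], 8100.0))  # 300.0
print(buried_surface_area([8000.0, 400.0], 8430.0))  # -999999
```

The second call shows how a complex SA that is only slightly larger than the summed component SA is enough to trigger the sentinel.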

However, when looking at these structures they don’t look strange: the ligand is positioned in a pocket, so I don’t see why $saburied would not be positive.

Any idea what is going wrong?

Eiso

Hi,

To answer my own question…

This seems to be caused by CNS not removing the tensors from the buried surface area calculation in print_coorheader.cns. The XAN ‘molecules’ are excluded from the SA calculation of the individual components but not from that of the complex, so the SA of the complex is artificially high if the tensors do not overlap with the protein/ligand.

Adding “or resn XAN” to the selection of molecules to be ignored seems to solve it.
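
To make the effect of that selection change concrete, here is a toy Python sketch. It is a strong simplification: real accessible surface depends on geometry, whereas this just sums fixed, made-up areas (one entry per molecule rather than per atom). The names `is_ignored` and `complex_area` are hypothetical and only mimic the “not (resn ...)” selection.

```python
# Residue names ignored in the surface calculation; "XAN" is the tensor residue name
IGNORED_BEFORE = {"WAT", "HOH", "DMS", "SHA"}
IGNORED_AFTER = IGNORED_BEFORE | {"XAN"}


def is_ignored(resn, ignored):
    # "TIP*" in the CNS selection is a wildcard, mimicked here with startswith
    return resn in ignored or resn.startswith("TIP")


def complex_area(entries, ignored):
    """Sum the accessible areas of all entries that are not ignored."""
    return sum(area for resn, area in entries if not is_ignored(resn, ignored))


# Hypothetical areas (A^2) in the complex: protein, ligand, one tensor
entries_in_complex = [("ALA", 7900.0), ("LIG", 150.0), ("XAN", 400.0)]
sa_free = 8000.0 + 400.0  # hypothetical free protein + free ligand, tensors excluded

print(sa_free - complex_area(entries_in_complex, IGNORED_BEFORE))  # -50.0
print(sa_free - complex_area(entries_in_complex, IGNORED_AFTER))   # 350.0
```

With the tensor counted only on the complex side the difference goes negative; once XAN is ignored on both sides it is positive again.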

In cases where the buried surface area is included in the HADDOCK score this will affect the scoring results, so the example run in examples/protein-protein-pcs is also (slightly) affected by this issue.

Here’s the relevant part of the corrected print_coorheader.cns; please have a look.

while ($nchain1 < $data.ncomponents) loop nloop1
  evaluate ($nchain1 = $nchain1 + 1)
  {====>} {* buried surface area and desolvation*}
  do (rmsd = 0) (all)
  surface mode=access accu=0.075 rh2o=1.4 sele=(segid $Toppar.prot_segid_$nchain1 and not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN)) end
  show sum (rmsd) (segid $Toppar.prot_segid_$nchain1 and not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
  evaluate ($saafree = $saafree + $result)
  do (store2 = rmsd * store1) (segid $Toppar.prot_segid_$nchain1 and not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
  show sum (store2) (segid $Toppar.prot_segid_$nchain1 and not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
  evaluate ($esolfree = $esolfree + $result)
end loop nloop1

do (rmsd = 0) (all)
do (store2 = 0) (all)
surface mode=access accu=0.075 rh2o=1.4 sele=(not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN)) end
show sum (rmsd) (not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
evaluate ($satot = $result)
do (store2 = rmsd * store1) (not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
show sum (store2) (not (resn WAT or resn HOH or resn TIP* or resn DMS or resn SHA or resn XAN))
evaluate ($esolcplx = $result)
evaluate ($saburied = $saafree - $satot)
if ($saburied < 0) then
  evaluate ($saburied = -999999)
end if

best,

Eiso

Strange - did you try to manually calculate the BSA (e.g. using freesasa or naccess) as a check?

This check was introduced to penalise cases where, for some reason, the molecules would not dock (which would happen mainly at it0).
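
As a rough illustration of how the sentinel acts as a penalty, the sketch below computes the BSA term of a weighted-sum score. The weight of -0.01 is an assumption for illustration only (check the scoring weights in your own run.cns), and `bsa_term` is a hypothetical helper, not a HADDOCK function.

```python
W_BSA = -0.01  # assumed BSA weight; verify against the weights in your run.cns


def bsa_term(bsa, weight=W_BSA):
    """Contribution of the buried surface area to a weighted-sum score."""
    return weight * bsa


print(f"{bsa_term(1500.0):.2f}")   # -15.00
print(f"{bsa_term(-999999):.2f}")  # 9999.99
```

With a negative weight, a normal BSA lowers the score, while the -999999 sentinel adds a very large positive term. If lower scores rank better, that pushes the model to the bottom of the ranking.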

Well spotted!

You discovered what we call an undocumented feature :)

PCS with ligands has probably not been used much in HADDOCK so far.

Will correct. Thanks!

Here’s what freesasa gives for the water refined structures in the protein-protein-pcs example.

INPUT
source  : run1/structures/it1/water/protein-protein-pcs_1w.pdb
chains  : AB 
model   : 1
atoms   : 2042

RESULTS (A^2)
Total   :   13300.01
Apolar  :    7331.78
Polar   :    5931.77
Unknown :      36.46
CHAIN A :    8250.35
CHAIN B :    5015.62
CHAIN   :      34.04

The “CHAIN : 34.04” line comes from the tensors, I guess (they don’t have a chain ID in the PDB file).

And for the same structure with the tensor/XAN residues removed from the PDB file:

INPUT
source  : run1/structures/it1/water/protein-protein-pcs_1w-noTensor.pdb
chains  : AB
model   : 1
atoms   : 2002

RESULTS (A^2)
Total   :   13303.45
Apolar  :    7338.46
Polar   :    5937.60
Unknown :      27.40
CHAIN A :    8287.82
CHAIN B :    5015.62

You can see that all results except for CHAIN B are affected by the tensors in the PDB file.
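
The differences between the two runs can be tabulated with a few lines of Python. This is just a sketch that parses the “label : value” lines copied from the two RESULTS blocks above; `parse_results` is a hypothetical helper, not part of freesasa.

```python
def parse_results(text):
    """Parse 'LABEL : value' lines of a freesasa-style RESULTS block into a dict."""
    values = {}
    for line in text.strip().splitlines():
        label, _, number = line.rpartition(":")
        values[label.strip()] = float(number)
    return values


with_tensors = parse_results("""
Total   :   13300.01
CHAIN A :    8250.35
CHAIN B :    5015.62
CHAIN   :      34.04
""")
without_tensors = parse_results("""
Total   :   13303.45
CHAIN A :    8287.82
CHAIN B :    5015.62
""")

# Difference (without tensors minus with tensors) for every shared label
for label, value in without_tensors.items():
    print(f"{label:8s} {value - with_tensors[label]:+.2f}")
```

Removing the tensors raises the CHAIN A area by 37.47 A^2 and the total by 3.44 A^2, while CHAIN B is unchanged.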

Eiso

Great!

For protein-protein docking the effect is limited because the buried surface is usually much larger. With small molecules the contribution of the tensors can make the buried surface change sign, which results in CNS setting it to -999999.

Lucky that the warning was there!

Eiso