Protein-peptide docking

Francesca_Cantini · June 21, 2016, 12:56pm

Dear HADDOCK users,

I’m using the restraint-driven docking approach to predict how a 40-residue peptide binds a protein. Since in all my calculations HADDOCK clustered only 50 structures in 10 clusters, I have increased the number of models sampled during the calculation to 5000/400/400, as suggested by Prof. Bonvin.

My question is: how can I also change the number of analysed structures? In the directory /structures/it1/water/ I do obtain 400 structures, but in the “analysis” directory only 200 were analysed.

thanks
Francesca

amjjbonvin · June 21, 2016, 1:19pm

In the guru interface, you can specify the number of models to analyse in the analysis parameter menu.

Francesca_Cantini · June 27, 2016, 1:22pm

Thanks

I would also like to ask something about the clustering.
I have analysed the resulting clusters manually, and I have also tried to cluster the solutions myself with the “cluster_struc” program, with and without the “full linkage” option. In all cases I get some strange results, but I do not understand my error.
When I inspect these clusters in Chimera and overlay the PDB files within the same cluster, I see that the models have different relative orientations, while two more similar complex structures may have been put by the program in two different clusters.
I’m sure that I’m doing something wrong in the clustering, but I do not know what!

thanks Francesca Cantini

amjjbonvin · June 27, 2016, 2:06pm

Dear Francesca

First of all check which clustering option was selected in the server. The default might well be FCC, which means you now have to cluster with a different command and cutoff:

	$HADDOCKTOOLS/cluster_fcc.py 

Usage: cluster_fcc.py <matrix file> <threshold [float]> [options]

Options:
  -h, --help            show this help message and exit
  -o OUTPUT_HANDLE, --output=OUTPUT_HANDLE
                        Output File [STDOUT]
  -c CLUS_SIZE, --cluster-size=CLUS_SIZE
                        Minimum number of elements in a cluster [4]
  -s STRICTNESS, --strictness=STRICTNESS
                        Multiplier for cutoff for M->R inclusion threshold.
                        [0.75 or effective cutoff of 0.5625]

The cutoff should be between 0 and 1; the higher the value, the more stringent the similarity requirement (default is 0.75).
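To make the cutoff and strictness values above more concrete, here is a minimal Python sketch of the idea behind a fraction-of-common-contacts comparison. It is not the code of cluster_fcc.py: the contact sets, the function names and the neighbour rule (one direction must pass the cutoff, the reverse direction the cutoff multiplied by the strictness) are assumptions made for illustration, based only on the help text quoted above.

```python
# Illustrative sketch only; this is not the code of cluster_fcc.py.

def fraction_common_contacts(contacts_a, contacts_b):
    """Fraction of the contacts of model A that are also found in model B."""
    if not contacts_a:
        return 0.0
    return len(contacts_a & contacts_b) / len(contacts_a)


def are_neighbours(fcc_ab, fcc_ba, cutoff=0.75, strictness=0.75):
    """Assumed rule: one direction passes the cutoff, the reverse passes
    the relaxed cutoff (cutoff * strictness)."""
    relaxed = cutoff * strictness
    return (fcc_ab >= cutoff and fcc_ba >= relaxed) or (
        fcc_ba >= cutoff and fcc_ab >= relaxed
    )


# Hypothetical contacts as (protein residue, peptide residue) pairs.
model_1 = {(10, 3), (11, 3), (14, 7), (18, 9)}
model_2 = {(10, 3), (11, 3), (14, 7), (22, 12), (25, 15)}

fcc_12 = fraction_common_contacts(model_1, model_2)  # 3 shared out of 4
fcc_21 = fraction_common_contacts(model_2, model_1)  # 3 shared out of 5

print(fcc_12, fcc_21)
print(0.75 * 0.75)  # effective cutoff quoted in the help text: 0.5625
print(are_neighbours(fcc_12, fcc_21))
```

Note that the measure is not symmetric: model_1 shares three of its four contacts with model_2, while model_2 shares only three of its five with model_1, which is presumably why the help text lists a separate, relaxed threshold for the reverse direction.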

Francesca_Cantini · June 29, 2016, 1:18pm

Thanks, but in the $HADDOCKTOOLS directory I have only the cluster_struc program, not the FCC one, I suppose because I’m using HADDOCK 2.1 instead of 2.2.

Which algorithm do you suggest for a peptide-protein complex? I was thinking that RMSD-based clustering would be better in my case than the FCC method.

amjjbonvin · June 29, 2016, 1:38pm

Are you using the local version of HADDOCK? Did you use the server or not? And which version?
If you use the 2.2 version of the server, you will also have to use the local 2.2 version of HADDOCK to perform a manual cluster analysis.

Francesca_Cantini · September 30, 2016, 8:38am

OK. Now I’m always using version 2.2 of HADDOCK, both on the server and locally for manual cluster analysis.
Which algorithm do you suggest for a peptide-protein complex? Perhaps RMSD-based clustering is better in this case than the FCC method?

I always obtain a high number of clusters. Reading the literature, it seems to me that this is usual for protein-peptide docking, but I do not have experience.
Moreover, the cluster containing the complex model that best fulfils the experimental data (such as residue mutations) does not follow the CAPRI criteria, by which a docking model with i-RMSD or l-RMSD below 1 Å can be considered a high-accuracy prediction, and a medium-quality prediction has an i-RMSD below 2 Å and/or an l-RMSD below 5 Å. Indeed, this cluster never contains the model with the lowest HADDOCK score.

amjjbonvin · September 30, 2016, 9:05am

Hi Francesca

For smaller molecules, RMSD might indeed be a better option for clustering. But do reduce the cutoff (e.g. 5 or even 2.5 Å instead of the 7.5 Å default).
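The effect of the cutoff can be seen with a small Python sketch. This is a generic cutoff-based scheme (pick the model with the most neighbours within the cutoff, make a cluster of it and its neighbours, repeat on the remainder) and is not necessarily the algorithm implemented in cluster_struc. The function name and the pairwise RMSD values are hypothetical.

```python
# Generic cutoff-based clustering sketch; not the cluster_struc algorithm.

def cluster_by_cutoff(rmsd, cutoff):
    """Greedy clustering of models from a symmetric pairwise RMSD matrix."""
    unassigned = set(range(len(rmsd)))
    clusters = []
    while unassigned:
        # For every remaining model, collect the remaining models within
        # the cutoff (each model is its own neighbour, RMSD 0).
        neighbours = {
            i: {j for j in unassigned if rmsd[i][j] <= cutoff}
            for i in unassigned
        }
        # The model with the most neighbours becomes the cluster centre;
        # ties are resolved in favour of the lowest model index.
        centre = max(sorted(neighbours), key=lambda i: len(neighbours[i]))
        clusters.append(sorted(neighbours[centre]))
        unassigned -= neighbours[centre]
    return clusters


# Hypothetical pairwise RMSD values (in Angstrom) for five models.
rmsd = [
    [0.0, 1.5, 2.0, 6.0, 6.5],
    [1.5, 0.0, 1.8, 5.5, 6.2],
    [2.0, 1.8, 0.0, 4.8, 7.0],
    [6.0, 5.5, 4.8, 0.0, 2.2],
    [6.5, 6.2, 7.0, 2.2, 0.0],
]

for cutoff in (7.5, 5.0, 2.5):
    print(cutoff, cluster_by_cutoff(rmsd, cutoff))
```

With these made-up numbers, a 7.5 Å cutoff lumps all five models into one cluster, while 2.5 Å separates them into two groups of similar models. For a short peptide, a large cutoff can therefore merge binding modes that are visibly different.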

As for your CAPRI criteria, do you mean you know the answer and the best HADDOCK score cluster is not the closest to the real structure?

If you do not know the answer, then there is no way you can use an RMSD criterion to select your cluster.

Francesca_Cantini · September 30, 2016, 9:46am

I do not know the answer, but I know some key residues, and I know for example that the peptide and one helix of the protein interact N-term to N-term and C-term to C-term.

amjjbonvin · September 30, 2016, 10:08am

Then you cannot use any RMSD criterion to decide which cluster is better.
Simply analyse/visualize the top-ranked clusters and see how they fit/explain your experimental data.

Francesca_Cantini · September 30, 2016, 10:46am

Thanks for your reply

Francesca_Cantini · October 3, 2016, 9:51am

Dear all

I would like to analyse the fraction of intermolecular contacts within each cluster, not only hydrophobic contacts. Perhaps this is already done by the program? In the analysis directory there is a file named nbcontacts.disp; I guess this file contains all intermolecular hydrophobic contacts. Is it possible to analyse the contacts within each cluster?

thanks

amjjbonvin · October 3, 2016, 12:23pm

Hi Francesca

The server returns two relevant files for your analysis:

ana_hbonds.lis : gives stats of intermolecular hydrogen bonds
ana_nbcontacts.lis : gives stats of intermolecular non-bonded contacts

To obtain the same analysis per cluster, you would have to run the analysis locally after downloading the run. This means editing the run.cns file to define the correct directories and CNS executable. Then follow the instructions from:

http://www.bonvinlab.org/software/haddock2.2/analysis/#reanal

You can also use the contact or contact-chainID tools to list all contacts within a given distance cutoff for a given model, e.g.:

$HADDOCKTOOLS/contact cluster1_1.pdb 3.9
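If the compiled tools are not available, the same idea (list atom pairs from different molecules that lie within a distance cutoff) can be sketched in a few lines of Python. This is only a rough illustration and does not reproduce the output format of the contact tool. The function names, the `pdb_line` demo helper and the four-atom test structure are hypothetical, and the parser assumes standard fixed-column PDB ATOM records, falling back to the segment ID when the chain ID column is empty.

```python
import math


def parse_atoms(pdb_text):
    """Read ATOM/HETATM records from PDB-formatted text (fixed columns)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Use the chain ID; fall back to the segment ID if it is blank.
            chain = line[21].strip() or line[72:76].strip()
            atoms.append({
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "chain": chain,
                "resid": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms


def intermolecular_contacts(atoms, cutoff):
    """Atom pairs from different chains separated by at most `cutoff`."""
    contacts = []
    for i, a in enumerate(atoms):
        for b in atoms[i + 1:]:
            if a["chain"] == b["chain"]:
                continue
            distance = math.dist(a["xyz"], b["xyz"])
            if distance <= cutoff:
                contacts.append((a["chain"], a["resid"], a["name"],
                                 b["chain"], b["resid"], b["name"],
                                 round(distance, 2)))
    return contacts


def pdb_line(serial, name, resname, chain, resid, x, y, z):
    """Demo helper: format one ATOM record with the standard column layout."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resid:>4}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical four-atom structure: chain A is the protein, B the peptide.
demo_pdb = "\n".join([
    pdb_line(1, "CA", "ALA", "A", 10, 0.0, 0.0, 0.0),
    pdb_line(2, "CB", "LEU", "A", 11, 3.0, 0.0, 0.0),
    pdb_line(3, "CB", "VAL", "B", 3, 0.0, 3.5, 0.0),
    pdb_line(4, "CA", "GLY", "B", 7, 10.0, 10.0, 10.0),
])

for contact in intermolecular_contacts(parse_atoms(demo_pdb), 3.9):
    print(contact)
```

For a real model, the text of a file such as cluster1_1.pdb would be read and passed to `parse_atoms` in place of `demo_pdb`. Running this over all members of a cluster and counting how often each residue pair appears gives a simple per-cluster contact frequency.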

Lamp · November 24, 2023, 5:36pm

Hi, @amjjbonvin

I am using the server and I am trying to find the number of hydrogen bonds. I have downloaded the complete run and I can see the “ana_hbonds.csh” and “count_hbonds.awk” files in the tools folder. The page you linked in this comment for the hbond stats (Bonvin Lab) says to run these scripts manually by copying them from the tools directory into the analysis directory; however, I do not see the analysis directory. Additionally, should I be copying " ./ana_hbonds.csh hbonds.disp" into the terminal (although I don’t have the hbonds.disp file)?
I am working on a Mac; do I need a separate program to run these scripts?
I don’t have much knowledge on how to “run scripts” so I would really appreciate your guidance.

amjjbonvin · November 26, 2023, 12:39pm

The simplest would be to use some third-party software to analyse the hydrogen bonds.

Running the scripts you mentioned would analyse all models.

There is plenty of software for this. An old program that still does a good job is dimplot (part of ligplot).

An example of a more recent tool is Arpeggio, described in "Arpeggio: A Web Server for Calculating and Visualising Interatomic Interactions in Protein Structures".
