Scoring interface histidine

Hi
quick question: how do I deal with an HIE residue in my protein when performing the refinement stage? The web server does not recognize HIE as HISE. Any help with this?
Then, when submitting an ensemble of structures, does the HADDOCK refinement stage match the corresponding protein-peptide partners? I mean, protein.pdb and peptide.pdb will each contain 20 structures, which come from 20 complexes.
Best
and

Hello Andrea,

Using the expert (or higher-level) interface, you can specify the protonation state of your proteins. In “First/Second molecule” >> “Histidine protonation states”, you can uncheck “Automatically guess histidine protonation states using molprobity” and provide a residue id together with a protonation state (“HISD”, “HISE”, “HIS+”) for each histidine or a selection of them.

I’m not sure I understood your second question, but you can find out which model comes from which combination of ensemble members by backtracking the refined models with PDBtraceback.py, found in the tools/ directory of your HADDOCK job directory.
As a side note, if you uncheck “Perform cross-docking” in “Advanced Sampling Parameters” of the guru interface, HADDOCK will dock structure 1 of ensemble A (here protein.pdb) with structure 1 of ensemble B (here peptide.pdb).

I hope I answered your questions; do not hesitate to ask again if this is not the case.

Best regards,

Mikael

Hi Mikael,
indeed I was expecting these options in the “refinement interface”, but they are not there. My goal is to score my protein-peptide complexes using the HADDOCK score.

best

quick question: how do I deal with an HIE residue in my protein when performing the refinement stage? The web server does not recognize HIE as HISE. Any help with this?

The server only accepts HIS and will automatically define the protonation state using molprobity.
If you want to define those manually, you would have to use the guru interface and change all settings to those used by the refinement server (save a parameter file and look at the changes compared to a default docking using the easy interface).
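
Since the server only accepts HIS, the input file has to be renamed first. Here is a minimal sketch of how that could be done while keeping track of the states to enter by hand in the guru interface. It assumes Amber-style residue names (HIE, HID, HIP) and standard fixed-column PDB ATOM records; the function name and the mapping table are illustrative and not part of HADDOCK.

```python
# Assumed mapping from Amber-style histidine names to the protonation
# labels offered by the HADDOCK interface ("HISD", "HISE", "HIS+").
AMBER_TO_HADDOCK = {"HIE": "HISE", "HID": "HISD", "HIP": "HIS+"}


def normalize_histidines(pdb_lines):
    """Rename Amber histidines to HIS and collect their protonation states.

    Returns the rewritten lines and a dict keyed by (chain, residue number).
    """
    fixed, states = [], {}
    for line in pdb_lines:
        resname = line[17:20]  # PDB columns 18-20: residue name
        if line.startswith(("ATOM", "HETATM")) and resname in AMBER_TO_HADDOCK:
            chain = line[21]           # PDB column 22: chain identifier
            resid = int(line[22:26])   # PDB columns 23-26: residue number
            states[(chain, resid)] = AMBER_TO_HADDOCK[resname]
            line = line[:17] + "HIS" + line[20:]
        fixed.append(line)
    return fixed, states


# Hypothetical two-atom example, only to show the behaviour.
example = [
    "ATOM      1  N   HIE A  42      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  ALA A  43      12.560   6.200  -6.300  1.00  0.00           C",
]
fixed, states = normalize_histidines(example)
print(fixed[0][17:20], states)
```

The returned dictionary lists which residue numbers to enter, with which state, under “Histidine protonation states”.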

Then, when submitting an ensemble of structures, does the HADDOCK refinement stage match the corresponding protein-peptide partners? I mean, protein.pdb and peptide.pdb will each contain 20 structures, which come from 20 complexes.

Yes - the matching is one to one in the refinement interface, e.g. model 10 in your protein ensemble will be combined with model 10 in your peptide ensemble.


Thanks Alexandre! I will do it then.
best
andrea

Oh my bad, I did not understand you were using the refinement interface.

To get access to these options, I would suggest using the guru interface and changing a few parameters so that only refinement is performed:
In “Distance Restraints”:
“Define center of mass restraints to enforce contact between the molecules” > True
“Define surface contact restraints to enforce contact between the molecules” > True
In “Sampling parameters”:
“Number of structures for rigid body docking (it0)” > same as it1 and water; in your case, if you want 10 models per complex, I’d suggest using 200
“Sample 180 degrees rotated solutions during rigid body EM” > False
In “Advanced Sampling Parameters”:
“Perform cross-docking” > False
“Multiply the number of calculated structures by all combinations” > True
“Randomize starting orientations” > False
“Perform initial rigid body minimisation” > False
“Allow translation in rigid body minimisation” > False
“number of MD steps …” (4 parameters) > 0 for all 4 values (defaults are 500/500/1000/1000 respectively)

And of course the histidine protonation states to be provided in the “First/Second molecule” sections.

Cheers,
Mikael


Probably I was not clear … :wink:
andrea

Hi Alexandre,
I have just submitted a job in “refinement interface”, an ensemble of 20 complexes (20 proteins and 20 peptides). I just checked the run and there are 400 complexes running in the refinement … (20x20). I was expecting only 20, 1 vs 1, 2 vs 2, etc …
what is wrong?
and

Nothing is wrong: each combination of starting models is refined 20 times (no cross-combinations!), so 20 pairs give 400 models.

I.e. if you had 10 models in your ensemble, you would end up with 200 models.

Download the param file and check that crossdocking is set to false.
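
To make the counting explicit, here is a small sketch of the arithmetic. It only mirrors the numbers given in this thread (one-to-one pairing, 20 refinements per pair); the function name is made up for illustration and this is not HADDOCK code.

```python
from itertools import product


def starting_pairs(n_a, n_b, crossdock):
    """Enumerate (model of A, model of B) pairs, numbered from 1.

    With cross-docking every model of A meets every model of B; without it,
    model i of A is only paired with model i of B.
    """
    models_a = range(1, n_a + 1)
    models_b = range(1, n_b + 1)
    if crossdock:
        return list(product(models_a, models_b))
    return list(zip(models_a, models_b))


# Value taken from the reply above: each pair is refined 20 times.
REFINEMENTS_PER_PAIR = 20

pairs = starting_pairs(20, 20, crossdock=False)
print(len(pairs), len(pairs) * REFINEMENTS_PER_PAIR)  # 20 400
```

So 400 structures in the run does not mean 20x20 cross-combinations; it is 20 one-to-one pairs, each refined 20 times.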

I see. crossdock is indeed false.

Okay, the job has finished and I ended up with 11 clusters. However, my goal is to get a single HADDOCK score for each of the 20 models, i.e. a single-point score. Probably the best and fastest way is to submit the models one at a time.

If you download the archive, all the info is there…
You only have to figure out the mapping between your input models and the water-refined models.

There is a script that will tell you that. It is in the tools directory: PDBtraceback.py

The only thing is that it expects the run directory name to start with run1.
So rename the directory and then call the script from within the run1 directory:

./tools/PDBtraceback.py

It will create a file called traceback.list containing the info you want, e.g.:

Structure traceback information for run /home/abonvin/haddock2.2/examples/protein-protein/run1
Date/time: Wed Feb  8 23:16:18 2017
Number of structures: 1000 in it0, 200 in it1 and 200 in water refinement
Sorting order: water(struct. nr.) matches it1 (struct. nr.) matches it0 (struct. nr.) matches input structures.
*****************************************************************************************************************
      complex                             it0      hscoreit0       it1      hscoreit1      water      hscorew
BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb         3       -38.3082        65      -112.7621        65      -156.8870
BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb       773       -37.8647        72       -98.3804        72      -155.5624
BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb       173       -37.5500        79      -110.3051        79      -153.7425
...
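
If you want one number per input complex, something like the following could pull the best water-refined score for each starting pair out of traceback.list. This is only a sketch: it assumes the seven-column layout shown in the example above and file names without spaces, and the function name is my own.

```python
def best_water_score_per_complex(lines):
    """Return {complex: (water structure nr., best water HADDOCK score)}.

    Expects rows laid out as in the example above:
    complex  it0  hscoreit0  it1  hscoreit1  water  hscorew
    """
    best = {}
    for line in lines:
        fields = line.split()
        # Skip the header, separator and comment lines.
        if len(fields) != 7 or not fields[0].startswith("BEGIN:"):
            continue
        complex_id = fields[0]
        water_nr, water_score = int(fields[5]), float(fields[6])
        # Lower (more negative) HADDOCK scores are better.
        if complex_id not in best or water_score < best[complex_id][1]:
            best[complex_id] = (water_nr, water_score)
    return best


# The three data rows shown above, plus the column header.
sample = [
    "      complex                             it0      hscoreit0       it1      hscoreit1      water      hscorew",
    "BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb         3       -38.3082        65      -112.7621        65      -156.8870",
    "BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb       773       -37.8647        72       -98.3804        72      -155.5624",
    "BEGIN:e2aP_1F3G_1.pdb/BEGIN:hpr_1_3.pdb       173       -37.5500        79      -110.3051        79      -153.7425",
]
for complex_id, (nr, score) in best_water_score_per_complex(sample).items():
    print(complex_id, nr, score)
```

On a real run you would pass `open("traceback.list")` instead of the sample list. Taking the average over the refined models of each pair instead of the best one is an equally simple change.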

:slight_smile:


great!!! I will try to get the score then.

I renamed run2 to run1, then ran the Python script as you suggested:
--> Starting PDB traceback process
Traceback (most recent call last):
  File "./tools/PDBtraceback.py", line 530, in <module>
    PluginCore(option_dict, inputlist=option_dict['input'])
  File "./tools/PDBtraceback.py", line 49, in PluginCore
    traceback.GetIt0Structures()
  File "./tools/PDBtraceback.py", line 392, in GetIt0Structures
    self.fileit0_list = self._SortonIndex(inlist=self.fileit0_list,indexlist=self.fileit1_list,index=True) #Sort according to index it1
  File "./tools/PDBtraceback.py", line 231, in _SortonIndex
    tmp1.append(inlist[int(n)-1])
IndexError: list index out of range

Which python version are you using?
Works for me with 2.6 and 2.7

version is 2.7.12 on ubuntu 16.04

As a first approximation, could the score from the refinement interface be considered as expressed in kcal/mol?

Kind of… but I would rather say a.u., i.e. arbitrary units, since we have shown there is little correlation between docking scores and binding affinity…

I just thought that in this case Eair = 0, so the final score would be in kcal/mol. Am I wrong?

In the water refinement score, we still have Edesolv (a.u.) and a weighted distribution of Evdw and Eelec. This scheme works well for our scoring purpose but cannot be interpreted as binding affinity expressed in kcal/mol
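
To illustrate why Eair = 0 does not turn the score into an energy, here is a rough sketch of the weighted sum. The weights are an assumption on my side (the defaults commonly quoted for the HADDOCK 2.x water stage: 1.0 Evdw, 0.2 Eelec, 1.0 Edesolv, 0.1 Eair); check the parameter file of your own run for the values actually used. The energy values in the example are made up.

```python
# Assumed default water-stage weights; verify against your own run parameters.
WATER_WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "air": 0.1}


def haddock_water_score(evdw, eelec, edesolv, eair=0.0, weights=WATER_WEIGHTS):
    """Weighted sum of the energy terms; the result is in arbitrary units."""
    return (
        weights["vdw"] * evdw
        + weights["elec"] * eelec
        + weights["desolv"] * edesolv
        + weights["air"] * eair
    )


# Made-up energies, only to show that the terms are rescaled before summing.
score = haddock_water_score(evdw=-40.0, eelec=-200.0, edesolv=10.0)
plain_sum = -40.0 + -200.0 + 10.0
print(score, plain_sum)
```

Even without restraints, the electrostatic term is scaled down and the empirical desolvation term is added, so the total differs from a plain sum of energies.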