HADDOCK multi-chain problem

Hello everyone,
I’m docking HLA proteins with several peptides. The HLA protein has two chains (A and B). I read the local tutorial, including the “Dealing with multi-chain proteins” part, and followed it. I used the PDB structures 4g9d and 1jgd. First, I used pdb_selaltloc.py to get rid of double occupancies. Then I deleted the waters with the pdb_delhetatm script. Each structure has 3 chains, including the peptide, so I used pdb_selchain to separate the chains. I shifted the numbering of chain B in both structures. Next, I merged chains A and B with the cat command and used pdb_seg.py and pdb_chain.py to delete the chain and segment names. Lastly, I used the restrain_bodies script to define distance restraints and added the segid as in the tutorial.
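For readers following along, the merging step described above can be sketched in plain Python. This is only an illustration of the idea (shift the residue numbers of one chain so that they do not collide with the other, then write everything under a single chain ID). It is not the pdb-tools scripts themselves, and the function names and the offset of 500 are assumptions made for the example.

```python
def merge_as_single_chain(pdb_lines, offsets, new_chain="A"):
    """Merge selected chains into one chain with shifted residue numbers.

    offsets maps an original chain ID to the shift applied to its residue
    numbers, e.g. {"A": 0, "B": 500} (hypothetical values). Chains that are
    not listed in offsets (for example the bound peptide) are dropped.
    """
    merged = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain = line[21]                      # PDB column 22: chain ID
        if chain not in offsets:
            continue
        resnum = int(line[22:26]) + offsets[chain]  # columns 23-26: residue number
        if not -999 <= resnum <= 9999:
            raise ValueError(f"residue number {resnum} does not fit in 4 columns")
        merged.append(line[:21] + new_chain + f"{resnum:4d}" + line[26:])
    return merged


def overlapping_residue_numbers(pdb_lines, offsets):
    """Return residue numbers that two different chains would share after shifting."""
    owner = {}
    clashes = set()
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain = line[21]
        if chain not in offsets:
            continue
        resnum = int(line[22:26]) + offsets[chain]
        if owner.setdefault(resnum, chain) != chain:
            clashes.add(resnum)
    return clashes
```

Running overlapping_residue_numbers first is a cheap way to confirm that the chosen offset really separates the two numbering ranges before the chains are written into one file.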
For docking, molecule 1 was the cleaned 4g9d structure and molecule 2 was a peptide. On the first page I changed only the kind of molecule 2; the rest was left at the defaults. On the active and passive residue page, I chose the binding groove residues as active residues for molecule 1 and made these binding site residues fully flexible. The peptide residues were passive and the peptide was also fully flexible. On the third page, I added the unambiguous restraint file and changed the sampling parameters: the numbers of structures for rigid body docking, semi-flexible refinement, and final refinement were increased to 2000, 400, and 400, respectively. The number of models generated was set to 400. I switched on the “Refine with short molecular dynamics in explicit solvent?” option.
I used three peptides for each HLA protein. One of them is the HIV peptide from the 4g9d PDB file; the others were modelled with PEP-FOLD. The results were not what I wanted: the two chains separated after docking in every run except the HLAB2705-hiv and HLAB2709-hiv dockings. I want to keep them together, which is why I added the unambiguous restraint file. Can you help me? How can I fix this separation?

Are you entering your two HLA chains as one molecule in the server?
It should automatically define distance restraints to keep them together.

Did you check that the starting structure looks OK after all your preprocessing?

Also, can you define what you mean by separation?

The fact that some work and some do not is also strange. If there were bad clashes between the two chains, this could also explain why they separate.

Yes, I am, since I want to dock the HLA protein (chain A and chain B together) with a peptide. The local tutorial says that an unambiguous restraint file should be used to keep them together. That is why I used the restrain_bodies script.

I checked the starting structure. If I understand you correctly, that means complex_1.pdb in the begin folder of the results.

This picture shows the HLA protein at the beginning.

I’m adding some screenshots from the result page to show what I mean by separation. I want my results to look like the normal one but, as I said before, most of the results look like the separated one.
[Screenshots: normal, normal_2]

Did you mean that the separated ones have a lower HADDOCK score?

Are you running locally or via the server?

The server should identify automatically the disconnected parts.

I see no reason why it should work in one case and not in another.

Did you generate unambiguous restraints for each complex separately?

And no full flexibility defined?

I’m using the web server.
Is there a problem with my parameters or job line?
These are the two results I uploaded as pictures: https://wenmr.science.uu.nl/haddock2.4/run/8330886674/75093-hla09-hiv-flexible-unambig-1
https://wenmr.science.uu.nl/haddock2.4/run/8330886674/75176-hla05-ARGQPGVMG-flexible-unambig-1

Yes, I generated them with restrain_bodies for HLA-B*27:05 and HLA-B*27:09 separately.
These are the contents of my unambiguous restraint files:

HLA-B*27:05 (4g9d):
assign (segid A and resi 268 and name CA) (segid B and resi 512 and name CA) 26.217 0.0 0.0
assign (segid A and resi 264 and name CA) (segid B and resi 584 and name CA) 36.261 0.0 0.0

HLA-B*27:09 (1jgd):
assign (segid A and resi 268 and name CA) (segid B and resi 511 and name CA) 27.670 0.0 0.0
assign (segid A and resi 264 and name CA) (segid B and resi 583 and name CA) 36.539 0.0 0.0
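For context, each of these lines is a CNS-style assign statement: two atom selections followed by a target distance and the lower and upper margins around it, so "0.0 0.0" pins the CA-CA distance to the measured value. As a rough sketch of how such a line relates to the coordinates, the snippet below measures a CA-CA distance from PDB lines and formats it in the same way. It is not the restrain_bodies algorithm, which chooses the residue pairs itself, and the function names are assumptions.

```python
import math


def ca_coords(pdb_lines, chain, resnum):
    """Return (x, y, z) of the CA atom of the given chain and residue number."""
    for line in pdb_lines:
        if (line.startswith("ATOM")
                and line[12:16].strip() == "CA"     # columns 13-16: atom name
                and line[21] == chain               # column 22: chain ID
                and int(line[22:26]) == resnum):    # columns 23-26: residue number
            return (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    raise KeyError(f"no CA atom found for chain {chain} residue {resnum}")


def ca_restraint(pdb_lines, chain1, res1, chain2, res2):
    """Format a fixed-distance CA-CA restraint in the style used above."""
    distance = math.dist(ca_coords(pdb_lines, chain1, res1),
                         ca_coords(pdb_lines, chain2, res2))
    return (f"assign (segid {chain1} and resi {res1} and name CA) "
            f"(segid {chain2} and resi {res2} and name CA) "
            f"{distance:.3f} 0.0 0.0")
```

Comparing the distance computed this way with the value in the restraint file is a quick sanity check that the file was generated from the same structure and numbering that is being submitted.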

I checked the gzipped tar file. There is an unambiguous restraint file in data/distance. This is its content for HLA-B*27:05 with the ARGQPGVMG peptide:
! Molecule #1 gap(s) restraint(s)
assign (resid 26 and name CA and segid A) (resid 524 and name CA and segid A) 22.219 0.00 0.00
assign (resid 523 and name CA and segid A) (resid 28 and name CA and segid A) 25.812 0.00 0.00
! User defined unambiguous restraints
assign (segid A and resi 268 and name CA) (segid B and resi 512 and name CA) 26.217 0.0 0.0
assign (segid A and resi 264 and name CA) (segid B and resi 584 and name CA) 36.261 0.0 0.0

Do you think this difference could be the reason for the separation? I also tried without an unambiguous restraint file before, but the separation still happened.

I defined the binding groove of the proteins and the peptides as fully flexible.

Looking at your runs, you have changed many more parameters in the one failing… E.g. the number of steps has been increased by a factor of 4.

Furthermore, your restraints won’t work, since they refer to chains A and B, whereas your HLA is a single molecule with chain A in HADDOCK.
Or worse, the restraints will be applied to the peptide (molecule 2, i.e. segid B) if the numbering matches.
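This kind of mismatch can be caught before submitting by comparing every selection in the restraint file with the residues that actually exist under each segid. The sketch below does that with a simple regular-expression parse. It assumes selections of the form shown in this thread (segid plus resi or resid); the function names and the residue ranges in the usage note are hypothetical.

```python
import re

SELECTION_RE = re.compile(r"\(([^()]*)\)")
SEGID_RE = re.compile(r"segid\s+(\w+)", re.IGNORECASE)
RESID_RE = re.compile(r"resid?\s+(-?\d+)", re.IGNORECASE)  # matches resi or resid


def parse_selections(text):
    """Return (segid, residue number) for every selection in the restraint text."""
    selections = []
    for line in text.splitlines():
        line = line.split("!", 1)[0]          # drop CNS comments
        for selection in SELECTION_RE.findall(line):
            segid = SEGID_RE.search(selection)
            resid = RESID_RE.search(selection)
            if segid and resid:
                selections.append((segid.group(1).upper(), int(resid.group(1))))
    return selections


def unmatched_selections(text, residues_by_segid):
    """Return selections that point at a residue absent from that segid."""
    return [(segid, resid)
            for segid, resid in parse_selections(text)
            if resid not in residues_by_segid.get(segid, set())]
```

For example, if the merged HLA is segid A with (hypothetically) residues 1-276 and 501-599 and the nine-residue peptide is segid B, the selections on segid B residues 512 and 584 come back as unmatched, which is the first failure described above. If the peptide did contain those residue numbers, the selections would match the peptide instead, which is the "worse" case.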

But as you have seen, the server automatically defines restraints to keep the parts together, i.e. it is not required to define those restraints for the server.
It is required for a local run, however, as explained in the local tutorial.

Furthermore, the region that moves could be removed altogether to speed things up. It is not involved in binding anyway.

And finally, why don’t you try with default settings first instead of defining fully flexible segments?

I set back the number of steps to the default and deleted your unambig restraints. Here is the result:

https://wenmr.science.uu.nl/haddock2.4/run/1111111111/75546-test

Firstly, thank you very much for your time.

I read a tutorial (HADDOCKing of the p53 N-terminal peptide to MDM2) on the HADDOCK web site and your article (A Unified Conformational Selection and Induced Fit Approach to Protein-Peptide Docking) about protein-peptide docking, so I decided to give full flexibility to the binding groove and the peptide, and I changed some parameters.

Ah, I did not realize at the time that this is only for local runs. Thank you for this :)

As far as I know from the literature, chain B of HLA does not interact with the peptide, but it helps chain A with stabilization and peptide binding. I will run an MD simulation after docking, starting from the docked structure. Therefore, I want to include chain B of HLA in the docking.

Do you suggest that I leave the sampling parameters unchanged and use the default settings for flexibility? It looks like that worked.

Note that the run I did was still with your full flexibility settings

But in our paper we only give full flexibility to the peptide

I thought I should give full flexibility to the binding groove for conformational selection, since the protein changes shape to be able to bind the peptide. I see now that this was wrong.

I will remove full flexibility for the protein, increase the number of steps by a factor of 4, and remove the unambiguous restraints for my next docking runs. I hope it will be successful.

Thank you very much for your help.

I mean that the protein should not be rigid for flexible docking. Therefore, I gave full flexibility to the binding groove.

By default all interfaces will be treated as semi-flexible. Full flexibility is a level higher and potentially dangerous, since the first stage of the simulated annealing takes place at 2000 K.

But ok for the peptide

Okay, thank you very much again for all information.
I learned a lot, and now my docking results are as I wanted.