HADDOCK submission: homodimer + 2 testosterone + 2 NAD ligands - doubts


I have a protein homodimer that I want to dock with 2 testosterones and 2 NADs, with 1 testosterone + 1 NAD binding to the first chain (“A”) of the homodimer and the other pair binding to the second chain (“B”).

In the past I learned how to submit homodimers to HADDOCK, and mine are all fine: chain B is renumbered and both chains appear merged under chain A.
I know which residues form chains A and B, and I want to bind the testosterone to a known steroid binding site by entering its positions as the “active site”. I want to do the same for NAD at a known NAD binding site (these sites are known in humans and I mapped them to my species for this in-silico analysis).

My problem is that, because it is a homodimer, I need to give the full set of positions for each “copy”, i.e. for both “chains”.
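To make the "one list per copy" idea concrete, here is a minimal Python sketch. It assumes chain B was renumbered by adding a constant offset to the chain A numbering; the offset of 1000 and the example residues are placeholders, not values taken from the actual files.

```python
# Hypothetical offset between chain A numbering and the renumbered chain B copy.
# Assumption: replace this with the offset actually used when renumbering chain B.
CHAIN_B_OFFSET = 1000


def sites_for_both_copies(site_residues, offset=CHAIN_B_OFFSET):
    """Return (copy A residues, copy B residues) for one binding site."""
    copy_a = sorted(set(site_residues))
    copy_b = [resid + offset for resid in copy_a]
    return copy_a, copy_b


# Placeholder residue numbers, for illustration only.
example_site = [10, 11, 15]
site_a, site_b = sites_for_both_copies(example_site)
print("copy A:", ",".join(str(r) for r in site_a))
print("copy B:", ",".join(str(r) for r in site_b))
```

Keeping the two lists separate, rather than pooling them, is what later allows each ligand to be pointed at one specific copy of the site.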

Yesterday I tried the following:

1 - Entering all five molecules: A - homodimer, B - testo 1, C - NAD 1, D - testo 2, E - NAD 2.
In the input parameters for molecule 1 I added all sites together (all steroid binding + all NAD binding) and only allowed B to dock to C and A; C to A and B; D to E and A; and E to D and A.
I defined the same in the interaction matrix, selected “Surface contact restraints”, and submitted using the advised values for protein-ligand docking that you provide.

2 and 3 - Entering 3 molecules: either homodimer + testo 1 + testo 2 OR homodimer + NAD 1 + NAD 2.
Here I submitted either all steroid binding sites for the 2 “chains” OR all NAD binding sites for the 2 “chains”, and again only allowed docking from the homodimer to testo/NAD 1 and from the homodimer to testo/NAD 2. I did the same in the interaction matrix, selected “Surface contact restraints”, and submitted using the advised values for protein-ligand docking that you provide.

But this is not how I ideally want it to happen, because I want the testosterones and NADs to bind specifically to those known regions, one in each chain. I therefore thought of a "patchwork" approach, where I would do:

4 - homodimer + testo with binding sites for chain A
5 - homodimer + testo with binding sites for chain B

6 - homodimer + NAD with binding sites for chain A
7 - homodimer + NAD with binding sites for chain B

And then “merge” the best results from 4, 5, 6 and 7 in PyMOL.

But HADDOCK blocks at the input parameters and insists that active sites also be submitted for molecule 2 (in this case the ligand) …

So my question is: how can I make this a more targeted approach, so that I can really mimic what happens in biological reality?

I would be deeply thankful if someone could provide me with some knowledge :slight_smile:
and thank you again for this great tool
Best wishes

Why not simply dock to a monomer and then regenerate the dimer?

Otherwise you will need to manually define a set of restraints for each monomer-ligand combination and provide these to the server (in the distance restraints menu).

You can generate such restraints for example with our gentbl server (or by manually editing a restraint file).

Check: https://bianca.science.uu.nl/gentbl/
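For the manual-editing route, a restraint file can also be written with a short script. The Python sketch below is only an illustration under stated assumptions: it emits one ambiguous restraint per protein residue, each pointing at a single ligand residue, in the CNS-style `assign` syntax used by HADDOCK restraint files. The residue numbers, the ligand residue number 1, and the chain IDs A, B and C are placeholders, and the `2.0 2.0 0.0` distance triple follows the values commonly seen in HADDOCK ambiguous restraint files. The layout produced by gentbl may differ.

```python
def ambiguous_restraints(protein_residues, protein_segid, ligand_resid, ligand_segid):
    """Build one ambiguous restraint per protein residue, each targeting the ligand."""
    blocks = []
    for resid in protein_residues:
        blocks.append(
            f"assign ( resid {resid} and segid {protein_segid} )\n"
            f"       (\n"
            f"        ( resid {ligand_resid} and segid {ligand_segid} )\n"
            f"       ) 2.0 2.0 0.0\n"
        )
    return "\n".join(blocks)


# Placeholder site definitions (not the real binding-site residues).
steroid_site = [10, 11, 15]
nad_site = [40, 41]

# Assumption: protein is chain A, first ligand chain B, second ligand chain C.
tbl_text = (
    ambiguous_restraints(steroid_site, "A", 1, "B")
    + "\n"
    + ambiguous_restraints(nad_site, "A", 1, "C")
)
print(tbl_text)
```

Each site gets its own call, so every ligand is restrained only to the residues meant for it.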

thanks for the kind help :slight_smile:
The homodimer is from a non-model species, predicted with AlphaFold and edited to fit the HADDOCK requirements. I also have the AlphaFold-predicted monomer, but I don't know how to create a dimer from it after running these dockings (if you could suggest some way I would be really happy :slight_smile: )

I checked the link you sent (and even the old one - https://alcazar.science.uu.nl/services/GenTBL/), but I am not confident I can make it work, because I have some doubts about what to submit there. If we are working JUST with a monomer, I need to create 2 groups of active sites:

steroid binding: 219, 220, 223, 224, 225, 226, 228, 229, 230, 232, 236, 264, 267, 268, 276, 302, 303, 305, 306, 309, 345, 347, 348, 370

NAD binding: 93, 94, 95, 96, 97, 98, 114, 118, 138, 139, 141, 142, 143, 167, 168, 169, 170, 171, 172, 173, 174, 195, 217, 221, 222, 223, 236, 240, 262, 269, 271, 272, 273, 274, 276

Both come from the monomer chain (which is A), but how can I enter the “ligand chains” here? It doesn't allow me to enter anything ligand-like… I also checked other tutorials, and I can't access PRODRG or the other server suggested for getting the ligand topology and parameter files, which should be provided in CNS format (Small molecule docking – Bonvin Lab).

my two ligands are testosterone:
TESTOSTERONE_renamed_HETATM_clean_12042023.pdb (3.9 KB)

and NAD:
NAD_5892_renumb_HETATM_clean.pdb (5.6 KB)

homodimer and monomer:
HSD17B2_calidrispugnax_alpha_dimer_seq_merged.pdb (468.0 KB)

HSD17B2_calidrispugnax_alpha_seq.pdb (475.8 KB)

So, in summary, if you could help me with how to properly generate the restraints and make a dimer from the monomer results, I will be really grateful, and of course we will show our appreciation accordingly for your being so considerate so far :slight_smile:

Your ligand has a chainID and a residue number. Specify those. And define the second set of protein residues to target the other ligand, which has a different chainID.

Thank you so much for your answer. In PyMOL, when I fetch those details, I get this:
PyMOL>iterate (hetatm), print("Chain: %s, Residue: %s%s" % (chain.strip(), resn, resi))
Chain: , Residue: UNL1
Chain: , Residue: UNL1
Chain: , Residue: UNL1
Iterate: iterated over 71 atoms.

So in this case my chain is a space " " and my residue numbers are 0, 1, 2 (…), 71? (for NAD)

Because I cannot enter spaces in GenTBL, could you please give me an extra hint here? I am sorry, and I appreciate your help so far.

The chainID should be the one HADDOCK will use. E.g. if your ligand is uploaded as the second molecule, it will get by default chainID B
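If it helps to have the ligand file itself carry the same chainID and residue number that the restraints refer to, the coordinate records can be rewritten with a few lines of Python. This is a minimal sketch that relies only on the fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26). The chain "B", residue number 1, and the example HETATM line are assumptions for illustration. In the PyMOL output above, "UNL1" appears to be the residue name UNL followed by residue number 1, while 71 is the atom count.

```python
def set_chain_and_resid(pdb_lines, chain_id, resid):
    """Return PDB lines with chain ID and residue number set on ATOM/HETATM records."""
    out = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            padded = line.ljust(80)
            # Column 22 (index 21) is the chain ID; columns 23-26 hold the residue number.
            padded = padded[:21] + chain_id + f"{resid:>4d}" + padded[26:]
            line = padded.rstrip()
        out.append(line)
    return out


# Illustrative HETATM record with a blank chain ID (not copied from the real ligand file).
example = [
    "HETATM    1  C1  UNL     1      -1.234   2.345   0.123  1.00  0.00           C",
    "END",
]
for new_line in set_chain_and_resid(example, "B", 1):
    print(new_line)
```

Only ATOM and HETATM records are touched; everything else is passed through unchanged.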

OK ! thank you for the enlightenment! thank you so much :slight_smile:

So, in this case, I would have to describe which sites (from the monomer) connect to B and C (testo and NAD), correct? And then assign B and C in the docking submission in HADDOCK, correct?

I will try to run everything from there now :slight_smile: really appreciate it

You got it!

hi there,

I tried several things, such as defining the restraints in different ways and using several of your protocols (adapted from the protein-ligand-shape one and from the small molecule one), but I keep receiving errors.


HSD17B2_calidrispugnax_alpha_seq.pdb (475.8 KB)

TESTOSTERONE_dock_28112023.pdb (3.9 KB)

NAD_dock_28112023.pdb (5.6 KB)

using the following restraints:
restraints_monomer_AABC_ambig.tbl (154.0 KB)

and the following run parameters:
test_monomer_ind_testo_nad_job_params.json (669.6 KB)

What have I done wrong in defining, using, or setting the parameters of the HADDOCK run that causes the run to fail with the message “Your HADDOCK run stopped because of the following error:”?

I would greatly value your help. We are finalizing a high-impact publication submission, and I would really appreciate your help to finally finish and submit.
Thanks in advance

I have finally managed to do it. I was defining the restraints wrongly; now they work:

steroid binding site:

NAD binding site:

Residues from the ligands to bind to the restraints (these are all the residues in the ligand file):

I used the following:

A - Monomer (1) → 3
A - Monomer (2) → 4
B - Testo (3) → 1
C - NAD (4) → 1

and then I used the generated .tbl file after uploading the monomer as A, testosterone as B and NAD as C
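Before uploading, a quick sanity check that the .tbl file only mentions the expected chains (A, B and C here) can catch the kind of mistake described above. The Python sketch below is a hedged illustration: it assumes every selection is written as `resid N and segid X`, which is one common layout but not the only valid one, and the sample text is made up rather than taken from the real restraint file.

```python
import re
from collections import defaultdict

# Assumption: selections are written as "resid <number> and segid <chain>".
SELECTION = re.compile(r"resid\s+(\d+)\s+and\s+segid\s+(\w+)", re.IGNORECASE)


def residues_by_segid(tbl_text):
    """Collect the residue numbers referenced for each segid in a restraint file."""
    found = defaultdict(set)
    for resid, segid in SELECTION.findall(tbl_text):
        found[segid.upper()].add(int(resid))
    return {segid: sorted(resids) for segid, resids in found.items()}


# Made-up sample in the same style as an ambiguous restraint file.
sample = """
assign ( resid 10 and segid A )
       (
        ( resid 1 and segid B )
       ) 2.0 2.0 0.0
assign ( resid 40 and segid A )
       (
        ( resid 1 and segid C )
       ) 2.0 2.0 0.0
"""
for segid, resids in sorted(residues_by_segid(sample).items()):
    print(segid, resids)
```

If a segid shows up that does not match the order in which the molecules are uploaded, the restraints and the submission are out of sync.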

I did not change anything in the input parameters, and I combined the ligand and shape tutorials for the other parameters.

Thank you so much :slight_smile:

Glad to hear it worked out at the end!