
I am interested in protein-DNA docking, but I have no idea where the active site is, so I run the docking in two steps. First, I perform ab initio docking to identify the active site for protein-DNA docking. Then I proceed with site-specific docking. After completing the first step, I am unsure how to determine the list of passive residues. I already have the list of active residues, obtained by identifying the residues that interact between the DNA and the protein, but I do not know how to find the passive residues.

To generate the .tbl file I am using the following command: e2a-act-pass.list hpr-act-pass.list > e2a-hpr-ambig.tbl

However, how do I generate the list of passive residues?

Thanks for your help,

If you have a well defined binding site in principle you don’t need per se passive residues.

But you can define them by taking the surface neighbours of your active residues.

In our haddock-tools GitHub repo we have a script that will do that for you (at least for the protein part):
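To illustrate what "neighbours of the active residues" means, here is a minimal Python sketch. It is not the haddock-tools script: it only does the distance part, does no solvent-accessibility filtering (so buried neighbours are returned too), and ignores chain IDs. The function names and the 6.5 Å cutoff are assumptions chosen for the example, and the input is assumed to be a standard fixed-column PDB file.

```python
import math

def read_atoms(pdb_lines):
    """Return (resnum, (x, y, z)) for every ATOM/HETATM record (fixed PDB columns)."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            resnum = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((resnum, xyz))
    return atoms

def neighbours_of_active(pdb_lines, active, cutoff=6.5):
    """Residue numbers with at least one atom within `cutoff` of any active-residue atom.

    The cutoff value is an illustrative assumption, not a HADDOCK default.
    """
    atoms = read_atoms(pdb_lines)
    active = set(active)
    active_xyz = [xyz for resnum, xyz in atoms if resnum in active]
    near = set()
    for resnum, xyz in atoms:
        if resnum in active or resnum in near:
            continue  # skip active residues and residues already selected
        if any(math.dist(xyz, p) <= cutoff for p in active_xyz):
            near.add(resnum)
    return sorted(near)

# Example use (file name is a placeholder):
# with open("protein.pdb") as fh:
#     print(",".join(map(str, neighbours_of_active(fh, [45, 159]))))
```

The output still needs to be checked by eye, or filtered with a separate accessibility calculation, before it is used as a passive list.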

Greetings and thank you for responding. I currently possess a protein-DNA complex that I acquired through ab initio docking. My objective now is to conduct site-specific docking. The interaction between the protein and DNA occurs through hydrogen bonding at two locations: Arg45-Thy65 and Arg144-Cyt122. Therefore, the following are the steps I intend to take:

1: python protein-DNA.pdb active.list This command should give me the passive residues list
active.list = 45,65,144,122
2: active.list passive.list > genrate.tbl

Is the above method correct?

Gaurav Sharma

I would do step 1) separately for the protein and the DNA

Thank you for your response. So, you mean to do these three steps?

1: python protein.pdb active_protein.list > passive_protein.list
2: python DNA.pdb active_DNA.list > passive_DNA.list
cat active_protein.list active_DNA.list > active_combined.list
cat passive_protein.list passive_DNA.list > passive_combined.list
3: active_combined.list passive_combined.list > combined.tbl

Gaurav Sharma

Yes - that should be it

So, I am trying to run this command:
python protein.pdb active_protein.list > passive_protein.list
but I am getting “The list of active residues must be provided as a comma-separated list of integers” in the passive_protein.list file.
My active_protein.list contains 45,159.
Please tell me what I am doing wrong.

The list of active residues must be provided as a comma-separated list of integers…

Edit the passive_protein.list file accordingly

Sorry, I am confused.
python protein.pdb active_protein.list > passive_protein.list
The above command is used to derive the list of passive residues from the active residues. Is this correct?


Yes this is correct.

I have a file named “active_protein.list” that contains the values “45, 159”. When I run the command

“python protein.pdb active_protein.list > passive_protein.list”

The resulting “passive_protein.list” file displays the error message “The list of active residues must be provided as a comma-separated list of integers.” However, I am confused because my “active_protein.list” file already contains comma-separated integers (specifically, “45, 159”). Could you please help me understand what I am doing wrong? Thank you.

Make sure that all residues defined in your list do exist in the PDB file
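A quick way to check this is sketched below in Python. It only reads the residue-number column of the ATOM records and reports which requested numbers are absent. The function name is hypothetical and the file is assumed to follow the standard fixed-column PDB layout.

```python
def missing_residues(pdb_lines, residues, chain=None):
    """Return the requested residue numbers that have no ATOM record.

    If `chain` is given, only ATOM records of that chain are considered.
    """
    present = set()
    for line in pdb_lines:
        if line.startswith("ATOM") and (chain is None or line[21] == chain):
            present.add(int(line[22:26]))  # columns 23-26 hold the residue number
    return sorted(set(residues) - present)

# Example use (file name is a placeholder):
# with open("protein.pdb") as fh:
#     print(missing_residues(fh, [45, 159]))
```

An empty list means every requested residue number occurs in the file. Passing a chain ID helps when the protein and the DNA share residue numbers.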

Further, are “45, 159” the only residues in your active list?

Difficult to investigate without having your files at hand.

protein_DNA.pdb (338.9 KB)

Thank you for responding. One error I made was that the residue numbering of the protein started from 24. However, I corrected it and now the residues are numbered starting from 1. The new amino acids binding to the DNA are at positions 22 and 136. Despite these changes, I am still encountering the same error. Please find the attached PDB file. Your assistance is greatly appreciated.


The simple solution in your case would be to manually pick out the surface neighbours of your active residues by visual inspection.

Yes, that is what I have been doing so far. However, I am now seeking to integrate ab initio docking and blind docking methods to automate the entire process. I was wondering if there are alternative approaches available for achieving this. I would prefer not to explore alternative software options, as the results obtained from HADDOCK have proven to be highly accurate.
Any suggestions or guidance would be greatly appreciated.
Thank you.

Thanks for your persistence @gaurav.sharmapsit!

I’ve had a look at the script. Running it with -h shows which arguments it expects:

$ python -h
usage: [-h] [-c CHAIN_ID] [-s SURFACE_LIST] pdb_file active_list

positional arguments:
  pdb_file              PDB file
  active_list           List of active residues IDs (int) separated by commas

optional arguments:
  -h, --help            show this help message and exit
  -c CHAIN_ID, --chain-id CHAIN_ID
                        Chain id to be used in the PDB file (default: All)
  -s SURFACE_LIST, --surface-list SURFACE_LIST
                        List of surface residues IDs (int) separated by commas

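Based on your previous comment, it seems that the command you are typing is wrong: the script expects the residue numbers themselves as the second argument, not the name of a file containing them.

So instead of passing active_protein.list, it should be: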

$ python protein.pdb 45,159

Keep in mind that there must be no spaces around the commas.
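If the residue numbers are already stored in a file such as active_protein.list, a small helper can turn its contents into the form the active_list argument expects. This is only a sketch; the function name is hypothetical and not part of haddock-tools.

```python
import re

def to_cli_list(text):
    """Normalise '45, 159' or '45 159' to '45,159'.

    Raises ValueError if any entry is not an integer.
    """
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return ",".join(str(int(t)) for t in tokens)

# Example use (file name is a placeholder):
# with open("active_protein.list") as fh:
#     print(to_cli_list(fh.read()))
```

The printed string can then be pasted in as the second positional argument of the script.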

Unfortunately, DNA is not supported by this script:

$ python protein_DNA.pdb 45,159                  
There was an error while calculating surface residues: Error: Radius is <= 0 (-1.0) for the residue: DC, atom:  O5*
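Since the error comes from a nucleotide residue (DC), one possible workaround is to extract the protein part of the complex into its own PDB file and run the script on that. The Python sketch below keeps only ATOM records whose residue name is one of the 20 standard amino acids. The function name is hypothetical, non-standard residues are dropped as well, and whether the script then runs cleanly on the result has not been verified here.

```python
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

def protein_only(pdb_lines):
    """Keep ATOM records whose residue name (columns 18-20) is a standard amino acid."""
    return [
        line for line in pdb_lines
        if line.startswith("ATOM") and line[17:20] in AMINO_ACIDS
    ]

# Example use (file names are placeholders):
# with open("protein_DNA.pdb") as src, open("protein.pdb", "w") as dst:
#     dst.writelines(protein_only(src))
```

The passive residues on the DNA side would still have to be chosen by hand.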