Protein - dsDNA specificity

I’d like to identify the preferred interactions between a dsDNA of defined sequence and a small protein motif, specifically a helix-turn-helix (HTH). I suspect the binding site lies within the major groove at the centre of the dsDNA, but I’d like a more rational result, ideally one hinting at specific amino acid-DNA base interactions.
I have read the paper on the MARTINI force field behind the coarse-grained option in HADDOCK 2.4, and I’d like to ask for your advice on how to get the most out of HADDOCK in this case.

  • Which options should I choose besides coarse-graining for both DNA and protein? Should I use coarse-graining at all in my case?
  • What are the most exhaustive sampling settings in HADDOCK 2.4, and which should I choose for this scenario? (I have expert-level access for now.)
  • What restraints should I use? Centre of mass in some loose form? I expect the HTH’s recognition helix to recognize the dsDNA sequence, so instead of general centre-of-mass restraints, should I define ambiguous restraints between that protein region and the whole dsDNA (so as not to preclude less obvious solutions)? How do I define such restraints in HADDOCK 2.4?

Hi! In this scenario, where you are inspecting amino acid-nucleobase interactions, the best course would be to define your protein motif as active and the bases as passive, excluding all phosphate backbone atoms.

However, there is no automatic way of defining such specific restraints; you need to define the atoms manually in a plain-text code editor (do not use Word/OpenOffice).

First input your active and passive residues using this tool: http://milou.science.uu.nl/services/GenTBL and then define the atoms:

Chain A Protein active residues: 1, 2
Chain B DNA (sequence: ATCGA) passive residues: 1, 2, 3, 4, 5

assign ( resid 1  and segid A)
       (
        ( resid 1  and segid B and (name N9 or name C4 or name C2 or name N3 or name C6 or name N6 or name N1 or name C8 or name N7 or name C5))
     or
        ( resid 2  and segid B and (name N1 or name C6 or name N3 or name C2 or name O2 or name C5 or name C4 or name O4 or name C7))
    or
        ( resid 3  and segid B and (name N1 or name C6 or name N3 or name C2 or name O2 or name C5 or name C4 or name N4))
    or
        ( resid 4  and segid B and (name N9 or name C4 or name C2 or name N2 or name N3 or name C6 or name O6 or name N1 or name C8 or name N7 or name C5))
    or
        ( resid 5  and segid B and (name N9 or name C4 or name C2 or name N3 or name C6 or name N6 or name N1 or name C8 or name N7 or name C5))
        ) 2.0 2.0 0.0

assign ( resid 2  and segid A)
       (
        ( resid 1  and segid B and (name N9 or name C4 or name C2 or name N3 or name C6 or name N6 or name N1 or name C8 or name N7 or name C5))
     or
        ( resid 2  and segid B and (name N1 or name C6 or name N3 or name C2 or name O2 or name C5 or name C4 or name O4 or name C7))
    or
        ( resid 3  and segid B and (name N1 or name C6 or name N3 or name C2 or name O2 or name C5 or name C4 or name N4))
    or
        ( resid 4  and segid B and (name N9 or name C4 or name C2 or name N2 or name N3 or name C6 or name O6 or name N1 or name C8 or name N7 or name C5))
    or
        ( resid 5  and segid B and (name N9 or name C4 or name C2 or name N3 or name C6 or name N6 or name N1 or name C8 or name N7 or name C5))
        ) 2.0 2.0 0.0

Keep in mind that if the syntax is wrong, the run will fail! Save this file as ambig.tbl and upload it as ambiguous restraints; you might need Guru access for that.
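Writing these blocks by hand gets error-prone with many residues, so here is a minimal Python sketch that generates the same kind of base-only restraints from a DNA sequence. The function names, the default segids A/B, and the assumption that DNA residues are numbered consecutively from `dna_first_resid` are illustrative choices, not part of HADDOCK; the atom names follow the standard PDB nucleobase naming used in the example above.

```python
# Heavy atoms of each nucleobase only (no sugar, no phosphate), standard PDB names.
BASE_ATOMS = {
    "A": ["N9", "C4", "C2", "N3", "C6", "N6", "N1", "C8", "N7", "C5"],
    "T": ["N1", "C6", "N3", "C2", "O2", "C5", "C4", "O4", "C7"],
    "C": ["N1", "C6", "N3", "C2", "O2", "C5", "C4", "N4"],
    "G": ["N9", "C4", "C2", "N2", "N3", "C6", "O6", "N1", "C8", "N7", "C5"],
}


def base_selection(resid, base, segid):
    """Return one selection covering the base atoms of a single nucleotide."""
    names = " or ".join(f"name {atom}" for atom in BASE_ATOMS[base])
    return f"( resid {resid} and segid {segid} and ({names}))"


def ambig_restraints(active, dna_sequence, dna_first_resid=1,
                     protein_segid="A", dna_segid="B",
                     distance="2.0 2.0 0.0"):
    """Build one 'assign' statement per active protein residue.

    Assumption: DNA residues are numbered consecutively, starting at
    dna_first_resid, in the order given by dna_sequence.
    """
    targets = [
        base_selection(dna_first_resid + i, base, dna_segid)
        for i, base in enumerate(dna_sequence.upper())
    ]
    joined = "\n     or\n".join(f"        {t}" for t in targets)
    blocks = []
    for resid in active:
        blocks.append(
            f"assign ( resid {resid} and segid {protein_segid})\n"
            f"       (\n{joined}\n       ) {distance}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Same toy case as above: protein residues 1 and 2, DNA sequence ATCGA.
    print(ambig_restraints([1, 2], "ATCGA"))
```

Redirect the printed output into ambig.tbl and still read it through once in a text editor before uploading.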

Good luck!

Thank you for your response. I generated an AIR file with the web page and modified it. Yet I’m getting an error in HADDOCK 2.4:

HADDOCK restraints TBL file

The error also occurs with the unmodified restraints file. I double-checked the numbering: all residues in the restraints are present in the respective PDB files. As a check, I tried submitting a fully default run with only the unmodified restraints file, and I also got an error.

I defined 56 active residues from the protein and 24 passive bases from the dsDNA. The DNA numbering is also correct.
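A numbering check like the one described here can also be scripted rather than done by eye. The sketch below is only an illustration: the function names are made up, it assumes each PDB file corresponds to exactly one segid, and it only recognizes selections written in the `resid N and segid X` form produced by GenTBL.

```python
import re

# Matches selections of the form "resid 12 and segid A" (any amount of spacing).
RESID_SEGID = re.compile(r"resid\s+(-?\d+)\s+and\s+segid\s+(\w+)", re.IGNORECASE)


def restrained_residues(tbl_text):
    """Collect every (segid, residue number) pair mentioned in a .tbl file."""
    return {(segid, int(resid)) for resid, segid in RESID_SEGID.findall(tbl_text)}


def pdb_residues(pdb_text, segid):
    """Collect residue numbers from ATOM/HETATM records (PDB columns 23-26)."""
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            residues.add((segid, int(line[22:26])))
    return residues


def missing_residues(tbl_text, pdb_by_segid):
    """Return restrained residues that are absent from the supplied PDB texts.

    pdb_by_segid maps a segid (e.g. "A") to the text of the matching PDB file.
    """
    present = set()
    for segid, pdb_text in pdb_by_segid.items():
        present |= pdb_residues(pdb_text, segid)
    return sorted(restrained_residues(tbl_text) - present)
```

An empty list from `missing_residues` means every residue named in the restraints was found, which points the search for the error elsewhere, for example at the file format.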

Are you defining the whole protein as active?

Could you give more details about this error? At which stage does it happen?

I think the problem here is the file encoding of your restraints.
It should be ASCII and Linux/OSX compatible, not DOS.
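If you want to check this yourself, here is a small Python sketch that looks for the usual culprits in a restraints file. It assumes that "not DOS" refers to CRLF line endings and that non-ASCII bytes (a byte-order mark, or typographic quotes introduced by a word processor) are also unwelcome; the function names are illustrative and this is not a HADDOCK tool.

```python
BOM = b"\xef\xbb\xbf"  # UTF-8 byte-order mark that some editors prepend


def diagnose_tbl(data: bytes) -> list[str]:
    """Return a list of likely format problems found in raw file bytes."""
    problems = []
    if data.startswith(BOM):
        problems.append("UTF-8 byte-order mark")
        data = data[len(BOM):]
    if b"\r\n" in data:
        problems.append("DOS (CRLF) line endings")
    elif b"\r" in data:
        problems.append("bare carriage returns")
    if any(byte > 127 for byte in data):
        problems.append("non-ASCII bytes (for example typographic quotes)")
    return problems


def clean_tbl(data: bytes) -> bytes:
    """Strip a leading BOM and convert all line endings to Unix (LF).

    Non-ASCII characters are left untouched on purpose: they usually need
    a manual fix (e.g. replacing a curly apostrophe in C1' with a plain one).
    """
    if data.startswith(BOM):
        data = data[len(BOM):]
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    example = b"assign ( resid 1 and segid A)\r\n"
    print(diagnose_tbl(example))
    print(diagnose_tbl(clean_tbl(example)))
```

To use it on a real file, read it with `open("ambig.tbl", "rb").read()` and write the cleaned bytes back with mode `"wb"`. On Linux/OSX the `file` and `dos2unix` command-line utilities do a similar job.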