Utilizing GROMACS Simulation Files for DNA Structure Analysis with the biobb_dna Package

Hello, I would like to use biobb_dna to analyze DNA structure. I ran my simulations with GROMACS, so I have .tpr and .xtc files. Can I use these files with biobb_dna?
Thank you in advance.

Hi Fatemeh,

I think that the Curves+ build available from the Conda package is only compatible with the mdcrd and netcdf trajectory formats. I would convert the xtc file to netcdf (using a tool such as MDTraj's mdconvert) and then use a pdb file and the netcdf file as inputs for the biobb_dna workflow.

Please contact us if you have any problem running the workflow.

Regards,

-Adam-

I highly appreciate your prompt reply. biobb_dna is fantastic. It would be great if it could support GROMACS directly, because Amber cannot be used for long simulations and I have to run my analysis with biobb_dna. My .xtc files are pretty big, around 4 GB. Do you think biobb_dna can handle this?
Also, to be sure: should I provide a .tpr or .top file instead of a .pdb file?

Hi Fatemeh,

In order to help you, I need to understand how you are using biobb_dna. Are you using our workflows through Jupyter Notebooks, pure Python, or our biobb_wfs web server?

Thanks,

-Adam-

Thank you so much for your assistance. Working with Jupyter has become much easier for me, and I am now able to use Jupyter on the supercomputer. If I can use biobb_dna on the supercomputer with my GROMACS trajectories, that would be wonderful.

Thank you again for your help.
Best regards,
Fatemeh

Hi Fatemeh,

OK, so if you are using the Jupyter notebook, you just need to first convert your trajectory file (.xtc) to a biobb_dna-compatible format such as netcdf (.nc). This conversion can be done regardless of the MD engine used to generate the trajectory; it is just a format conversion. There are many tools available that can help you with this step (e.g. MDTraj's mdconvert). Then you can use the netcdf trajectory and a PDB file as inputs for the biobb_dna workflow. Again, no worries about using a PDB as the topology for this particular workflow: Curves+ should work fine with a PDB file as input.
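As a rough sketch of the conversion step, assuming MDTraj is installed so that its mdconvert command is on the PATH, something like the following could be run from a notebook cell. The file names and the two helper functions are placeholders made up for illustration; they are not part of biobb_dna. mdconvert reads the trajectory in chunks of frames (the -c option), which should help with a multi-gigabyte .xtc file.

```python
import shutil
import subprocess
from pathlib import Path


def build_mdconvert_command(xtc_path, nc_path, chunk=1000, force=False):
    """Return the mdconvert argument list for an XTC -> NetCDF conversion.

    The output format is inferred by mdconvert from the .nc extension.
    """
    cmd = ["mdconvert", str(xtc_path), "-o", str(nc_path), "-c", str(chunk)]
    if force:
        # -f lets mdconvert overwrite an existing output file
        cmd.append("-f")
    return cmd


def convert_xtc_to_netcdf(xtc_path, nc_path, chunk=1000, force=False):
    """Run mdconvert and return the path of the NetCDF file it wrote."""
    if shutil.which("mdconvert") is None:
        raise RuntimeError("mdconvert not found; is MDTraj installed?")
    if not Path(xtc_path).is_file():
        raise FileNotFoundError(xtc_path)
    cmd = build_mdconvert_command(xtc_path, nc_path, chunk, force)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(nc_path)


# Example call (placeholder file names):
# convert_xtc_to_netcdf("md.xtc", "md.nc")
```

The PDB used as topology needs to contain the same atoms, in the same order, as the converted trajectory.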

Hope it helps!

-Adam-

Sure, I will try it. Thank you so much.

Hello,

Please accept my apologies. I converted the XTC file to NC and the GRO file to a PDB file. I have a few questions:

  1. The DNA molecule has 90 base pairs, but it seems that the biobb_dna tool only considers the following bases:

CCCATTGTCGCCTTGCACCGTTCATATGTTATCGGGACTCTGGTGTCTCA
Does this mean that it does not consider the entire sequence?

  2. I only changed the following in the script:

Input parameters
seq = "CCCATTGTCGCCTTGCACCGTTCATATGTTATCGGGACTCTGGTGTCTCACCCATGGGATGTCGTAACCTTAGCACGATCAGGGGTCGTC"
seq_comp = "GACGACCCCTGATCGTGCTAAGGTTACGACATCCCATGGGTGAGACACCAGAGTCCCGATAACATATGAACGGTGCAAGGCGACAATGGG"

and

prop = {
    's1range' : '1:90',
    's2range' : '180:91'
}

Please let me know if I have to change anything else.

  3. I encountered the following error message:

Error: /miniconda3/envs/biobb_dna_env/lib/python3.9/site-packages/biobb_common/tools/file_utils.py:771:
UserWarning: biobb_dna.curvesplus.biobb_curves input_top_path:only_DNA_str_90.pdb extension is not in the
valid extensions list: ['top']. If you want to suppress this message, please set the check_extensions
property to False

Thank you in advance.
Best regards,
Fatemeh,

Hi Fatemeh,

  1. I think this should be solved using the biobb_canal 'sequence' property.
  2. It seems correct.
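  3. This is a mistake we introduced in the latest versions of the biobb library. We should include the PDB format as an accepted format here. For now, please set the 'check_extensions' property to False, as stated in the warning text, like this:

from biobb_dna.curvesplus.biobb_curves import biobb_curves

# Output files written by Curves+
curves_out_lis = "curves.out.lis"
curves_out_cda = "curves.out.cda"

prop = {
    # Strand ranges from the 56-bp example; use your own values
    # (e.g. '1:90' and '180:91' for a 90-bp duplex)
    's1range' : '1:56',
    's2range' : '112:57',
    # Property name as given in the warning text
    'check_extensions' : False
}

# traj (netcdf trajectory) and pdb (PDB topology) are assumed to be
# defined earlier in the notebook
biobb_curves(
    input_struc_path=traj,
    input_top_path=pdb,
    output_lis_path=curves_out_lis,
    output_cda_path=curves_out_cda,
    properties=prop
)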

A new issue has been created in the biobb_dna repository to fix this problem. Thank you very much for your feedback!
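As a side note on question 2, the strand ranges and the complementary sequence used in this thread follow a simple pattern, so they can be derived from the first strand instead of being typed by hand. The sketch below assumes the residues in the PDB are numbered consecutively from 1 to 2N across both strands (as in the '1:90' / '180:91' and '1:56' / '112:57' examples above); the helper names are made up for illustration and are not biobb_dna functions.

```python
# Watson-Crick partners used to build the complementary strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_ranges(n_bp):
    """Return (s1range, s2range) for a duplex of n_bp base pairs.

    Assumes strand 1 is residues 1..n_bp and strand 2 is residues
    n_bp+1..2*n_bp, with strand 2 listed in reverse order.
    """
    if n_bp < 1:
        raise ValueError("n_bp must be a positive integer")
    return f"1:{n_bp}", f"{2 * n_bp}:{n_bp + 1}"


def curves_properties(seq):
    """Build the s1range/s2range entries from the first-strand sequence."""
    s1range, s2range = strand_ranges(len(seq))
    return {"s1range": s1range, "s2range": s2range}


# strand_ranges(90) gives ("1:90", "180:91")
# strand_ranges(56) gives ("1:56", "112:57")
```

Comparing reverse_complement(seq) with a hand-typed seq_comp is a quick way to catch a typo in either string before running the workflow.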

Hope it helps!

-Adam-

Hello, thank you so much for your reply and assistance. I will definitely try it.
Thank you again.
Have a nice day.
Best regards,
Fatemeh

Dear Dr. Hospital,
I think there is a problem. I applied the following changes to the Jupyter notebook, but there is a problem for base pairs beyond 45.

As you can see, after step 43 the twist angle is not calculated correctly. Please correct me if I am wrong. I appreciate your assistance in advance.
I look forward to hearing from you.
Best regards,
Fatemeh,

38 AC 29.58746667 5.147374541
39 CA 39.73465 5.525619679
40 AT 30.38325333 4.06359534
41 TA 38.34820333 6.376172085
42 AC 29.66947667 5.632445458
43 CA 37.50877333 7.04495244
44 AG
45 GA 0 0
46 AT 0 0
47 TT 113.6616133 72.57015102
48 TA 119.78377 65.48163868
49 AC 84.30679 130.1986493
50 CA 111.5992967 98.48160509
51 AT 124.00301 68.71755656
52 TA 92.43231333 111.706258
53 AC 128.6853067 46.20100385
54 CA 85.62819667 123.0809933
55 AT 128.6709 56.63424348
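To pinpoint where the values stop looking reasonable, a small check like the one below can be run on the table. It is only a sketch under stated assumptions: the columns are read as step number, base-pair step, and average twist (with the last column taken to be a spread measure and ignored), and the 20 to 50 degree window is an arbitrary illustrative range around typical B-DNA twist, not a biobb_dna setting.

```python
TWIST_ROWS = """\
38 AC 29.58746667 5.147374541
39 CA 39.73465 5.525619679
40 AT 30.38325333 4.06359534
41 TA 38.34820333 6.376172085
42 AC 29.66947667 5.632445458
43 CA 37.50877333 7.04495244
44 AG
45 GA 0 0
46 AT 0 0
47 TT 113.6616133 72.57015102
48 TA 119.78377 65.48163868
49 AC 84.30679 130.1986493
50 CA 111.5992967 98.48160509
51 AT 124.00301 68.71755656
52 TA 92.43231333 111.706258
53 AC 128.6853067 46.20100385
54 CA 85.62819667 123.0809933
55 AT 128.6709 56.63424348
"""


def parse_rows(text):
    """Parse 'step label [twist [spread]]' lines into (step, label, twist)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # A row with no numeric columns gets twist=None
        twist = float(parts[2]) if len(parts) > 2 else None
        rows.append((int(parts[0]), parts[1], twist))
    return rows


def flag_suspicious(rows, low=20.0, high=50.0):
    """Return step numbers whose twist is missing or outside [low, high]."""
    return [step for step, _label, twist in rows
            if twist is None or not (low <= twist <= high)]


flagged = flag_suspicious(parse_rows(TWIST_ROWS))
print(flagged)  # steps 44 through 55
```

With these rows, steps 38 to 43 pass and every step from 44 onward is flagged, starting with the empty row at step 44.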

Hi Fatemeh,

[response cloned from a previous post]

It is difficult to find the reason for this without the trajectory file. I assume the quality-check analyses (RMSD, fluctuations, hydrogen bonds, etc.) are not showing any strange jumps, is that right? If so, could you please try to share the topology and trajectory files with me somehow? Perhaps through an online file-sharing service such as Google Drive (I know the files are big).

Looking forward to your reply.

Regards,

-Adam-