Utilizing GROMACS Simulation Files for DNA Structure Analysis with the biobb_dna Package

fatemeh · May 13, 2024, 3:20pm

Hello, I would like to use biobb_dna to analyze a DNA structure. I ran my simulations with GROMACS, so I have .tpr and .xtc files. Can I use these files with biobb_dna?
Thank you in advance.

adam.hospital · May 13, 2024, 3:56pm

Hi fatemeh,

I think that the Curves+ available from the Conda package is only compatible with mdcrd and netcdf formats. I would convert the xtc file to netcdf (using some tool like the mdtraj mdconvert) and then use a pdb and the netcdf files as input for the biobb_dna workflow.

Please contact us if you have any problem running the workflow.

Regards,

-Adam-

fatemeh · May 13, 2024, 4:28pm

I highly appreciate your prompt reply. biobb_dna is fantastic. It would be great if it could support GROMACS directly, because Amber cannot be used for long simulations and I have to run my analysis with biobb_dna. My .xtc files are pretty big, around 4 GB. Do you think biobb_dna can handle this?
Also, to be sure: should I upload .tpr or .top files instead of a .pdb file?

adam.hospital · May 14, 2024, 11:15am

Hi Fatemeh,

In order to help you, I would need to understand how you are using biobb_dna. Are you using our workflows through Jupyter Notebooks or pure Python, or are you maybe using our biobb_wfs web server?

Thanks,

-Adam-

fatemeh · May 14, 2024, 11:55am

Thank you so much for your assistance. Working with Jupyter has become much easier for me, and I am now able to use Jupyter on the supercomputer. If I can use biobb_dna on the supercomputer for GROMACS, it will be fascinating.

Thank you again for your help.
Best regards,
Fatemeh

adam.hospital · May 14, 2024, 12:22pm

Hi Fatemeh,

OK, so if you are using the Jupyter notebook, you just need to first convert your trajectory file (.xtc) to a biobb_dna-compatible format like netcdf (.nc). This conversion can be done regardless of the MD engine used to generate the trajectory; it is just a format conversion. There are many tools that can help with this (e.g. MDTraj's mdconvert). Then you can use the netcdf trajectory and a PDB as inputs for the biobb_dna workflow (again, no worries about using a PDB as the topology for this particular workflow; Curves+ should work fine with a PDB file as input).
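
The conversion step could be scripted along these lines. This is only a sketch: it assumes MDTraj is installed so that its mdconvert command is on the PATH, and the file names and the helpers build_mdconvert_cmd and convert_trajectory are hypothetical. The optional stride (mdconvert's -s flag) keeps only every n-th frame, which may help with multi-gigabyte trajectories.

```python
import shutil
import subprocess


def build_mdconvert_cmd(xtc_path, nc_path, stride=None):
    """Return the mdconvert command line for an XTC -> NetCDF conversion."""
    cmd = ["mdconvert", xtc_path, "-o", nc_path]
    if stride is not None:
        # -s keeps only every stride-th frame of the input trajectory
        cmd += ["-s", str(stride)]
    return cmd


def convert_trajectory(xtc_path, nc_path, stride=None):
    """Run mdconvert if it is available; return True when the conversion ran."""
    if shutil.which("mdconvert") is None:
        print("mdconvert not found; install MDTraj first")
        return False
    subprocess.run(build_mdconvert_cmd(xtc_path, nc_path, stride), check=True)
    return True


# Hypothetical file names, for illustration only
print(" ".join(build_mdconvert_cmd("md.xtc", "md.nc", stride=10)))
```

Calling convert_trajectory("md.xtc", "md.nc") would then write the netcdf file that the workflow takes as its trajectory input.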

Hope it helps!

-Adam-

fatemeh · May 14, 2024, 12:32pm

Sure, I will try it. Thank you so much.

fatemeh · May 15, 2024, 12:41pm

Hello,

Please accept my apologies. I converted the XTC file to NC and the GRO file to PDB. I have a few questions:

1) The DNA molecule has 90 base pairs, but it seems that the biobb_dna tool only considers the following bases:

CCCATTGTCGCCTTGCACCGTTCATATGTTATCGGGACTCTGGTGTCTCA
Does this mean that it does not consider the entire sequence?

2) I only changed the following in the script:
# Input parameters
seq = "CCCATTGTCGCCTTGCACCGTTCATATGTTATCGGGACTCTGGTGTCTCACCCATGGGATGTCGTAACCTTAGCACGATCAGGGGTCGTC"
seq_comp = "GACGACCCCTGATCGTGCTAAGGTTACGACATCCCATGGGTGAGACACCAGAGTCCCGATAACATATGAACGGTGCAAGGCGACAATGGG"

and
prop = {
    's1range' : '1:90',
    's2range' : '180:91'
}
Please let me know if I have to change anything else.

3) I encountered the following error message:

Error: /miniconda3/envs/biobb_dna_env/lib/python3.9/site-packages/biobb_common/tools/file_utils.py:771:
UserWarning: biobb_dna.curvesplus.biobb_curves input_top_path:only_DNA_str_90.pdb extension is not in the
valid extensions list: ['top']. If you want to suppress this message, please set the check_extensions
property to False
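
For question 2, the two range strings and the complementary strand can be derived from the first strand rather than typed by hand. A minimal sketch, assuming the PDB numbers the first strand 1..N and the second strand N+1..2N (the pattern that '1:90' and '180:91' follow); strand_ranges and reverse_complement are hypothetical helper names, not part of biobb_dna.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def strand_ranges(n_base_pairs):
    """Return (s1range, s2range) for a duplex whose residues are numbered 1..2N."""
    s1range = f"1:{n_base_pairs}"
    # The second strand is listed backwards so that it pairs with the first
    s2range = f"{2 * n_base_pairs}:{n_base_pairs + 1}"
    return s1range, s2range


s1range, s2range = strand_ranges(90)
prop = {"s1range": s1range, "s2range": s2range}
print(prop)
print(reverse_complement("CCCATTGTCG"))
```

Comparing reverse_complement(seq) with seq_comp is a quick way to catch a typo in either string before running the workflow.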

Thank you in advance.
Best regards,
Fatemeh,

adam.hospital · May 21, 2024, 1:27pm

Hi Fatemeh,

1) I think this should be solved using the biobb_canal 'sequence' property.
2) It seems correct.
3) This is a mistake we introduced in the latest versions of the biobb library. We should include the PDB format as an accepted format here. For now, please use the property 'check_extensions' as stated in the warning text, like this:

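from biobb_dna.curvesplus.biobb_curves import biobb_curves

# Output files written by Curves+
curves_out_lis = "curves.out.lis"
curves_out_cda = "curves.out.cda"

prop = {
    's1range' : '1:56',
    's2range' : '112:57',
    # Property name as spelled in the warning text; suppresses the extension check
    'check_extensions' : False
}

# traj and pdb are assumed to hold the paths to the netcdf trajectory
# and the PDB structure defined earlier in the notebook
biobb_curves(
    input_struc_path=traj,
    input_top_path=pdb,
    output_lis_path=curves_out_lis,
    output_cda_path=curves_out_cda,
    properties=prop
)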

A new issue has been created in the biobb_dna repository to fix this problem. Thank you very much for your feedback!

Hope it helps!

-Adam-

fatemeh · May 21, 2024, 3:44pm

Hello, thank you so much for your reply and assistance. I will definitely try it.
Thank you again.
Have a nice day.
Best regards,
Fatemeh

fatemeh · August 18, 2024, 12:12pm

Hello,
Dear Dr. Hospital,
I think there is a problem. I applied the following changes to the Jupyter notebook, but there is a problem for base-pair steps beyond 45.

As you can see, after step 43 the twist angle is not calculated correctly. Please correct me if I am wrong. I appreciate your assistance in advance.
I look forward to hearing from you.
Best regards,
Fatemeh,

38	AC	29.58746667	5.147374541
39	CA	39.73465	5.525619679
40	AT	30.38325333	4.06359534
41	TA	38.34820333	6.376172085
42	AC	29.66947667	5.632445458
43	CA	37.50877333	7.04495244
44	AG
45	GA	0	0
46	AT	0	0
47	TT	113.6616133	72.57015102
48	TA	119.78377	65.48163868
49	AC	84.30679	130.1986493
50	CA	111.5992967	98.48160509
51	AT	124.00301	68.71755656
52	TA	92.43231333	111.706258
53	AC	128.6853067	46.20100385
54	CA	85.62819667	123.0809933
55	AT	128.6709	56.63424348
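
Rows like these can be screened automatically before digging into the trajectory. A rough sketch, assuming the columns are step index, base-pair step, mean twist and its standard deviation (the thread does not label them); the 20-degree cutoff and the helper name flag_suspicious_steps are arbitrary illustrative choices, not biobb_dna defaults.

```python
ROWS = """\
42\tAC\t29.66947667\t5.632445458
43\tCA\t37.50877333\t7.04495244
44\tAG
45\tGA\t0\t0
46\tAT\t0\t0
47\tTT\t113.6616133\t72.57015102
"""


def flag_suspicious_steps(text, max_std=20.0):
    """Return (index, step, reason) for rows that look wrong."""
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        index, step = int(fields[0]), fields[1]
        values = [float(v) for v in fields[2:]]
        if len(values) < 2:
            flagged.append((index, step, "missing values"))
        elif values[0] == 0.0 and values[1] == 0.0:
            flagged.append((index, step, "all zeros"))
        elif values[1] > max_std:
            flagged.append((index, step, "large spread"))
    return flagged


for index, step, reason in flag_suspicious_steps(ROWS):
    print(index, step, reason)
```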

adam.hospital · August 19, 2024, 10:41am

Hi Fatemeh,

[response cloned from a previous post]

It is difficult to find the reason for this without the trajectory file. I guess the quality check analyses (RMSd, fluctuation, HBs, etc.) are not showing any strange jumps, is that right? If so, could you please try to share the topology and trajectory files with me somehow? Maybe with an online file-sharing service like Google Drive? (I know they are big.)

Looking forward to your reply.

Regards,

-Adam-
