Exploring the Limitations of biobb_dna in Handling Large DNA Sequences

Hello, can biobb_dna handle DNA of any length? I have 90 base pairs, but only 45 of them are considered during the calculation.

Hi Fatemeh,

Sorry for the delay in answering your question.

Yes, you should be able to use biobb_dna with structures having more than 45 base pairs. The trick is to use the 'sequence' property in the biobb_canal building block. This property lets you explicitly specify the sequence of the structure being analysed. If it is not set, the block tries to extract the sequence automatically from the output of the previous step (biobb_curves), and this only works for sequences with fewer than 45 base pairs, because Curves+ internally trims the information it writes to its log files.

So, could you please try launching the biobb_canal step with the 'sequence' property set, as in this example:

from biobb_dna.curvesplus.biobb_canal import biobb_canal

canal_out = "canal.out.zip"

seq = "ACGTACGT" # Initialize with the corresponding sequence

prop = {
    'series' : True,
    'histo' : True,
    'sequence' : seq
}

biobb_canal(
    input_cda_file=curves_out_cda,  # .cda output of the previous biobb_curves step
    input_lis_file=curves_out_lis,  # .lis output of the previous biobb_curves step
    output_zip_path=canal_out,
    properties=prop
)
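Before running the step, it may be worth checking that the string passed as 'sequence' really has the length you expect. The snippet below is only a minimal sketch: check_sequence is a hypothetical helper written for this example, not part of biobb_dna, and the 90-base sequence and the expected length of 90 are assumptions taken from this thread.

```python
# Hypothetical helper (not part of biobb_dna): validates a DNA sequence string
# and lists its consecutive base-pair steps.
def check_sequence(seq, expected_bp):
    seq = seq.strip().upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"Unexpected characters in sequence: {sorted(invalid)}")
    if len(seq) != expected_bp:
        raise ValueError(f"Expected {expected_bp} bases, got {len(seq)}")
    # A sequence of N bases has N - 1 consecutive dinucleotide steps.
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


# Assumed example: a 15-base motif repeated 6 times gives 90 bases.
seq = "GATTACATACATACA" * 6
steps = check_sequence(seq, expected_bp=90)
print(len(seq), len(steps), steps[:3])
```

If the check passes, the same seq string can be placed in the 'sequence' property shown above.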

Hope it helps!

-Adam-

Hello,
Dear Dr. Hospital,

I applied the following changes to the Jupyter notebook, but there is still a problem with base pairs beyond position 43.

from biobb_dna.curvesplus.biobb_canal import biobb_canal

canal_out = "canal.out.zip"

seq = "GATTACATACATACAGATTACATACATACAGATTACATACATACAGATTACATACATACAGATTACATACATACAGATTACATACATACA"

prop = {
    'series' : True,
    'histo' : True,
    'sequence' : seq
}

biobb_canal(
    input_cda_file=curves_out_cda,
    input_lis_file=curves_out_lis,
    output_zip_path=canal_out,
    properties=prop
)

As you can see in the output below, the twist angle is not calculated correctly after position 43. The plot also looks very strange. Please correct me if I am wrong. Could you please help me solve this?

38 AC 29.58746667 5.147374541
39 CA 39.73465 5.525619679
40 AT 30.38325333 4.06359534
41 TA 38.34820333 6.376172085
42 AC 29.66947667 5.632445458
43 CA 37.50877333 7.04495244
44 AG
45 GA 0 0
46 AT 0 0
47 TT 113.6616133 72.57015102
48 TA 119.78377 65.48163868
49 AC 84.30679 130.1986493
50 CA 111.5992967 98.48160509
51 AT 124.00301 68.71755656
52 TA 92.43231333 111.706258
53 AC 128.6853067 46.20100385
54 CA 85.62819667 123.0809933
55 AT 128.6709 56.63424348
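To pinpoint where the values stop making sense, rows like these can be screened with a few lines of Python. This is only a rough sketch: the column meaning (step index, base-pair step, mean twist, standard deviation) is my assumption about the output above, flag_suspicious is a hypothetical helper, and the max_std threshold of 20 is an arbitrary illustrative value.

```python
# Rows copied from the output above. Assumed columns:
# step index, base-pair step, mean twist (degrees), standard deviation.
rows = """\
42 AC 29.66947667 5.632445458
43 CA 37.50877333 7.04495244
44 AG
45 GA 0 0
46 AT 0 0
47 TT 113.6616133 72.57015102
"""


def flag_suspicious(text, max_std=20.0):
    """Return (index, step, reason) for rows that look wrong.

    max_std is an arbitrary illustrative threshold, not a validated cutoff.
    """
    flagged = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        index, step = int(parts[0]), parts[1]
        if len(parts) < 4:
            flagged.append((index, step, "missing values"))
            continue
        mean, std = float(parts[2]), float(parts[3])
        if mean == 0.0 and std == 0.0:
            flagged.append((index, step, "all zeros"))
        elif std > max_std:
            flagged.append((index, step, "large spread"))
    return flagged


flagged = flag_suspicious(rows)
for item in flagged:
    print(item)
```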

Hi Fatemeh,

It is difficult to find the reason for this without the trajectory file. I assume the quality-check analyses (RMSD, fluctuations, hydrogen bonds, etc.) do not show any strange jumps, is that right? If so, could you please share the topology and trajectory files with me, perhaps through an online file-sharing service such as Google Drive? (I know they are large.)

Looking forward to your reply.

Regards,

-Adam-