Exploring the Limitations of biobb_dna in Handling Large DNA Sequences

Hello, can biobb_dna handle DNA of any length? My structure has 90 base pairs, but only 45 base pairs are considered during the calculation.

Hi Fatemeh,

Sorry for the delay in answering your question.

Yes, you should be able to use biobb_dna with structures having more than 45 base pairs. The trick is to set the 'sequence' property in the biobb_canal building block, which explicitly specifies the sequence of the structure being analysed. If the property is not set, the block tries to extract the sequence automatically from the output of the previous step (biobb_curves), and this works only for sequences with fewer than 45 base pairs, due to an internal trimming of the information written to the log files by Curves+.

So, could you please try launching the biobb_canal step with the 'sequence' property, as in this example:

from biobb_dna.curvesplus.biobb_canal import biobb_canal

canal_out = "canal.out.zip"

seq = "ACGTACGT"  # Placeholder: replace with the full sequence of your structure

prop = {
    'series' : True,
    'histo' : True,
    'sequence' : seq
}

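# curves_out_cda and curves_out_lis are the .cda and .lis output files
# of the previous biobb_curves step
biobb_canal(
    input_cda_file=curves_out_cda,
    input_lis_file=curves_out_lis,
    output_zip_path=canal_out,
    properties=prop
)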
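
As a quick sanity check before launching the step, something like the sketch below could confirm that the string passed as 'sequence' has the length you expect (90 in your case). Note that check_sequence is a hypothetical helper written for this post, not part of biobb_dna, and it assumes one letter per base pair and only the characters A, C, G and T.

```python
def check_sequence(seq: str, expected_bp: int) -> str:
    """Return the sequence upper-cased, or raise ValueError if it looks wrong.

    Hypothetical helper, not part of biobb_dna. Assumes one letter per
    base pair and a plain A/C/G/T alphabet.
    """
    cleaned = seq.strip().upper()
    invalid = sorted(set(cleaned) - set("ACGT"))
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {invalid}")
    if len(cleaned) != expected_bp:
        raise ValueError(
            f"sequence has {len(cleaned)} bases, expected {expected_bp}"
        )
    return cleaned


# Placeholder 90-letter sequence; use the real sequence of your structure.
seq = check_sequence("ACGTACGTAC" * 9, expected_bp=90)

prop = {
    'series': True,
    'histo': True,
    'sequence': seq
}
```

If the check raises an error, the string is probably truncated or contains stray characters, and it is worth fixing that before running biobb_canal.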

Hope it helps!

-Adam-