Bioinformatics tool accurately tracks synthetic DNA

IMAGE: The Rice University computer science lab of Todd Treangen challenged, and beat, deep learning in a test to see if a new bioinformatics approach effectively tracks the laboratory …

Credit: Tommy LaVergne / Rice University

HOUSTON – (February 26, 2021) – Tracking the origin of synthetic genetic code has never been simple, but it can be done through bioinformatic computational techniques or, increasingly, deep learning.

While the latter gets the lion’s share of attention, new research by computer scientist Todd Treangen of Rice University’s Brown School of Engineering focuses on whether sequence alignment and pan-genome-based methods can outperform deep learning approaches in this area.

“In a sense, this goes against the grain, because deep learning methods have recently outperformed traditional methods such as BLAST,” he said. “My aim with this study is to start a discussion about how to combine the expertise of the two fields to achieve further improvements for this important computational challenge.”

Treangen, who specializes in developing computational solutions for biosecurity and microbial forensics applications, and his team at Rice have introduced PlasmidHawk, a bioinformatics approach that analyzes DNA sequences to help identify the source of engineered plasmids of interest.

“We show that a sequence alignment-based approach can outperform a convolutional neural network (CNN) deep learning method for the specific task of lab-of-origin prediction,” he said.

The researchers, led by Treangen and lead author Qi Wang, a Rice graduate student, reported their findings in an open-access paper in Nature Communications.

The open-source software is available here: https://gitlab.com/treangenlab/plasmidhawk.

The program may be useful not only for tracking potentially harmful engineered sequences but also for protecting intellectual property.

“The goal is either to help protect the intellectual property rights of the contributors of the sequences or to help trace the origin of a synthetic sequence if something bad does happen,” Treangen said.

Treangen noted a recent high-profile paper describing a recurrent neural network (RNN) deep learning technique to trace a sequence’s lab of origin. That method achieved 70% accuracy in predicting the lab of origin. “Despite this significant progress over the previous deep learning approach, PlasmidHawk offers improved performance over both methods,” he said.

The Rice program directly aligns unknown strings of code from genome data sets and matches them to pan-genomic regions that are common or unique to synthetic biology research labs.

“To predict the lab of origin, PlasmidHawk scores each lab based on matching regions between an unclassified sequence and the plasmid pan-genome, and then assigns the unknown sequence to the lab with the lowest score,” Wang said.
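The idea Wang describes can be illustrated with a toy sketch in Python. It is not PlasmidHawk's code or its published scoring formula: the fragment-to-lab dictionary, the exact substring test standing in for sequence alignment, the function names and the 1/n weighting are all assumptions made for illustration. The sketch only mirrors the general pattern of scoring each lab from matched pan-genome fragments and picking the lab with the lowest score.

```python
from collections import defaultdict


def score_labs(query, pan_genome):
    """Score labs for a query sequence (toy example; lower is better).

    pan_genome is a hypothetical mapping of fragment string -> set of lab
    names whose deposited plasmids contain that fragment.
    """
    scores = defaultdict(float)
    for fragment, labs in pan_genome.items():
        # Exact substring search is a stand-in for real sequence alignment.
        if fragment in query:
            for lab in labs:
                # Assumed weighting: a fragment shared by fewer labs is
                # stronger evidence, so it lowers the score by more.
                scores[lab] -= 1.0 / len(labs)
    return dict(scores)


def predict_lab(query, pan_genome):
    """Return the lab with the lowest score, or None if nothing matched."""
    scores = score_labs(query, pan_genome)
    if not scores:
        return None
    return min(scores, key=scores.get)


# Made-up fragments and lab names, for illustration only.
pan_genome = {
    "ATGGCC": {"lab_a", "lab_b", "lab_c"},  # common fragment
    "TTGACA": {"lab_a"},                    # fragment unique to lab_a
    "GGATCC": {"lab_b", "lab_c"},
}
query = "CCATGGCCAATTGACAGG"
best_lab = predict_lab(query, pan_genome)
```

In this made-up example the query matches one common fragment and one fragment unique to lab_a, so lab_a ends up with the lowest score.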

In the new study, using the same data set as one of the deep learning experiments, the researchers reported successfully predicting the depositing labs of unknown sequences 76% of the time. In 85% of cases, the correct lab was among the top 10 candidates.
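The two figures correspond to what is commonly called top-1 and top-10 accuracy. A minimal Python sketch of how such a metric can be computed follows; the function name and the example rankings are hypothetical and are not taken from the study's data or code.

```python
def top_k_accuracy(ranked_predictions, true_labs, k):
    """Fraction of sequences whose true lab is among the first k candidates.

    ranked_predictions: one list of candidate labs per sequence, best first.
    true_labs: the actual depositing lab for each sequence, in the same order.
    """
    if not true_labs or len(ranked_predictions) != len(true_labs):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labs)
        if truth in ranked[:k]
    )
    return hits / len(true_labs)


# Made-up rankings for four sequences, for illustration only.
ranked = [["a", "b"], ["b", "a"], ["c", "a"], ["a", "c"]]
truth = ["a", "a", "b", "c"]
top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```

With k=1 only the first-ranked lab counts as a hit; with a larger k, a prediction counts whenever the true lab appears anywhere in the first k candidates.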

Unlike the deep learning approaches, the researchers said, PlasmidHawk requires less data preprocessing and does not need retraining when new sequences are added to an existing project. It also differs from the previous deep learning approaches by offering a detailed explanation for its lab-of-origin predictions.

“The goal is to fill your computational toolbox with as many tools as possible,” said co-author Ryan Leo Elworth, a postdoctoral researcher at Rice. “Ultimately, I believe the best results will come from a combination of machine learning, more traditional computational techniques and a deeper understanding of the specific biological problem you are dealing with.”

###

Rice graduate students Bryce Kille and Tian Rui Liu are co-authors of the paper. Treangen is an assistant professor of computer science.

The research was supported by the National Institutes of Health through the National Institute of Neurological Disorders and Stroke, the Office of the Director of National Intelligence and the Army Research Office. Addgene provided access to the DNA sequences of the deposited plasmids.

Read the abstract at http://dx.doi.org/10.1038/s41467-021-21180-w.

This press release is available online at https://news.rice.edu/2021/02/26/bioinformatics-tool-round-synthetic-path-dna/

Follow Rice News and Media Relations via Twitter @RiceUNews.

Related Materials:

Mitochondrial age pressure astronauts: http://news.rice.edu/2020/12/02/mitochondrial-stress-age-astronauts astronauts/

A flood of genome data is hampering efforts to identify bacteria: http://news.rice.edu/2018/10/30/flood-of-genome-data-block-attempts-to-id-bacteria-2/

Treangen Lab: https://sites.google.com/view/treangen/home

Rice Department of Computer Science: https://csweb.rice.edu

George R. Brown School of Engineering: https://engineering.rice.edu

Image Download:

https://news-network.rice.edu/news/files/2021/02/0221_PLASMID-1a-WEB.jpg

CAPTION: Todd Treangen. (Credit: Tommy LaVergne / Rice University)

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,978 undergraduates and 3,192 graduate students, Rice’s undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason Rice is ranked No. 1 for lots of race/class interaction and No. 1 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger’s Personal Finance.

Jeff Falk

713-348-6775

Mike Williams

713-348-6728

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of press releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
