HSE researchers use neural networks to analyze DNA

IMAGE: Maria Poptsova, Head of the Bioinformatics Laboratory (HSE Faculty of Computer Science)

Credit: Maria Poptsova

HSE scientists have proposed a way to improve the accuracy of detecting Z-DNA, that is, DNA segments twisted to the left rather than to the right. To do this, they used neural networks and a data set of more than 30,000 experiments performed by various laboratories around the world. Details of the study are published in Scientific Reports.

In the 67 years since the structure of DNA was discovered, scientists have found many structural variations of this molecule. Some DNA structures look nothing like the standard double helix, called B-DNA: they can differ from B-DNA in the number of strands (from two to four), the density and thickness of the strands, the way the nitrogenous bases bind to each other, and the direction in which the helix is twisted.

One of these structures, Z-DNA, is a double helix twisted the other way: to the left rather than to the right. Z-DNA segments are known to occur in the cells of various organisms (from bacteria to humans), to arise under certain conditions (for example, in supercoiled DNA or at high salt concentration), and to coexist with other DNA structures in a single molecule. For example, if for some reason a B-DNA molecule becomes overwound during transcription (the synthesis of RNA from a DNA template), some of its segments may twist the other way, relieving the excess ‘stress’. Scientists also suggest that Z-DNA can regulate transcription and increase the likelihood of mutations. Some research suggests that Z-DNA formation may be linked to certain diseases, such as cancer, diabetes, and Alzheimer’s. Recently, a growing number of studies have pointed to a role for Z-DNA in the innate immune response, the response to viruses and other pathogens that takes place within the cell itself.

To learn more about the formation conditions and biological role of Z-DNA segments, scientists need methods to find them in the genome. The first genome map annotating Z-DNA sites was compiled in 1997, based on experimental data on the structural properties of consecutive nucleotides. In recent years, methods have emerged in which the locations of structures alternative to B-DNA are predicted by computer algorithms. Advances in machine learning have made it possible to use another powerful tool for this task: neural networks. Unlike most approaches, neural networks can take many features into account and do not require scientists to select a few of the most likely ones in advance. But even for neural networks, Z-DNA is difficult to detect, because there is not enough experimental data: Z-DNA appears and disappears, and an experiment captures only a small fraction of these regions. The researchers decided to test whether the accuracy of neural networks improves when omics data are included, that is, information on how gene activity and protein synthesis are regulated in cells.

The scientists began by comparing how three types of neural networks (convolutional, recurrent, and a combination of the two) handle the task. Convolutional neural networks are typically used for image processing, while recurrent neural networks are typically used for analyzing texts. All three types had already been applied to problems in genome analysis. In total, the study authors trained and evaluated 151 models on DNA sequence data augmented with omics data. One of the recurrent neural networks, which the authors named DeepZ, showed the best results and was used to predict new Z-DNA segments in the human genome. Its accuracy is significantly higher than that of the existing algorithm, Z-Hunt.
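As a rough illustration of what "DNA data augmented by omics data" can look like as model input, the sketch below builds one feature vector per nucleotide: a one-hot encoding of the base followed by one value per omics track. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the track names, and the choice of one-hot encoding are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence

NUCLEOTIDES = "ACGT"


def one_hot(base: str) -> List[float]:
    """Return a 4-element one-hot vector; unknown bases (e.g. N) become all zeros."""
    return [1.0 if base == n else 0.0 for n in NUCLEOTIDES]


def encode_with_omics(
    sequence: str, omics_tracks: Dict[str, Sequence[float]]
) -> List[List[float]]:
    """Build one feature vector per nucleotide.

    Each row is the one-hot encoded base followed by the value of every
    omics track at that position. The track names are hypothetical.
    """
    track_names = sorted(omics_tracks)  # fixed order keeps columns stable
    for name in track_names:
        if len(omics_tracks[name]) != len(sequence):
            raise ValueError(
                f"track {name!r} has length {len(omics_tracks[name])}, "
                f"expected {len(sequence)}"
            )
    features = []
    for i, base in enumerate(sequence.upper()):
        row = one_hot(base)
        row.extend(float(omics_tracks[name][i]) for name in track_names)
        features.append(row)
    return features


if __name__ == "__main__":
    # Toy values only; real omics tracks would come from experimental data.
    rows = encode_with_omics(
        "CGCG",
        {"histone_mark": [0, 1, 1, 0], "tf_binding": [0.2, 0.9, 0.8, 0.1]},
    )
    for row in rows:
        print(row)
```

A sequence model such as a recurrent network would then read these rows in order, so that each position is described by both its base and its regulatory context.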

With the help of DeepZ, the scientists scanned the entire sequence of the human genome, determining for each nucleotide the probability that it lies within a Z-DNA region. A run of several consecutive nucleotides whose probability exceeded a certain threshold was identified as a potential Z-DNA site.
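The step of turning per-nucleotide probabilities into candidate regions can be sketched as follows. This is an illustrative sketch only: the function name, the threshold of 0.5, and the minimum run length of 2 are assumptions, since the article does not state the values the authors used.

```python
from typing import List, Sequence, Tuple


def call_regions(
    probs: Sequence[float], threshold: float = 0.5, min_length: int = 2
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of consecutive positions
    whose probability is above `threshold`.

    Runs shorter than `min_length` are discarded. Both parameter values
    are assumptions chosen for illustration.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start index of the run currently being extended
    for i, p in enumerate(probs):
        if p > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(probs) - start >= min_length:
        regions.append((start, len(probs)))
    return regions


if __name__ == "__main__":
    # Toy per-nucleotide probabilities, not real model output.
    toy_probs = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.6, 0.9, 0.95]
    print(call_regions(toy_probs))
```

With the toy input above, positions 1 to 2 and 6 to 8 form runs long enough to be reported, while the isolated high value at position 4 is dropped by the minimum-length filter.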

‘The results of this study are important because, with the help of neural networks, we were able not only to reproduce the experiments but also to predict potential sites of Z-DNA formation in the genome,’ said Maria Poptsova, head of the study and Head of the Bioinformatics Laboratory at the Faculty of Computer Science at HSE University. ‘The abundance of Z-DNA markers suggests that they are actively used to turn genes on and off. This is a faster signal than genomic motifs. For example, a study by a group of Australian scientists has shown that Z-DNA serves as a marker in learning to extinguish fear. Apparently, Z-DNA arose in evolution in cases where a rapid response to events was required. We plan to start joint projects with experimental groups to test the predictions.’

The authors demonstrated a new method of predicting Z-DNA segments using omics data and deep learning methods. The genome annotation generated by the neural network will help scientists design experiments to detect Z-DNA, whose full range of functions is only beginning to emerge.

###
