The 3Rs of the genome: Reading, writing, and regulation

IMAGE: Researchers have precisely mapped the binding sites of more than 400 proteins on the yeast genome using ChIP-exo. The method (above) uses an antibody to ‘fish out’ a specific DNA-linked protein …

Credit: Pugh Lab, Cornell and Mahony Lab, Penn State

An enormous effort to map the precise binding locations of more than 400 different kinds of proteins on the yeast genome has produced the most complete and high-resolution map of chromosome architecture and gene regulation to date. The study reveals two distinct gene regulatory architectures, expanding the traditional model of gene regulation. So-called constitutive genes, those that perform basic ‘housekeeping’ functions and are almost always active at low levels, need only a basic set of regulatory controls, whereas those that are activated by environmental signals, known as inducible genes, have a more specialized architecture. This finding in yeast could open the door to a better understanding of the regulatory architecture of the human genome.

A paper outlining the research by Penn State and Cornell University scientists appears March 10, 2021 in the journal Nature.

“When I first learned about DNA, I was taught to think of the genome as a library containing every book ever written,” said Matthew J. Rossi, research assistant professor at Penn State and the first author of the paper. “The genome is stored as part of a complex of DNA, RNA, and proteins, called ‘chromatin.’ The interactions of the proteins and the DNA control when and where genes are expressed to produce RNA (i.e. reading a book to learn or make something specific). But what I always wondered was, with all that complexity, how do you find the right book when you need it? That is the question we are trying to answer in this study.”

How a cell selects the right book depends on regulatory proteins and their interactions with DNA in chromatin, which can be referred to as the regulatory architecture of the genome. Yeast cells can respond to changes in their environment by altering this regulatory architecture to turn different genes on or off. In multicellular organisms, such as humans, the difference between muscle cells, neurons, and every other cell type is determined by the regulation of the set of genes those cells are expressing. Understanding the mechanisms that govern this differential gene expression is therefore essential for understanding responses to the environment, organismal development, and evolution.

“Proteins need to be recruited to and assembled at genes for them to be turned on,” said B. Franklin Pugh, professor of molecular biology and genetics at Cornell University, who led the research project, which began when he was a professor at Penn State. “We have put together the most complete and high-resolution map of these proteins, showing the locations where they bind to the yeast genome and revealing aspects of how they interact with each other to regulate gene expression.”

The team used a method called ChIP-exo, a high-resolution version of ChIP-seq, to map the binding sites of about 400 different proteins that interact with the yeast genome, some at a few locations and others at thousands of locations. In ChIP-exo, proteins are chemically cross-linked to the DNA inside living cells, thus locking them into position. The chromosomes are then removed from the cells and sheared into smaller pieces. Antibodies are used to capture specific proteins and the piece of DNA to which they are bound. The location of the protein-DNA interaction is then determined by sequencing the DNA attached to the protein and mapping the sequence back to the genome.

“In traditional ChIP-seq, the DNA fragments attached to the proteins are still relatively large and variable in length, extending anywhere from 100 to 500 base pairs beyond the actual protein binding site,” said William K.M. Lai, research professor at Cornell University and an author of the paper. “In ChIP-exo, we perform an additional step of trimming the DNA with an enzyme called an exonuclease. This removes any excess DNA that is not protected by the cross-linked protein, allowing us to find a much more precise location for the binding event and to better see interactions among the proteins.”
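The gain in resolution Lai describes can be illustrated with a toy model. The Python sketch below is not the team's pipeline: the footprint coordinates, the function names, and the assumption of a single protein protecting a fixed stretch of DNA are all hypothetical, and the exonuclease step is idealized. The only figure taken from the text is the 100 to 500 base pairs of excess DNA per fragment.

```python
import random

# Hypothetical footprint protected by the cross-linked protein (illustration only).
SITE_START, SITE_END = 10_000, 10_020


def shear_fragment(rng):
    """Return (start, end) of a sheared fragment that still contains the footprint.

    The total excess DNA is drawn from 100 to 500 bp, the range quoted in the
    text, and split at random between the two sides of the footprint.
    """
    excess = rng.randint(100, 500)
    left = rng.randint(0, excess)
    return SITE_START - left, SITE_END + (excess - left)


def exo_trim(fragment):
    """Idealized exonuclease step: digest unprotected DNA up to the footprint."""
    start, end = fragment
    return max(start, SITE_START), min(end, SITE_END)


def boundary_spread(fragments):
    """Width of the region over which the fragments' left edges are scattered."""
    starts = [start for start, _ in fragments]
    return max(starts) - min(starts)


rng = random.Random(0)
chip_seq = [shear_fragment(rng) for _ in range(1000)]  # untrimmed fragments
chip_exo = [exo_trim(f) for f in chip_seq]             # same fragments after trimming

print("ChIP-seq left-edge spread (bp):", boundary_spread(chip_seq))
print("ChIP-exo left-edge spread (bp):", boundary_spread(chip_exo))
```

In this idealized model every trimmed fragment ends exactly at the protected footprint, so the spread of fragment edges collapses to zero, while the untrimmed fragments scatter their edges over hundreds of base pairs. Real data are noisier than this sketch suggests.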

The team performed more than 1,200 individual ChIP-exo experiments, yielding billions of individual data points. Analysis of this massive data set drew on Penn State’s high-performance computing clusters and required the development of several new bioinformatic tools, including a multifaceted computational workflow designed to identify patterns and reveal the organization of regulatory proteins in the yeast genome.

The analysis, which is akin to picking out different types of features on the ground from hundreds of satellite images, revealed a surprisingly small number of unique protein assemblages that are reused across the yeast genome.

“The accuracy and completeness of the data allowed us to identify 21 protein assemblages and also to identify the absence of specific regulatory control signals at housekeeping genes,” said Shaun Mahony, assistant professor of biochemistry and molecular biology at Penn State and an author of the paper. “The computational methods we developed to analyze this data could serve as a starting point for further development for gene regulation studies in more complex organisms.”

The traditional model of gene regulation involves proteins called ‘transcription factors’ that bind to specific DNA sequences to control nearby gene expression. However, the researchers found that most genes in yeast do not adhere to this model.

“We were surprised to discover that housekeeping genes lacked a protein-DNA architecture that would allow specific transcription factors to bind, which is a hallmark of inducible genes,” Pugh said. “These genes just seem to need a general set of proteins that allow access to the DNA and its transcription without much need for regulation. Whether or not this pattern holds up in multicellular organisms such as humans is yet to be seen. It is a far more complex proposition, but just as the sequencing of the yeast genome preceded the sequencing of the human genome, I am sure we will eventually be able to see the regulatory architecture of the human genome at high resolution.”

###

In addition to Rossi, Pugh, Lai, and Mahony, the research team includes Prashant K. Kuntala, Naomi Yamada, Nitika Badjatia, Chitvan Mittal, Guray Kuzu, Kylie Bocklund, Nina P. Farrell, Thomas R. Blanda, Joshua D. Mairose, Ann V. Basting, Katelyn S. Mistretta, David J. Rocco, Emily S. Perkinson, and Gretta D. Kellogg.

This work was supported by the U.S. National Institutes of Health, the U.S. National Science Foundation, the Penn State Institute for Computational and Data Sciences, and Penn State’s Advanced CyberInfrastructure (ROAR).

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of press releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
