Development of new methods for designing RNA molecules that fold into desired spatial structures and their use for development of new functional RNAs and for prediction of noncoding RNAs in transcriptome sequences (2017/25/B/NZ2/01294); 1 494 250 PLN; 2018-2021. PI: J.M. Bujnicki, vice-PI: T. Wirecki

Ribonucleic acid (RNA) molecules are master regulators of cells. They are involved in a variety of molecular processes: they transmit genetic information, sense and communicate responses to cellular signals, and even catalyze chemical reactions. These functions depend on the ability of RNAs to assume one or more structures, an ability encoded by the ribonucleotide sequence. One of the fundamental challenges of biology and chemistry is to design molecules that form desired structures and carry out desired functions. The computational design of RNA requires solving the so-called RNA inverse folding problem: given a target structure, identify one or more sequences that fold into that structure (and do not fold into any other structure). RNA design remains challenging, especially for molecules with complex structures. In particular, methods for designing RNA 3D structures are scarce and severely restricted: they usually require a fixed RNA structural framework and only allow the RNA bases to change, while keeping the sequence length and the shape of the RNA chain fixed. In this project, we are developing a new software package for the computational design of RNA sequences that takes into account 3D structure, conformational changes, and the binding of RNA molecules to each other.
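To make the inverse folding problem concrete, the toy sketch below covers only its simplest necessary condition at the secondary-structure level: a candidate sequence must place pairable bases at every paired position of the target, written here in dot-bracket notation. It is not the algorithm used by DesiRNA or SimRNA-Design; all function names are hypothetical, and allowing G-U wobble pairs alongside Watson-Crick pairs is an assumption of this illustration. Compatibility alone does not guarantee that a sequence folds into the target, so a real design method would still have to score each candidate with a folding or simulation engine.

```python
import random

# Assumption for this sketch: Watson-Crick pairs plus G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """Map every paired position of a dot-bracket string to its partner."""
    stack, table = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unexpected ')'")
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return table


def is_compatible(sequence, structure):
    """Return True if every base pair of the target can be formed by the sequence."""
    if len(sequence) != len(structure):
        return False
    table = pair_table(structure)
    return all((sequence[i], sequence[j]) in PAIRS
               for i, j in table.items() if i < j)


def random_compatible_sequence(structure, rng):
    """Sample one sequence that satisfies the pairing constraints of the target."""
    table = pair_table(structure)
    seq = [""] * len(structure)
    allowed = sorted(PAIRS)  # sorted so that a seeded rng gives reproducible output
    for i in range(len(structure)):
        if i not in table:
            seq[i] = rng.choice("ACGU")       # unpaired position: any base
        elif i < table[i]:
            seq[i], seq[table[i]] = rng.choice(allowed)  # fill both partners at once
    return "".join(seq)


target = "((((...))))"  # a small hairpin: 4 base pairs closing a 3-nucleotide loop
candidate = random_compatible_sequence(target, random.Random(0))
```

A sequence sampled this way is only a starting point: the harder part of the problem, ensuring that the target is the preferred fold and that no competing structure is, requires an energy model or a folding simulation.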

We have developed two prototype methods for RNA sequence design: DesiRNA, for secondary-structure-based design, which allows the design of oligomers and alternative structures, and SimRNA-Design, for 3D-based design, which works by “mutating” the RNA sequence during 3D folding simulations. We are developing both methods further, and we plan to combine them into one package for designing RNAs that are composed of one or multiple strands and are capable of switching between different 3D structures (including changes of the global shape, patterns of canonical and non-canonical base pairs, and oligomeric states). The new program will also allow the sequence length to change through small insertions and deletions. Designing RNA molecules with such flexibility is entirely out of reach for existing programs.

The utility of the new computational method will be tested by experimental validation of the designed RNAs. First, selected designed RNA molecules will be synthesized, and their structure(s) will be analyzed. A combination of computational design, structural modeling, and experimental analyses will thereby lead to the development of new, artificial, functional RNAs. Second, the new method will be used to enrich alignments of naturally occurring RNAs (e.g., riboswitches, ribozymes) with artificial sequences, aiming to improve the methodology of remote homology detection, as was done earlier for protein sequence alignments. We will use the combination of natural and artificially designed sequences to improve the sequence profile and covariation information for known RNA families that have members of known 3D structure. These sequence profiles, extended by structure-based sequence design, will aid searches for previously unknown members of these RNA families in genomic sequence databases. Structural and functional predictions (e.g., new candidates for functional RNAs) will also be validated experimentally.

The project is carried out by an interdisciplinary team of researchers, including computer programmers, researchers specializing in computer simulations and data analysis, and biochemists who analyze RNA molecules experimentally.