MODOMICS is the first comprehensive database system for the biology of RNA modification. It integrates information about the chemical structures of modified nucleosides, their locations in RNA sequences, the pathways of their biosynthesis, and the enzymes that carry out the respective reactions (together with their protein cofactors). Also included are protein sequences, structural data (if available), selected references from the scientific literature, and links to other databases that provide comprehensive information about individual modified residues and the proteins involved in their biosynthesis.
The MODOMICS database contains the following types of items:
A collection of naturally occurring modified RNA nucleosides. For each modification the name, short name and one-letter abbreviation are provided.
Modifications can be browsed by the residue they originate from and by the chemical type of reaction. The originating choices include:
Detailed information for each modification includes:
Each modification is linked to the pathways section, to the reactions in which it is a substrate or product, and to the list of enzymes identified so far that catalyze its formation in various organisms.
*HPLC retention times (reversed-phase chromatography (C18) with acetonitrile/ammonium acetate as the mobile phase) are normalized to guanosine to account for different LC systems, gradients, column sizes, flow rates, etc. For the elution order, cytidine, uridine, guanosine, adenosine, and the late-eluting m6A were chosen as references. Absolute retention times in the referenced chromatogram are: C: 4.6 min, U: 6.3 min, G: 10.9 min, A: 15.6 min, m6A: 21.5 min. The product ions show the typical neutral loss(es) for the respective nucleoside.
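The normalization described above can be illustrated with the five reference values. This is a minimal sketch that assumes "normalized to guanosine" means dividing each absolute retention time by the retention time of guanosine; the function and variable names are hypothetical and are not part of MODOMICS.

```python
# Absolute retention times (in minutes) from the referenced chromatogram.
ABSOLUTE_RT_MIN = {"C": 4.6, "U": 6.3, "G": 10.9, "A": 15.6, "m6A": 21.5}


def normalize_to_guanosine(retention_times, reference="G"):
    """Divide every retention time by that of the reference nucleoside.

    Assumption: normalization is a simple ratio to the guanosine value.
    """
    ref = retention_times[reference]
    return {name: rt / ref for name, rt in retention_times.items()}


relative_rt = normalize_to_guanosine(ABSOLUTE_RT_MIN)
for name, value in relative_rt.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the relative retention times are approximately 0.42 (C), 0.58 (U), 1.00 (G), 1.43 (A), and 1.97 (m6A). The elution order is preserved, while the values no longer depend on the absolute time scale of a particular LC setup.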
Here, we present six pathway graphs showing how modifications emerge from the different unmodified residues in precursor RNA (as defined above for the modified nucleoside residues). Placing the mouse cursor over a modification’s short name displays its chemical structure. Arrows connecting two modifications are colored according to the chemical type of the reaction. Dashed arrows indicate putative reactions. All arrows are clickable and link to reaction-specific web pages. From these reaction pages, users can access information about specific modifications and enzymes in the corresponding sections of MODOMICS.
The graphs are interactive. It is possible to zoom and move the whole graph, as well as to change the graph layout. Graphs can be downloaded as image, PDF, or XML files.
A list of experimentally validated and predicted modification reactions. It can be filtered according to the originating base and the chemical type of reaction (see above under Modifications). Detailed information on each reaction comprises the enzymes that have been experimentally proven to catalyze it, the chemical structures of the substrate(s) and product(s), information about cofactors, and other information in free-text format. Putative enzymes are not indicated. Reactions or pathways existing in a particular organism can also be accessed from the Protein entries in MODOMICS.
A collection of tRNA, rRNA, snRNA and snoRNA sequences that are known to be modified at multiple positions. For families of homologous RNAs, multiple sequence alignments adapted from RFAM, the Comparative RNA Website, and Transfer RNA databases are available. Modifications within sequences are shown in blue and marked with one-letter abbreviations (their meaning is given in the column RNA Mods abbrev. under the Modifications entry). Clicking a modified base within a sequence opens the corresponding page in the Modified nucleotides section of MODOMICS, from which the modification pathway can be accessed. The gene encoding the enzyme responsible for the formation of the selected modified nucleotide may be unknown; the list of enzymes given in the table contains all known enzymes catalyzing the modification in RNA, but not necessarily the one acting at the position selected in the RNA sequence.
Uppercase and lowercase letters in rRNA sequences indicate regions of greater and lesser confidence in the alignment, respectively (according to the Comparative RNA Website database).
The sequences can be downloaded in text format (“Download as ASCII” option) or displayed in the Jalview applet.
The “Draw modification profile” option (available for rRNA and tRNA sequences) maps the modified positions onto secondary structure diagrams of RNA molecules. The mapping is based on the sequence alignments. For rRNAs, reference structures of the E. coli SSU and LSU rRNAs are used, while for tRNAs a consensus secondary structure diagram is used. It is also possible to map information from a user-selected set of sequences available in MODOMICS onto the diagram. In that case, the percentage of modified ribonucleosides of any type at each alignment position is calculated and displayed. The resulting diagrams can be downloaded as image files.
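The per-position percentage described above could be computed along the following lines. This is only a sketch under stated assumptions, not the MODOMICS implementation: unmodified residues are assumed to be written as A, C, G, or U, gaps as "-", and any other symbol is counted as a modified ribonucleoside. The percentage is taken over the non-gap residues in each column, which is also an assumption. The symbols "P" and "D" in the toy alignment are placeholders for one-letter modification codes.

```python
UNMODIFIED = set("ACGU")  # assumption: symbols used for unmodified residues
GAP = "-"                 # assumption: gap symbol used in the alignment


def percent_modified_per_column(aligned_sequences):
    """Return, for each alignment column, the percentage of modified residues.

    Gaps are ignored; a column consisting only of gaps yields 0.0.
    """
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    percentages = []
    for column in zip(*aligned_sequences):
        residues = [symbol for symbol in column if symbol != GAP]
        if not residues:
            percentages.append(0.0)
            continue
        modified = sum(1 for symbol in residues if symbol not in UNMODIFIED)
        percentages.append(100.0 * modified / len(residues))
    return percentages


# Toy alignment of four sequences (hypothetical data).
toy_alignment = ["GCPA", "GCUA", "G-PD", "GCPA"]
profile = percent_modified_per_column(toy_alignment)
print(profile)  # [0.0, 0.0, 75.0, 25.0]
```

Real rRNA alignments also use lowercase letters for lower-confidence regions, so a practical version would need the actual symbol table from the Modifications section to distinguish those letters from modification codes.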
A collection of proteins involved in RNA modification processes. It contains both functional enzymes and protein cofactors necessary for multi-protein enzymatic activities. The Proteins table can be filtered by species and enzyme type (methyltransferase, pseudouridine synthase, etc.) or by the organism in which a given protein has been identified. Users can choose which table columns are displayed. The choices include:
At the individual protein level, the following detailed information is given:
A census of human and yeast snoRNAs involved in RNA-guided RNA modification by the C/D box and H/ACA box ribonucleoproteins, linked to the corresponding modification sites in human and yeast RNAs. The list of Guide RNAs can be browsed by organism and/or by the type of modification found at the target position. Information included for each snoRNA, if available:
A catalogue of “building blocks” for the chemical synthesis of naturally occurring modified nucleosides. The collected data includes chemical structures of precursors used for the synthesis and relevant literature references. Each building block is characterized by the IUPAC name and CAS number. A list of relevant publications is also presented.
There are three ways to search for information in MODOMICS:
Hits and query-hit alignments from searches against the MODOMICS protein or nucleic acid sequence collections can be downloaded in FASTA format.
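A downloaded FASTA file can be read with a few lines of code. The sketch below is a generic FASTA reader, not a MODOMICS tool; the header lines and sequences in the example are invented for illustration, and the exact header layout of MODOMICS downloads is not assumed.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples.

    A record starts with a line beginning with '>'; the following lines,
    up to the next '>' line, are concatenated into one sequence string.
    """
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical example content; real downloads will have different headers.
example = """>hit_1 example protein
MKTAYIAK
QRQISFVK
>hit_2 example RNA
GCGGAUUUAGCUC
"""

hits = parse_fasta(example)
for header, sequence in hits:
    print(header, len(sequence))
```

Returning a list of tuples rather than a dictionary keeps the original order of the hits and tolerates repeated headers.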