Bujnicki lab - Home
  • RNA has recently emerged as an attractive target for new drug development. Our team is developing new methods to study the interactions between RNA and ligands. Recently, we have developed a new machine learning method called AnnapuRNA to predict how small chemical molecules interact with structured RNA molecules. Research published in PLoS Comput Biol. 2021 Feb 1;17(2):e1008309. doi: 10.1371/journal.pcbi.1008309.

About Laboratory Of Bioinformatics And Protein Engineering

Our group is involved in theoretical and experimental research on nucleic acids and proteins. The current focus is on RNA sequence-structure-function relationships (in particular 3D modeling), RNA-protein complexes, and enzymes acting on RNA.
 
We study the rules that govern the sequence-structure-function relationships in proteins and nucleic acids and use the acquired knowledge to predict structures and functions for uncharacterized gene products, to alter the known structures and functions of proteins and RNAs and to engineer molecules with new properties.
 
Our key strength is the integration of various types of theoretical and experimental analyses. We develop and use computer programs for modeling protein three-dimensional structures based on heterogeneous, low-resolution, noisy and ambiguous experimental data. We are also involved in genome-scale phylogenetic analyses, with a focus on identifying proteins that belong to particular families. Subsequently, we experimentally characterize the function of the most interesting new genes/proteins identified by bioinformatics. We also use theoretical predictions to guide protein engineering, using rational and random approaches. Our ultimate goal is to identify complete sets of enzymes involved in particular metabolic pathways (e.g. RNA modification, DNA repair) and to design proteins with new properties, in particular enzymes with new useful functions that have not been observed in nature.
 
We are well equipped for both theoretical and experimental analyses. Our lab offers an excellent environment for training young researchers in both bioinformatics and the molecular biology/biochemistry of protein-nucleic acid interactions.


More Good Science

Statistical potentials for RNA-Protein docking
 

Irina Tuszynska, Janusz M. Bujnicki
 
Docking methods are widely used to predict complexes of macromolecular structures. The biggest challenge in docking is to find near-native structures among a set of alternative structures. Statistical potentials are used in methods for predicting molecular interactions. We provide two knowledge-based potentials for discriminating near-native protein-RNA decoys: DARS-RNP and QUASI-RNP. The QUASI-RNP potential uses the quasi-chemical method to describe the reference state, while the DARS-RNP potential uses DARS (Decoys As Reference State).
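The idea of a quasi-chemical reference state can be illustrated with a minimal Python sketch. It is not the QUASI-RNP parametrization: the function name, the input format (a list of amino acid/nucleotide contact pairs) and the absence of any distance binning are assumptions made for illustration only. The expected number of contacts of each pair type is taken as the product of the overall frequencies of the two partners, and the energy-like score is the negative logarithm of the observed-to-expected ratio.

```python
import math
from collections import Counter


def quasichemical_scores(contacts):
    """Return {(amino_acid, nucleotide): score} from observed contact pairs.

    contacts: iterable of (amino_acid, nucleotide) tuples, one per observed
    contact. This input format is a hypothetical choice for the sketch.
    Score = -ln(N_observed / N_expected), where N_expected assumes that
    contact partners pair at random according to their overall frequencies.
    """
    contacts = list(contacts)
    total = len(contacts)
    pair_counts = Counter(contacts)
    aa_counts = Counter(aa for aa, _ in contacts)
    nt_counts = Counter(nt for _, nt in contacts)

    scores = {}
    for (aa, nt), observed in pair_counts.items():
        # Random-mixing expectation: total * fraction(aa) * fraction(nt)
        expected = total * (aa_counts[aa] / total) * (nt_counts[nt] / total)
        scores[(aa, nt)] = -math.log(observed / expected)
    return scores
```

Pairs observed more often than random mixing predicts receive negative (favorable) scores; pairs never observed are simply absent from the result in this sketch.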
 
Usage

Our knowledge-based potentials may be used on any Linux operating system. Simply download the DARS-RNP or QUASI-RNP potential to your computer. Minimum requirements: Python version >= 2.6, the BioPython library version >= 1.45, and the Numeric library.
 
Example:
Discrimination of native-like structures from unbound docking decoys of signal recognition particle (SRP) with RNA.
 
Unbound docking decoys were obtained by low-resolution docking with GRAMM of a signal recognition particle (SRP) (PDB code 1LNG, a complex with RNA) to RNA (PDB code 1Z43, apo form). As a reference we used the native structure of this complex (PDB code 1LNG). We produced 10000 decoys with a grid step of 3.0 and repulsion and attraction parameters of 15 and 0, respectively. The grid-step radius was used as a projection of an atom. The systematic search through the rotational coordinates was performed every 15 degrees.
 
We used the QUASI-RNP potential to assess the decoys and clustered the 100 best-scored structures.
 
python QUASI_potential_3.py -f list_10000.txt > QUASI_potential_3.out
 
Next, we made a scatter plot of energy (from the output file) versus RMSD (rmsd_file) (Figure 1). The largest cluster consists of native-like structures (Figure 2).
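The ranking and plotting step can be sketched in Python as follows. This is only an illustration: the formats of the potential's output file and of rmsd_file are not described here, so the sketch starts from two hypothetical dictionaries that map decoy names to energy and RMSD values, and it assumes that a lower energy means a better score. The plotting helper requires matplotlib, which is not among the requirements listed above.

```python
def rank_decoys(energies, rmsds, top_n=100):
    """Return the top_n best-scored decoys as (name, energy, rmsd) tuples.

    energies, rmsds: dicts mapping decoy name -> value (hypothetical input
    format). Assumes lower energy = better score. Decoys missing from
    either dict are skipped.
    """
    shared = sorted(set(energies) & set(rmsds))
    rows = [(name, energies[name], rmsds[name]) for name in shared]
    rows.sort(key=lambda row: row[1])  # ascending energy
    return rows[:top_n]


def plot_energy_vs_rmsd(rows, out_path="energy_rmsd.png"):
    """Save an energy-versus-RMSD scatter plot for (name, energy, rmsd) rows."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    plt.scatter([row[2] for row in rows], [row[1] for row in rows], s=4)
    plt.xlabel("RMSD to native structure (Å)")
    plt.ylabel("Energy (potential score)")
    plt.savefig(out_path, dpi=150)
    plt.close()
```

Calling rank_decoys with top_n=100 yields the set of structures that would then be passed to clustering.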

 

Figure 1. Energy-RMSD dependence for 10000 decoys generated by the GRAMM program for the complex of SRP with RNA. The cluster is colored red.


Figure 2. The native structure of the SRP complex (shades of blue: dark blue - RNA, light blue - protein) and the best-scored structure from the biggest cluster (shades of red: red - RNA, salmon pink - protein). RNA molecules are superimposed.

Statistical geometry algorithm implementation in Python
 
Tomasz Puton, Sandra Smit, Kristian Rother, Jaap Heringa & Janusz M. Bujnicki
 
A Python implementation of the statistical geometry in sequence space algorithm (binary and quaternary).
 
It is mainly applied in biology and sequence analysis in the context of evolution, e.g. for evaluating evolutionary models.
 
The algorithm checks the divergence of a given sequence alignment. It lets you test whether your sequences (RNA, DNA, protein) follow a tree-like or a bundle-like pattern of divergence. This is the main capability of the library. Performing this test is important for deciding whether a tree can be built for a set of sequences (if they follow bundle-like divergence, building a tree does not make sense at all). It also lets you check how strongly various positions in an alignment of many related sequences are randomized, and therefore conclude which positions are constrained in the process of evolution. This can be done by splitting the sequence alignment positions into two separate sequence alignments and then measuring the divergence within each group.
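The core bookkeeping behind the binary variant can be sketched as follows. This is a simplified illustration, not the API of the stat_geo library: the function names are hypothetical, and the sketch assumes gap-free, aligned nucleotide sequences mapped to a purine/pyrimidine alphabet. For a quartet of sequences, every alignment column is either constant, separates one sequence from the other three, or splits the quartet two against two. In a tree-like quartet one of the three two-against-two classes is expected to dominate, whereas in a bundle-like quartet all three remain comparably small.

```python
PURINES = frozenset("AG")


def to_binary(sequence):
    """Map a gap-free nucleotide sequence to 0 (purine) / 1 (pyrimidine)."""
    return [0 if base in PURINES else 1 for base in sequence.upper()]


def quartet_classes(seq_a, seq_b, seq_c, seq_d):
    """Count alignment columns of a four-sequence quartet by split pattern."""
    counts = {"constant": 0, "A": 0, "B": 0, "C": 0, "D": 0,
              "AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    rows = [to_binary(s) for s in (seq_a, seq_b, seq_c, seq_d)]
    for column in zip(*rows):
        ones = sum(column)
        if ones in (0, 4):
            counts["constant"] += 1
        elif ones in (1, 3):
            # Exactly one sequence differs from the other three.
            odd_value = 1 if ones == 1 else 0
            counts["ABCD"[column.index(odd_value)]] += 1
        elif column[0] == column[1]:
            counts["AB|CD"] += 1
        elif column[0] == column[2]:
            counts["AC|BD"] += 1
        else:
            counts["AD|BC"] += 1
    return counts
```

A full analysis would average such counts over many or all quartets in the alignment; that step is omitted here.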
 
The original description of the statistical geometry algorithm in sequence space can be found in the paper:
http://www.ncbi.nlm.nih.gov/pubmed/3413065
 
And an example analysis here:
http://www.ncbi.nlm.nih.gov/pubmed/2497522
 
Another very good starting point for understanding the algorithm is the Biophysical Chemistry paper by Kay Nieselt-Struwe, which reviews statistical geometry in sequence space and all its variants:
http://www.ncbi.nlm.nih.gov/pubmed/9362556
 
Download
Download source code - stat_geo_1.0.zip 

FiltRest3D - A Standalone Program

Michał J. Gajda, Marta Kaczor, Irina Tuszynska, Anastasia Yu. Bakulina, and Janusz M. Bujnicki
 
 
What is FiltRest3D?
 
 
Automatic methods for protein structure prediction (fold-recognition, de novo folding, and docking programs) produce large sets of alternative models. These large model sets often include many native-like structures, which are scored as high as false positives. Such native-like models can be more easily identified based on data from experimental analyses used as structural restraints (e.g. identification of nearby residues by crosslinking, chemical modification, site-directed mutagenesis, deuterium exchange coupled with mass spectrometry etc.). We present a simple server for scoring and ranking of models according to their agreement with user-defined restraints.
 
 
How to use it?
 
 
The program may be used through a web server, or downloaded and installed locally on any Linux* system, which is standard in most bioinformatics labs.
The source code is licensed under the General Public License. The downloadable archive contains the license, the program code in Python, examples in the example/ directory, and helper scripts in utils/. The minimum requirements are Python version >= 2.3 and the BioPython library version >= 1.41. Some functionality of the program may require the installation of downloadable bioinformatics software, such as Stride and DSSP; details are provided in the installation instructions.
 
* The software should work correctly on any Unix system, including Mac OS X, but the authors have not had an opportunity to test it on any other platform.
 
Examples of methods for model-building with the use of spatial restraints 

 
MolProbity is a web server recommended as a complementary resource for testing models for the presence of high-resolution features.
 
Example: discrimination of native-like complexes from low-resolution docking decoys of pseudouridine synthase TruA from Thermus thermophilus.


The example set of low-resolution docking decoys was obtained by docking with GRAMM {Vakser 1995} of a TruA enzyme structure (PDB code 1VS3, apo form) to its tRNA substrate (PDB code 2V0G, a complex with an unrelated protein). As a reference we used the native structure of this complex (PDB code 2NR0) (Figure 1). We produced 30000 decoys with a grid step of 3.5 and a repulsion parameter of 20. The grid-step radius was used as a projection of an atom. The systematic search through the rotational coordinates was performed every 10 degrees.

To discriminate native-like complexes with FILTREST3D, we used five distance restraints to prepare the restraints file. First, we chose protein residues R50 and N52, known to be involved in catalysis of the isomerization of U39 in tRNA, and introduced two specific amino acid-nucleotide restraints. Second, we identified additional putative RNA-binding residues (R23, H119, R162) that were both predicted as RNA-binding by the PPRINT webserver (Figure 2A) and located in regions of positive electrostatic potential (Figure 2B), as calculated with the APBS tool for electrostatic calculations within the PyMol program {DeLano 2002}. For these three residues we defined a general restraint for interactions with any nucleotide of the whole tRNA molecule.
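The logic of the two kinds of restraints (a specific amino acid-nucleotide pair versus an amino acid and any nucleotide) can be sketched in Python. This is not FILTREST3D code or its restraints file syntax: the function names, the data layout (one representative coordinate per residue) and the 10 Å cutoff are hypothetical choices made only for illustration.

```python
import math


def restraint_satisfied(coords, residue, nucleotides, cutoff):
    """True if `residue` lies within `cutoff` Å of any listed nucleotide.

    coords: dict mapping a residue label to one (x, y, z) coordinate,
    e.g. a single representative atom per residue (hypothetical layout).
    """
    return any(
        math.dist(coords[residue], coords[nt]) <= cutoff for nt in nucleotides
    )


def count_satisfied(coords, restraints, cutoff=10.0):
    """Count how many restraints one decoy satisfies.

    restraints: list of (residue, nucleotides) pairs. A specific restraint
    lists one nucleotide; a general restraint lists every nucleotide of the
    RNA. The 10 Å default is an arbitrary value for this sketch.
    """
    return sum(
        restraint_satisfied(coords, residue, nucleotides, cutoff)
        for residue, nucleotides in restraints
    )
```

A decoy that satisfies all restraints returns a count equal to the number of restraints; ranking decoys by this count gives a simple ordering by agreement with the restraints.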


The filtrest program was then run:


Filtrest -r restr.txt -d list.txt -o filtrest.out


The filtering took 500 minutes with the standalone version of the program running on a Linux workstation with a 3.06 GHz processor. In the output file, two decoys satisfy the restraints completely. Their RMSD to the native structure is 22.36 Å and 26.56 Å (Figure 3). They are similar to each other and can be considered native-like.


Figure 1: The native structure of pseudouridine synthase TruA in complex with leucyl tRNA, which was used as a reference.

 

 

Figure 2: Analysis of the protein surface of TruA. A – Regions that could interact with RNA according to the PPRINT server are colored yellow. B – Electrostatic map of the protein surface. Positively charged regions are blue, while negatively charged regions are red.


 

Figure 3: A, B – Each of the two best-scored decoys found by Filtrest3D, superimposed on the reference structure of the complex. Only the proteins are superimposed. The reference structure is colored in blue tones, while the decoys are colored in red tones. All atoms used to define restraints are shown in VDW representation. Cα atoms of the marked amino acids are colored white, while O3’ atoms of nucleic acids are yellow. C – The two best-scored decoys. The proteins are superimposed.



Restraints file syntax
 
A detailed manual for the restraints file syntax is here. Online help is available for both the web server and the command-line interface.

 

Model of MiaE.
 
MiaE is a hydroxylase responsible for introducing a posttranscriptional modification at position 37 in tRNA. Hydroxylation of the i6 group produces the hypermodified nucleoside N6-(cis-4-hydroxyisopentenyl)-2-thiomethyladenosine (ms2io6A, also called 2-methylthio-cis-ribozeatin), and this process depends on the presence of molecular oxygen (O2). For MiaE, we confidently predict that it shares the three-dimensional fold with the ferritin-like four-helix bundle proteins and that it has a similar active site and mechanism of action to diiron carboxylate enzymes, in particular methane monooxygenase (EC 1.14.13.25), which catalyses the biological hydroxylation of alkanes. The crystal structure of PpMiaE (Pseudomonas putida) was published [2], giving us the possibility of directly comparing our model of StMiaE (Salmonella typhimurium) [1] with the native structure. Our modeling proved very successful: it predicted the correct protein topology and revealed the structure of regions that were not present in the crystal structure of the native protein.

References: 
1. Kaminska KH, Baraniak U, Boniecki M, Nowaczyk K, Czerwoniec A, Bujnicki JM. Structural bioinformatics analysis of enzymes involved in the biosynthesis pathway of the hypermodified nucleoside ms(2)io(6)A37 in tRNA. Proteins. 2008 Jan 1;70(1):1-18. 
2. Joint Center for Structural Genomics (JCSG). Crystal structure of putative tRNA-(ms(2)io(6)a)-hydroxylase (NP_744337.1) from Pseudomonas putida KT2440 at 2.05 A resolution. To be published.

Download structures:
StMiaE model
Native structure of PpMiaE - 2ITB

Read our manuscript:
Download PDF 

Gallery:


Comparison of the structures of our model and the native structure (PDB code: 2ITB). Structures are colored according to the sequence index (N-terminus - blue, C-terminus - red). The model is of very good quality in the regions of the catalytic core.


Regions with corresponding secondary structure elements are colored in the same way. Regions whose structure was not present in the native protein are colored dark gray.


Correctly predicted and functionally important residues of StMiaE. The diiron cluster in StMiaE is shown as gold spheres. Catalytic residues (E59, E137, H140, E190, E219 and H222) are colored in red.