Bujnicki lab - Statistical geometry algorithm implementation in Python
  • RNA has recently emerged as an attractive target for new drug development. Our team is developing new methods to study the interactions between RNA and ligands. Recently, we have developed a new machine learning method called AnnapuRNA to predict how small chemical molecules interact with structured RNA molecules. The research was published in PLoS Comput Biol. 2021 Feb 1;17(2):e1008309. doi: 10.1371/journal.pcbi.1008309.

About Laboratory Of Bioinformatics And Protein Engineering

Our group is involved in theoretical and experimental research on nucleic acids and proteins. The current focus is on RNA sequence-structure-function relationships (in particular 3D modeling), RNA-protein complexes, and enzymes acting on RNA.
 
We study the rules that govern the sequence-structure-function relationships in proteins and nucleic acids and use the acquired knowledge to predict structures and functions for uncharacterized gene products, to alter the known structures and functions of proteins and RNAs and to engineer molecules with new properties.
 
Our key strength is in the integration of various types of theoretical and experimental analyses. We develop and use computer programs for modeling protein three-dimensional structures based on heterogeneous, low-resolution, noisy and ambiguous experimental data. We are also involved in genome-scale phylogenetic analyses, with a focus on identifying proteins that belong to particular families. Subsequently, we experimentally characterize the function of the most interesting new genes/proteins identified by bioinformatics. We also use theoretical predictions to guide protein engineering, using rational and random approaches. Our ultimate goal is to identify complete sets of enzymes involved in particular metabolic pathways (e.g. RNA modification, DNA repair) and to design proteins with new properties, in particular enzymes with new useful functions that have not been observed in nature.
 
We are well equipped for both theoretical and experimental analyses. Our lab offers an excellent environment for training young researchers in both bioinformatics and the molecular biology/biochemistry of protein-nucleic acid interactions.


More Good Science

Statistical geometry algorithm implementation in Python
 
Tomasz Puton, Sandra Smit, Kristian Rother, Jaap Heringa & Janusz M. Bujnicki
 
A Python implementation of the statistical geometry algorithm in sequence space (binary and quaternary).
 
Statistical geometry is mainly applied in biology, in sequence analysis in the context of evolution, e.g. for evaluating evolutionary models.
 
The main capability of the library is checking the divergence of a given sequence alignment: it tells you whether your sequences (RNA, DNA, protein) follow a tree-like or a bundle-like pattern of divergence. This test is worth performing before building a tree for a set of sequences, because a tree makes no sense if the sequences follow bundle-like divergence. The algorithm also shows how strongly various positions in an alignment of many related sequences are randomized, and therefore which positions are constrained in the process of evolution. This can be done by splitting the sequence alignment positions into two separate sequence alignments and then measuring the divergence within each group.
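 
The idea of comparing quartets of sequences can be illustrated with a minimal sketch. It is not the code shipped in stat_geo_1.0.zip and does not use its API: the function names (to_binary, quartet_box, mean_box, split_columns) are hypothetical, and the sketch covers only the binary (purine/pyrimidine) case. For every quartet of sequences it counts the alignment columns that split the four sequences two against two. There are three possible pairings, and the three counts, sorted from largest to smallest, act as the dimensions of a "box". One clearly dominant dimension is what a tree-like quartet looks like, while three comparable dimensions mean that no single tree is supported. The sketch ignores columns where only one sequence differs and performs no statistical test.

```python
from collections import Counter
from itertools import combinations

PURINES = set("AG")
PYRIMIDINES = set("CUT")


def to_binary(seq):
    """Recode nucleotides as R (purine) or Y (pyrimidine); anything else becomes '-'."""
    out = []
    for ch in seq.upper():
        if ch in PURINES:
            out.append("R")
        elif ch in PYRIMIDINES:
            out.append("Y")
        else:
            out.append("-")
    return "".join(out)


def quartet_box(a, b, c, d):
    """Count the 2:2 split columns of one quartet of binary sequences.

    Returns the counts for the three pairings (ab|cd, ac|bd, ad|bc),
    sorted from largest to smallest.
    """
    counts = Counter()
    for w, x, y, z in zip(a, b, c, d):
        if "-" in (w, x, y, z):
            continue  # skip gaps and ambiguous symbols
        if w == x and y == z and w != y:
            counts["ab|cd"] += 1
        elif w == y and x == z and w != x:
            counts["ac|bd"] += 1
        elif w == z and x == y and w != x:
            counts["ad|bc"] += 1
    return sorted((counts["ab|cd"], counts["ac|bd"], counts["ad|bc"]), reverse=True)


def mean_box(alignment):
    """Average the sorted box dimensions over all quartets in the alignment."""
    seqs = [to_binary(s) for s in alignment]
    if len(seqs) < 4:
        raise ValueError("at least four sequences are needed to form a quartet")
    totals = [0, 0, 0]
    n_quartets = 0
    for quartet in combinations(seqs, 4):
        box = quartet_box(*quartet)
        totals = [t + b for t, b in zip(totals, box)]
        n_quartets += 1
    return [t / n_quartets for t in totals]


def split_columns(alignment, positions):
    """Split an alignment into the chosen columns and all remaining columns."""
    chosen = set(positions)
    inside = ["".join(ch for i, ch in enumerate(s) if i in chosen) for s in alignment]
    outside = ["".join(ch for i, ch in enumerate(s) if i not in chosen) for s in alignment]
    return inside, outside


if __name__ == "__main__":
    toy_alignment = ["AAAAAA", "AAAAAC", "CCCCAA", "CCCCCA"]
    print(mean_box(toy_alignment))
    first_half, second_half = split_columns(toy_alignment, range(3))
    print(mean_box(first_half), mean_box(second_half))
```

To compare two groups of positions, split_columns can be applied first and mean_box then run on each of the two resulting alignments, which mirrors the position-splitting analysis described above.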
 
The original description of the statistical geometry algorithm in sequence space can be found in the paper:
http://www.ncbi.nlm.nih.gov/pubmed/3413065
 
And an example analysis here:
http://www.ncbi.nlm.nih.gov/pubmed/2497522
 
Another very good starting point for understanding the algorithm is the Biophysical Chemistry paper by Kay Nieselt-Struwe, which reviews statistical geometry in sequence space and all its variants:
http://www.ncbi.nlm.nih.gov/pubmed/9362556
 
Download
Download source code - stat_geo_1.0.zip