Table of Contents
- 1. General Questions
- 1.1 What is PyRy3D?
- 1.2 How to cite PyRy3D?
- 1.3 What is PyRy3D license?
- 1.4 What system does PyRy3D work on?
- 1.5 How long does it take to run simulation?
- 1.6 What language is PyRy3D written in?
- 1.7 How to get help with the program?
- 1.8 What command-line options are available?
- 1.9 How to use PyRy3D commands interactively?
- 1.10 How to write an input script?
- 2. Preparation of input files for PyRy3D
- 2.1 What are the input data needed?
- 2.2 How to prepare the set of structures?
- 2.3 How to prepare the sequence file?
- 2.4 Which density map formats are supported?
- 2.5 Which SAXS data formats are supported?
- 2.6 What is the restraints format?
- 2.7 What kind of restraints are supported?
- 2.8 How to prepare configuration file with simulation parameters?
- 2.9 Is it possible to prepare input files automatically?
- 2.10 Is it possible to incorporate secondary structure restraints into analysis?
- 3. Quick start
- 3.1 How to run simple simulation?
- 3.2 How to change simulation parameters?
- 3.3 What is trafl file and how to save and analyse trajectory from simulation?
- 3.4 How to investigate simulation results?
- 3.5 What kind of simulation algorithms can I use?
- 3.6 What kind of reduction methods are available?
- 3.7 What kind of structure representations for components can I provide as input files?
- 3.8 What kind of structure representations for components are available?
- 4. Analysing macromolecular complexes
- 5. Advanced options of PyRy3D
- 6. Tips & Tricks
1. General Questions
1.1 What is PyRy3D?
PyRy3D is a software tool for modeling the structures of large macromolecular complexes. Using Monte Carlo simulation and distance restraints derived from experiments, PyRy3D samples conformational space to identify the best fit of the component structures into a density map of the whole complex.
1.2 How to cite PyRy3D?
PyRy3D is not published yet. If you have used our program please refer to the program website: http://genesilico.pl/pyry3d.
1.3 What is PyRy3D license?
PyRy3D is free of charge for academic and commercial users. The code is published under the conditions of the GPL license.
1.4 What system does PyRy3D work on?
PyRy3D has been tested and run on both Linux and Windows. Both require Python and BioPython to be installed. The program has not been tested on Mac.
1.5 How long does it take to run a single simulation?
Preparing input files for modeling involves manual work and can be time-consuming, since the user has to prepare a file with sequences, an archive with pdb structures, a file with distance restraints and a configuration file with simulation parameters. This process can be accelerated by using the PyRy3D GUI (also known as PyRy3D Chimera Extension). Once everything is prepared, the simulation can be started. The time of a single simulation depends strongly on the complex size and the number of simulation steps to perform. However, even for large systems (composed of tens of components) 100 000 simulation steps take no more than a day on an average workstation.
1.6 What language is PyRy3D written in?
1.7 How to get help with the program?
1.8 What command-line options are available?
To run PyRy3D from the console, follow the pattern shown in this example:
python pyry3d.py <option1> <argument1> <option2> <argument2> ...
|Option||Argument||Argument format||Description|
|-h||none||help||Prints available options.|
|-s||.fasta file||multi FASTA file||Reads file with sequences.|
|-d||.tar archive with .pdb files||.tar archive with pdb files||Reads all structures from pdb files.|
|-r||Restraints file||text file in filtrest3D format||Parses restraints provided by the user.|
|-c||text file||Text file with simulation parameters||Parses simulation parameters.|
|-m||.ccp4 file||ccp4 density map file||Reads file with a density map.|
|-x||.pdb file||.pdb file with ab initio reconstruction of SAXS data||Reads file with an ab initio model from SAXS.|
|-y||.dat file with SAXS curve||Name of SAXS curve file||Reads SAXS curve.|
|-t||trajectory file name||Name of trajectory file||Creates trajectory file where steps of simulation are stored.|
|-o||output folder name||Simple string||Creates a folder of the given name where all results will be stored.|
|-f||filename||Simple string||Creates a full-atom representation of the best model.|
|-v||filename||Simple string||Saves the history of moves performed during simulation into a text file.|
|-e||filename||Simple string||Creates a picture with the energy plot of the simulation.|
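The options in the table above can also be assembled from a Python script. The following is only a minimal sketch: the helper name build_pyry3d_command and all file names are hypothetical, and it assumes pyry3d.py is in the current directory. It uses only flags listed in the table and does not start the program by itself.

```python
import sys


def build_pyry3d_command(sequences, structures, config=None, restraints=None,
                         density_map=None, output_folder=None, trajectory=None):
    """Return the argument list for one PyRy3D run (nothing is executed here)."""
    command = [sys.executable, "pyry3d.py", "-s", sequences, "-d", structures]
    # Optional inputs are appended only when the caller supplies them.
    optional = [("-c", config), ("-r", restraints), ("-m", density_map),
                ("-o", output_folder), ("-t", trajectory)]
    for flag, value in optional:
        if value is not None:
            command.extend([flag, value])
    return command


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with PyRy3D.
    cmd = build_pyry3d_command("sequences.fasta", "structures.tar",
                               config="config_file.txt", output_folder="results")
    print(" ".join(cmd))
    # To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list (rather than one string) avoids quoting problems with file names that contain spaces.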
1.9 How to use PyRy3D commands interactively?
To test PyRy3D options one can use the Python shell. Please type:
python
from pyry3d import *
Then you can use all PyRy3D commands. In case of any problems with importing the pyry3d module please double check that the program is installed properly on your computer.
1.10 How to write an input script?
An input script should contain:
from pyry3d import *
followed by the selected pyry3d commands in Python convention. To execute any script with PyRy3D from the console please run:
python <my_script>
On Windows, you need the proper path settings, so that the system finds the Python.exe file.
2. Preparation of input files for PyRy3D
2.1 What are the input data needed?
Before you start a simulation with PyRy3D you should prepare the input files. To build a complex model you will need:
-- File with components sequences (obligatory)
-- Archive with components structures (obligatory)
-- File with electron density map or ab initio SAXS model (optional)
-- File with simulation parameters you would like to use (default file can be used)
-- File with distance restraints from experiments (optional)
2.2 How to prepare the set of structures?
Structures of complex components must be provided as a tar or tgz archive with pdb files inside. The names of the archive and of the pdb files do not matter. Each pdb file should contain one chain only.
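Such an archive can be created with any archiving tool; as an illustration, here is a small sketch using Python's standard tarfile module. The function name pack_structures is hypothetical, and placing the pdb files at the top level of the archive is an assumption, not a documented PyRy3D requirement.

```python
import tarfile
from pathlib import Path


def pack_structures(pdb_folder, archive_path):
    """Pack every .pdb file found in pdb_folder into a tar archive."""
    pdb_files = sorted(Path(pdb_folder).glob("*.pdb"))
    if not pdb_files:
        raise ValueError(f"no .pdb files found in {pdb_folder}")
    # "w" writes a plain .tar; use "w:gz" instead for a .tgz archive.
    with tarfile.open(str(archive_path), "w") as archive:
        for pdb_file in pdb_files:
            # arcname drops the folder prefix so files sit at the archive root
            # (an assumption about the layout, see the note above).
            archive.add(str(pdb_file), arcname=pdb_file.name)
    return [pdb_file.name for pdb_file in pdb_files]
```

The returned list of file names makes it easy to check that every component ended up in the archive.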
2.3 How to prepare the sequence file?
Full sequences must be provided for all complex components. All sequences should be stored in a multi-FASTA file where the first line for each sequence starts with “>” followed by the chain name of the given component (the chain name must be exactly the same as the chain name in the corresponding pdb file). Sequences should be given in the one-letter residue convention. Example:
>A
PDAWER
>B
AAARRPSSW
>C
TGNKLP
>X_protein
METATTT
>Y_nucleic
AAAAAAAAAA
CAUTION! For components with no 3D structure coordinates, sequences should be named as follows: ChainName_MoleculeType, e.g. X_protein or Y_nucleic.
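To check a sequence file before a run, the naming convention described above can be verified with a few lines of Python. This is only a sketch under stated assumptions: the function read_components is hypothetical (it is not part of PyRy3D), and it treats only the suffixes "protein" and "nucleic" as molecule types, as in the example.

```python
def read_components(fasta_text):
    """Parse multi-FASTA text into {chain_name: (sequence, molecule_type)}.

    molecule_type is None for chains that come with a structure and
    "protein" or "nucleic" for headers written as ChainName_MoleculeType.
    """
    components = {}
    chain = None
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            name, _, suffix = header.rpartition("_")
            if name and suffix in ("protein", "nucleic"):
                chain, molecule_type = name, suffix
            else:
                chain, molecule_type = header, None
            if chain in components:
                raise ValueError(f"duplicate chain name: {chain}")
            components[chain] = ("", molecule_type)
        elif chain is None:
            raise ValueError("sequence line found before any '>' header")
        else:
            # Sequences may span several lines; join them.
            sequence, molecule_type = components[chain]
            components[chain] = (sequence + line, molecule_type)
    return components
```

Chains whose molecule type is not None are the ones for which no pdb file should be present in the structures archive.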
2.4 Which density map formats are supported?
At the moment the only complex shape descriptor the program accepts is a density map in .ccp4 file format.
2.5 Which SAXS data formats are supported?
PyRy3D can parse ab initio models reconstructed from SAXS data in .pdb format (option -x) and SAXS curves stored in .dat files (option -y).
2.6 What is the restraints format?
To encode experimental information about interactions between complex components PyRy3D uses Filtrest3D format created by genesilico group. For details please check:
[filtrest3D restraints format description]
2.7 What kind of restraints are supported?
PyRy3D supports the following types of restraints:
- distance (between atoms, residues or fragments)
- accessibility to solvent
- relation between two distances
All restraints are defined by the user in a text file and must be encoded in Filtrest3D format. For each restraint the user can define a different weight (which reflects the quality and importance of that particular piece of information). To include restraints in the modeling process please run the program with the "-r" option:
python pyry3d.py -s sequences -d structures -r restraints.txt
2.8 How to prepare configuration file with simulation parameters?
An example file with default simulation parameters is included in the PyRy3D installation package (config_file.txt). Parameters are stored in a regular text file where each line starts with the name of a parameter followed by its value, e.g.:
SIMMETHOD SimulatedAnnealing
- is a selection of simulated annealing as a simulation method.
2.9 Is it possible to prepare input files automatically?
Yes! PyRy3D has an InputGenerator module which facilitates preparation of input files for the program. The user provides a folder with structures and the program renames their chains if necessary, renumbers their residues from 1 and generates a FASTA file with sequences and a file with simulation parameters set to default values. Please keep in mind that you cannot use this module to prepare files with disordered regions! CAUTION! InputGenerator is available and commonly used via the UCSF Chimera Extension.
2.10 Is it possible to incorporate secondary structure restraints into analysis?
Not yet ;-(
3. Quick start
3.1 How to run simple simulation?
The easiest way to run a simple simulation is to execute the following command:
python pyry3d.py -s seq_file -d structures_archive -m map_file -r restraints_file -c config_file -o output_folder_name
3.2 How to change simulation parameters?
To change simulation parameters please edit the configuration file. It stores the values of all simulation parameters in a simple text format, one parameter per line:
Parameter_name space parameter_value
The following parameters are available:
|parameter name||default value||available values||description|
|SIMMETHOD||SimulatedAnnealing||"Genetic", "SimulatedAnnealing" for simulated annealing (default) or "ReplicaExchange" for replica exchange||simulation algorithm|
|REDUCTMETHOD||roulette||roulette, cutoff, tournament||is available only for genetic simulation mode|
|REPLICAEXCHANGE_FREQ||2||integers||each X steps replicas will be exchanged; by default every 10% of simulation steps|
|REPLICATEMPERATURES||400 375 350 325 300 275 250 225 200 175 150 125 100||list of integer values of any size||list of temperatures for all replicas|
|ANNTEMP||100||from range 1 to 100||annealing temperature in simulated annealing procedure|
|STEPS||100||min 1, max number is not limited||number of simulation steps to perform|
|COMPONENT_REPRESENTATION||ca||CA - only calfas/c4' (default); cacb - coarse grain, 3p - 3points, fa - full atom||type of structure representation|
|GRIDRADIUS||1.0||no limits set||radius of a single grid cell|
|SIMBOX||2.0||no limits set||how many times the simulation box diameter is bigger than the longest distance inside the density map|
|MAXROT||5||from 1 to 360||maximal rotation angle for a single component move|
|MAXTRANS||5 5 5||no limits set||maximal translation vector for a single component move|
Scoring function parameters:
|parameter name||default value||available values||description|
|OUTBOX||1||in range from 0 to 10||Weight of penalty for atoms/residues outside simulation area|
|MAP_FREESPACE||1||in range from 0 to 10||Weight of penalty for free spaces inside density map|
|CLASHES||1||in range from 1 to 10||Weight of penalty for collisions (only CA and C4' atoms are considered)|
|CLASHES_ALLATOMS||1||in range from 1 to 10||Weight of penalty for collisions (all atoms are considered)|
|RESTRAINTS||1||in range from 0 to 10||Weight of penalty for violation of distance restraints|
|DENSITY||1||in range from 0 to 10||Weight of penalty for occupation of map points with low density values|
|CHI2||1||in range from 0 to 10||Weight of penalty for disagreement with SAXS curve|
|RGE||1||in range from 0 to 10||Weight of penalty for disagreement with user defined radius of gyration (RG_VAL)|
|SYMMETRY||1||in range from 0 to 10||Weight of penalty for violation of symmetry restraints|
Frequencies of simulation moves:
|parameter name||default value||available values||description|
|ROTATION_FREQ||0.25||in range from 0 to 1||frequency of rotations|
|ROTATION_COV_FREQ||0.25||in range from 0 to 1||frequency of rotations around covalent bonds|
|TRANSLATION_FREQ||0.25||in range from 0 to 1||frequency of translations|
|EXCHANGE_FREQ||0.25||in range from 0 to 1||frequency of components exchange|
|SIMUL_DD_FREQ||0.25||in range from 0 to 1||frequency of simulation of disordered fragments|
|ROTATION_ALL_FREQ||0.25||in range from 0 to 1||frequency of rotations where all components are moved simultaneously|
|ROTATION_WHOLE_FREQ||0.25||in range from 0 to 1||frequency of rotations where all components are moved simultaneously around common centre of mass|
|TRANSLATION_ALL_FREQ||0.25||in range from 0 to 1||frequency of translations where all components are moved simultaneously|
Information about complex:
|parameter name||default value||available values||description|
|KVOL||1||in range from 1 to 10||how many complex volumes will describe the density map, e.g. KVOL=2 indicates that the map volume will be twice as big as the complex volume calculated from its sequence|
|THRESHOLD||0.0||value set must occur in a density map||float value describing density map threshold|
|SAXSRADIUS||0.0||positive float value||float value describing dummy atom radius|
|RG_VAL||0.0||positive float value||float value describing radius of gyration for a complex|
|CRYSOL_PATH||0.0||string||path to CRYSOL binaries|
|MOVE_STATE||no default values, parameter is optional||e.g. D movable 5 5 5 0.1 0.1 0.1 10 10 10 0.1 0.1 1 5 30; to fix a molecule use the "fixed" parameter||indicates limited values of moves (rotations, translations) for a particular component|
|COVALENT_BONDS||no default values, parameter is optional||ChainName [ChainBound1, ChainBound2] [AtomBoundNumber1, AtomBoundName1] [AtomBoundNumber2, AtomBoundName2], e.g. A ['Z','D'] [10,'CA'] [11,'CA']||is used to indicate which components are linked by a covalent bond|
|START_ORIENTATION||False||True/False||True if the user sets the start conformation, False if not|
|IDENTIFY_DISORDERS||False||True/False||True if the user wants PyRy3D to add missing or disordered fragments into the structures, False if not|
Specific parameters to control simulation progress:
|parameter name||default value||available values||description|
|WRITE_N_ITER||100||-||how often PyRy3D should save a model on disk|
|PARAMSCALINGRANGES||0 25 50||-||at what point of simulation should parameter scaling ranges kick in|
|PARAMSCALINGR1||50 100||-||first scaling range (numbers refer to steps)|
|PARAMSCALINGR2||25 50||-||second scaling range (numbers refer to steps)|
|PARAMSCALINGR3||0 25||-||third scaling range (numbers refer to steps)|
#use density map equal to 10 complex volumes
KVOL 10
#use density map with density values >= threshold equal to 1.5
THRESHOLD 1.5
#perform only 5 simulation steps
STEPS 5
#assign default value for a particular component
PARAM_NAME X
#do not move component “B” during simulation – it is already well fit into a density map
MOVE_STATE B fixed
#limit movements of component "B"
MOVE_STATE B 5 5 5 1 1 1 20 20 20 100 100 100 0 0
- 5 5 5 - refers to rotation around X, Y, Z axis in single move (X - up and down; Y- right, left, Z - back, forward)
- 1 1 1 - refers to translation in single move
- 20 20 20 - refers to maximal change in X, Y, Z coordinates during simulation (due to rotations)
- 100 100 100 - refers to maximal change in X, Y, Z coordinates during simulation (due to translations)
- 0 0 - are specific to rotation around covalent bonds
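The "parameter name, space, value" layout shown above is easy to read back in a script, for example to check a configuration file before a long run. The sketch below is not PyRy3D's own parser: the function parse_config is hypothetical, and it assumes that everything after a "#" on a line is a comment.

```python
def parse_config(text):
    """Parse 'NAME value...' lines into a dict of name -> list of value strings.

    MOVE_STATE and COVALENT_BONDS may occur once per component, so every
    parameter maps to a list of entries rather than to a single value.
    """
    parameters = {}
    for raw_line in text.splitlines():
        # Drop '#' comments (assumed comment syntax) and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        parameters.setdefault(name, []).append(value)
    return parameters
```

For instance, parsing the three lines "KVOL 10", "THRESHOLD 1.5" and "STEPS 5" gives one entry per parameter, each holding its value as a string that the caller can convert as needed.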
3.3 What is trafl file and how to save and analyse trajectory from simulation?
A trajectory (trafl) file is a simple text file that stores the coordinates of complex atoms and the complex energy assigned during simulation. To generate a trajectory from a simulation run PyRy3D with the “-t” option:
python pyry3d.py options -t trajectory_filename
To view the trajectory please use the pro viewer included in the PyRy3D distribution. Run the program:
./pro traject.trafl
With the pro viewer open, use the “1” and “3” buttons to follow simulation steps forwards and “2” and “4” to go backwards. To view the energy plot press the “e” button.
3.4 How to investigate simulation results?
All models generated by PyRy3D are saved on a hard disk as plain pdb files. All files are named in the following way:
OutfolderName_ComplexScore_IterationNumber_SimulationTemperature.pdb
e.g. protein_-100_15_10.2.pdb
To visualize results just open the density map in your favourite viewer (PyMOL, Chimera) and load the chosen pdb file into it. Apart from regular pdb files with complexes you can also save a trajectory file where all simulation steps are stored in reduced representation. To view such a trajectory please use the procedure described in 3.3.
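Because the score, iteration number and temperature are encoded in the file name, result files can be sorted or filtered without opening them. A minimal sketch, assuming the four fields are always separated by underscores as in the example above (the function name parse_model_name is hypothetical):

```python
from pathlib import Path


def parse_model_name(filename):
    """Split 'Outfolder_Score_Iteration_Temperature.pdb' into its four fields."""
    stem = Path(filename).stem  # drops only the final ".pdb"
    # Split from the right so underscores inside the folder name survive.
    folder, score, iteration, temperature = stem.rsplit("_", 3)
    return folder, float(score), int(iteration), float(temperature)
```

Splitting from the right matters when the output folder name itself contains underscores.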
3.5 What kind of 3D sampling algorithms can I use?
PyRy3D provides three different algorithms to generate new complexes:
-- genetic algorithm,
-- simulated annealing,
-- replica exchange.
The program has been extensively tested only with the simulated annealing protocol. The other algorithms work, but have been tested on a small number of examples.
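For readers unfamiliar with simulated annealing, the sketch below shows the general textbook idea (Metropolis acceptance with a decreasing temperature). It is not PyRy3D's implementation: the function names, the linear cooling schedule and the choice to treat the score as a penalty to be minimised are all assumptions made for illustration.

```python
import math
import random


def accept_move(old_penalty, new_penalty, temperature, rng=random):
    """Metropolis criterion: always accept improvements, sometimes accept worse states."""
    if new_penalty <= old_penalty:
        return True
    if temperature <= 0:
        return False
    probability = math.exp(-(new_penalty - old_penalty) / temperature)
    return rng.random() < probability


def anneal(state, penalty, propose, steps, start_temperature, rng=random):
    """Minimise penalty(state) while the temperature decreases linearly (assumed schedule)."""
    best = current = state
    for step in range(steps):
        temperature = start_temperature * (1 - step / steps)
        candidate = propose(current, rng)
        if accept_move(penalty(current), penalty(candidate), temperature, rng):
            current = candidate
        # Remember the best state seen so far, even if the walk later leaves it.
        if penalty(current) < penalty(best):
            best = current
    return best
```

At high temperature worse candidates are accepted often, which lets the search escape local minima; as the temperature falls the search becomes increasingly greedy.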
3.6 What kind of reduction methods are available?
For the genetic algorithm you can select one of three available reduction methods (REDUCTMETHOD parameter): roulette, cutoff or tournament.
3.7 What kind of structure representations for components can I provide as input files?
Input components can be delivered to PyRy3D in any convenient structure representation, e.g. full atom, only Calfas (for proteins) and C4' (for nucleic acids), or any other crude model. The program parses structures and encodes them into residue descriptors. Bear in mind that the full atom representation enables higher accuracy of complex scoring at the cost of a slower simulation. On the other hand, the Calfas representation is very fast, but simulation scores for complexes will be relatively lower.
3.8 What kind of structure representations for components are available?
PyRy3D can produce:
-- full atom,
-- only Calfas (for protein) and C4' (for nucleic acids),
-- 2 points (CACB for proteins and C4'P for nucleic acids),
-- 3 points (CA, CB, centroid trace only for proteins and C4', P, N1/N9 for nucleic acids).
Bear in mind that the full atom representation enables higher accuracy of complex scoring at the cost of a slower simulation. On the other hand, the Calfas representation is very fast, but simulation scores for complexes will be relatively lower.
4. Analysing macromolecular complexes structures
4.1 How to calculate isosurface for given complex?
Set the KVOL parameter in config_file to 1. The program will calculate the threshold value of the density map which corresponds to the complex volume. If you do not want to wait for a simulation to finish, simply set STEPS to 0 and you will get the result immediately.
4.2 How to calculate complex volume and molecular weight?
Set the KVOL parameter in config_file to 1. The program will calculate the volume of your complex and its molecular weight. As KVOL you can provide any float value in order to use a larger/smaller piece of the density map.
4.3 How to find disordered regions inside the complex?
Simply provide full sequences of components in the FASTA file and cut the disordered regions out of the structure files. The program will detect missing fragments and simulate their conformations during the complex building procedure. PyRy3D detects:
-- N terminal disorder,
-- internal disorder,
-- C terminal disorder,
-- sequences with no 3D structures provided.
To verify whether PyRy3D detects all fragments with no structure correctly, simply run the program with the STEPS parameter set to 0. The program will print all detected disordered regions with their sequences and locations in the sequence.
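The four cases listed above can be illustrated with a short stand-alone sketch that compares a full sequence with the residue numbers present in a structure file. This is not PyRy3D's detection code: the function find_missing_regions is hypothetical and assumes residues are numbered from 1 in step with the FASTA sequence.

```python
def find_missing_regions(sequence, resolved_numbers):
    """Classify residues absent from a structure.

    sequence: full one-letter sequence, residues numbered from 1.
    resolved_numbers: residue numbers that have coordinates in the pdb file.
    Returns a list of (kind, first_residue, last_residue, subsequence).
    """
    resolved = set(resolved_numbers)
    length = len(sequence)
    if not resolved:
        return [("no structure", 1, length, sequence)] if length else []
    regions = []
    start = None
    for number in range(1, length + 2):  # one extra step closes a trailing gap
        missing = number <= length and number not in resolved
        if missing and start is None:
            start = number
        elif not missing and start is not None:
            end = number - 1
            if start == 1:
                kind = "N-terminal"
            elif end == length:
                kind = "C-terminal"
            else:
                kind = "internal"
            regions.append((kind, start, end, sequence[start - 1:end]))
            start = None
    return regions
```

Comparing this kind of listing with the regions PyRy3D prints for STEPS set to 0 is one way to confirm that the sequence and structure files agree.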
4.4 How to create a ranking of complexes models?
The best way to do this is to use the PyRy3D Chimera Extension and apply the create ranking procedure. The program will rank a list of complexes according to the PyRy3D scoring function. To get a score for one particular complex run PyRy3D with the STEPS parameter set to 0.
4.5 How to simulate structure for regions with no structure?
PyRy3D makes it possible to include components with no structural data in the complex building process. To do that, simply provide the component's sequence in the FASTA file (with the sequence name written as ">CHAIN_MOLTYPE", e.g. >A_protein or >B_nucleic). Do not provide any structure with chain name CHAIN. In each simulation step the program will simulate a structure for this component based on its sequence only.
5. Advanced options of PyRy3D
5.1 Can I fix a particular component during simulation?
Yes. If you know the exact position of a particular component inside a density map you can fix it during simulation. To do this please add the following line to the configuration file:
MOVE_STATE state_name chain_name max_rot_angle_X max_rot_angle_Y max_rot_angle_Z max_trans_vector_X max_trans_vector_Y max_trans_vector_Z max_rot_angle_X_SUM max_rot_angle_Y_SUM max_rot_angle_Z_SUM max_trans_vector_X_SUM max_trans_vector_Y_SUM max_trans_vector_Z_SUM rot_cov rot_cov_sum
where:
first three values refer to the rotation angle around a particular axis in a single simulation step
next three values refer to the translation vector in a particular direction (axis) in a single simulation step
values 7-9 refer to the sum of rotation angles around a particular axis in the whole simulation
values 10-12 refer to the sum of translation vectors along a particular axis in the whole simulation
last two values refer to the maximal rotation angle for rotations around a covalent bond in a single simulation step, and the sum of such rotations in the whole simulation
Examples:
to disable any moves for component "A"
MOVE_STATE fixed A
to disable rotation along any axis for component "A"
MOVE_STATE movable A 0 0 0 0.1 0.1 0.1 10 10 10 0.1 0.1 1 5 30
to define moves' boundaries use
MOVE_STATE movable A 5 5 5 0.1 0.1 0.1 10 10 10 0.1 0.1 1 5 30
to define moves' boundaries with default values or without limits in selected directions use
MOVE_STATE A movable 5 x NL 0.1 x 0.1 360 360 360 1 1 1 3 NL #max_rot_angle; axis *3; maxtrans, sum_rots, sum_trans, around_line rot_single, rot_sum
NL - no limits - do not limit moves (rotations or translations) in this direction
X - use default values here
where:
state_name: “fixed” for frozen chains and “movable” for chains with restricted moves
chain_name: name of chain as it appears in structure of particular component
max_rot_angle_X: maximal rotation in radians around X axis
max_trans_vector_X: maximal translation vector along X axis
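A MOVE_STATE line is easier to check when its 14 numbers are attached to the field names given above. The following sketch is not part of PyRy3D: the function parse_move_state is hypothetical, it accepts both the "state chain" and "chain state" orders because both appear in the examples, and it assumes "#" starts a comment. Lines that omit the state keyword are rejected.

```python
FIELD_NAMES = (
    "max_rot_angle_X", "max_rot_angle_Y", "max_rot_angle_Z",
    "max_trans_vector_X", "max_trans_vector_Y", "max_trans_vector_Z",
    "max_rot_angle_X_SUM", "max_rot_angle_Y_SUM", "max_rot_angle_Z_SUM",
    "max_trans_vector_X_SUM", "max_trans_vector_Y_SUM", "max_trans_vector_Z_SUM",
    "rot_cov", "rot_cov_sum",
)
STATES = ("fixed", "movable")


def parse_move_state(line):
    """Parse one MOVE_STATE line into (chain_name, state_name, limits dict)."""
    tokens = line.split("#", 1)[0].split()
    if len(tokens) < 3 or tokens[0] != "MOVE_STATE":
        raise ValueError(f"not a MOVE_STATE line: {line!r}")
    first, second, values = tokens[1], tokens[2], tokens[3:]
    # The examples show both "state chain" and "chain state"; accept either.
    if first in STATES:
        state, chain = first, second
    elif second in STATES:
        state, chain = second, first
    else:
        raise ValueError("expected 'fixed' or 'movable' next to the chain name")
    if state == "fixed":
        return chain, state, {}
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} values, got {len(values)}")
    limits = {}
    for name, token in zip(FIELD_NAMES, values):
        # Keep the special markers as strings: NL = no limit, X = default value.
        marker = token.upper()
        limits[name] = marker if marker in ("NL", "X") else float(token)
    return chain, state, limits
```

Counting the values this way catches the most common mistake, a line with fewer or more than 14 numbers.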
5.2 Can I change simulation score weights and penalties during simulation?
In each simulation step the newly generated complex is scored according to the following formula:
score = (Woutbox * outbox_penalty) + (Wmap_freespace * map_freespace_penalty) + (Wclashes * clashes_penalty) + (Wrestraints * restraints_penalty)
To adjust the scoring procedure you can change the values of the weights. For example, if you wish to penalize collisions between components more strongly, please assign a higher value to the CLASHES parameter in the configuration file. To exclude a particular scoring function element just set its weight to 0.
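The weighted sum above can be reproduced in a few lines, which is handy for seeing how a change of one weight shifts the total. This is an illustrative sketch only: the function total_score and the penalty values are made up, and terms without an explicit weight are given weight 1, mirroring the default values in the configuration table.

```python
def total_score(penalties, weights):
    """Weighted sum of penalty terms; a term with no listed weight uses weight 1."""
    return sum(weights.get(name, 1.0) * value for name, value in penalties.items())


if __name__ == "__main__":
    # Penalty values are invented for the example.
    penalties = {"OUTBOX": 2.0, "MAP_FREESPACE": 1.5, "CLASHES": 3.0, "RESTRAINTS": 0.5}
    print(total_score(penalties, {}))                    # all weights equal to 1
    print(total_score(penalties, {"CLASHES": 5.0}))      # collisions penalised more strongly
    print(total_score(penalties, {"RESTRAINTS": 0.0}))   # restraints term switched off
```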
5.3 How to create an energy plot from simulation?
Run PyRy3D with the -e option from the command line. After the simulation two plots will be saved on your disk: one with the general PyRy3D score (total scores for all saved complexes) and a second showing the trends of the individual scoring function elements (clashes penalty, restraints etc.).
5.4 How to get history of moves applied during complex building?
Run PyRy3D with the -v option from the command line. The program will save to a text file all moves applied to each complex component (saved on disk).
5.5 How to define covalent bonds within a complex?
In PyRy3D the user can define which components are "dependent". In these cases, when one component is moved all linked components are moved in the same way.
COVALENT_BONDS A ['Z','D'] [10,'CA'] [11,'CA'] #component A is bound with components Z and D, the bond is defined by atoms number 10 and 11 and their Calfa atoms
COVALENT_BONDS A ['E'] [5,'CA'] [7,'CA'] #component A is bound with component E, the bond is defined by atoms number 5 and 7 and their Calfa atoms
COVALENT_BONDS Z ['A'] [14,'CA'] [15,'CA'] #component Z is bound with component A, the bond is defined by atoms number 14 and 15 and their Calfa atoms
COVALENT_BONDS C  #component C can be moved independently
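Since the bracketed groups in a COVALENT_BONDS line look like Python list literals, they can be read with the standard ast module. The sketch below is an illustration rather than PyRy3D's parser: the function parse_covalent_bonds is hypothetical, and it assumes that "#" starts a comment and that a line without bracket groups marks an independent component.

```python
import ast
import re

BRACKET_GROUP = re.compile(r"\[[^\]]*\]")


def parse_covalent_bonds(line):
    """Parse one COVALENT_BONDS line into (chain, partners, first_atom, second_atom)."""
    body = line.split("#", 1)[0].split(None, 2)
    if len(body) < 2 or body[0] != "COVALENT_BONDS":
        raise ValueError(f"not a COVALENT_BONDS line: {line!r}")
    chain = body[1]
    rest = body[2] if len(body) > 2 else ""
    # Each [...] group is a Python-style list literal, so literal_eval can read it.
    groups = [ast.literal_eval(text) for text in BRACKET_GROUP.findall(rest)]
    if not groups:
        return chain, [], None, None  # component moves independently
    if len(groups) != 3:
        raise ValueError(f"expected 3 bracket groups, got {len(groups)}")
    partners, first_atom, second_atom = groups
    return chain, partners, tuple(first_atom), tuple(second_atom)
```

ast.literal_eval only evaluates literals, so a malformed or malicious configuration line cannot execute code.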
6. Tips & tricks
6.1 Can I test simulation parameters interactively?
Yes. To give our users a graphical view of PyRy3D applications we have created the PyRy3D Chimera Extension. It makes it possible to:
- prepare input files for simulation
- run short PyRy3D simulation and visualize results in Chimera viewer
- build complex models interactively and score them with PyRy3D scoring function
- prepare a ranking of models of complexes
- AND MANY MANY MORE FANTASTIC FEATURES!!
6.2 Can I check how the scoring function assesses models interactively?
Yes! For this particular case the best way is to use our graphical interface PyRy3D Chimera Extension. You will find detailed information how to do this here
6.3 How can I speed up my simulation?
-- use coarse grain models of components,
-- assign smaller value to simgrid,
-- assign larger gridradius values,
-- choose different threshold/kvol descriptor value,
-- check PyRy3D Chimera Extension to set up parameters values best describing your system.