Template C++ / Python API for developing structural bioinformatics applications.
User Manual
Protein_sequence_annotator
Authors: R. Tetley and F. Cazals
Introduction
When processing large numbers of protein sequences, it is useful to flag sites of interest such as transmembrane regions or binding sites. For example, one may want to find the proteins which contain a transmembrane region at least 30 amino acids long. This is the purpose of sequence annotations: they allow users to search annotated sequences for specific characteristics. We provide two Python modules: one which defines annotated sequences and provides some standard annotators, and one which filters a set of protein sequences using properties of these annotations.
Pre-requisites
An annotation on a protein sequence is a triplet composed of a feature key, a residue sequence number range (or list), and a description. UniProt provides a list of standard features: http://www.uniprot.org/help/sequence_annotation
The feature key defines the type of feature
The range locates the feature on the sequence
The description (optional) gives additional information on the feature
The Python module SBL/Sequence_annotators.py provides the SBL::Sequence_annotators::Annotated_sequence class. Such an object should be initialized with a name as well as a fasta sequence. An annotator should then be used to add sequence annotations.
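To make the triplet structure concrete, here is a minimal sketch of an annotated sequence holding a name, a sequence, and a list of annotations. The names ToyAnnotation and ToyAnnotatedSequence, their fields, and the sample sequence are assumptions made for illustration; they are not the interface of SBL::Sequence_annotators::Annotated_sequence.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyAnnotation:
    """Hypothetical annotation triplet: feature key, residue range, description."""
    key: str                      # type of feature, e.g. "TRANSMEM"
    residues: Tuple[int, int]     # 1-based, inclusive (first, last) residue numbers
    description: str = ""         # optional additional information


@dataclass
class ToyAnnotatedSequence:
    """Hypothetical stand-in for an annotated sequence (not the SBL class)."""
    name: str
    sequence: str
    features: List[ToyAnnotation] = field(default_factory=list)


# Made-up sequence and annotations, for illustration only.
seq = ToyAnnotatedSequence("toy_protein", "MKTLLLTLVVVTIVCLDLGYT")
seq.features.append(ToyAnnotation("SIGNAL", (1, 5), "illustrative only"))
seq.features.append(ToyAnnotation("TRANSMEM", (6, 18)))

for feature in seq.features:
    first, last = feature.residues
    # Convert the 1-based inclusive range into a Python slice.
    print(feature.key, first, last, seq.sequence[first - 1:last])
```

Storing the range as 1-based inclusive residue numbers follows the usual convention for sequence features; the conversion to a 0-based slice happens only when the residues are extracted.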
Annotations using Phobius
Phobius is a combined transmembrane topology and signal peptide prediction method [114], based upon profile hidden Markov models.
An annotator has a single function, annotate, which takes an annotated sequence object as argument and produces annotations by exploiting its fasta sequence.
For example, the Phobius annotator, when given an annotated sequence object, will write its fasta sequence to a file, run the executable, and parse the results. These results will be used to annotate the sequence.
import re #regular expressions
import string
import sys #misc system
import os
#This class runs the Phobius executable to annotate a sequence
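The parsing step mentioned above can be sketched as follows. This is not the SBL annotator: it covers parsing only, and it assumes that the predictor's output consists of feature-table lines of the form "FT KEY FIRST LAST [DESCRIPTION]". The sample text is hand-written rather than produced by a real run, and parse_feature_table is a hypothetical helper; check the layout against the output of your own installation.

```python
import re

# Hand-written sample in the assumed feature-table layout (not a real run).
SAMPLE_OUTPUT = """\
ID   toy_protein
FT   SIGNAL        1     18
FT   TOPO_DOM     19    372       NON CYTOPLASMIC.
FT   TRANSMEM    373    395
FT   TOPO_DOM    396    414       CYTOPLASMIC.
//
"""

# One feature per line: key, first residue, last residue, optional description.
FT_LINE = re.compile(r"^FT\s+(\S+)\s+(\d+)\s+(\d+)\s*(.*)$")


def parse_feature_table(text):
    """Return a list of (key, (first, last), description) triplets."""
    features = []
    for line in text.splitlines():
        match = FT_LINE.match(line)
        if match is None:
            continue  # skip the ID line, the terminator and anything else
        key, first, last, description = match.groups()
        features.append((key, (int(first), int(last)), description.strip()))
    return features


features = parse_feature_table(SAMPLE_OUTPUT)
for key, residues, description in features:
    print(key, residues, description)
```

Writing the fasta sequence to a file and invoking the executable are deliberately left out, since the command line depends on the local installation.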
Users can design their own annotator by creating a functor object with a member function annotate, which fills the features field of the annotated sequence.
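As an illustration of a user-defined annotator, the sketch below flags runs of hydrophobic residues. Everything in it is hypothetical: ToyAnnotatedSequence is a minimal stand-in for the SBL class, the "HYDROPHOBIC" key is not a standard feature key, and the residue alphabet and length threshold are arbitrary choices. Only the overall shape (an object with an annotate method that fills the features field) comes from the description above.

```python
import re


class ToyAnnotatedSequence:
    """Hypothetical minimal stand-in: a name, a sequence and a features list."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence
        self.features = []  # list of (key, (first, last), description)


class HydrophobicStretchAnnotator:
    """Toy annotator: flags runs of hydrophobic residues of a minimum length."""

    def __init__(self, min_length=5):
        # Arbitrary hydrophobic alphabet, chosen for illustration only.
        self.pattern = re.compile(r"[AILMFVW]{%d,}" % min_length)

    def annotate(self, annotated_sequence):
        for match in self.pattern.finditer(annotated_sequence.sequence):
            # Convert the 0-based half-open span to 1-based inclusive numbers.
            residues = (match.start() + 1, match.end())
            annotated_sequence.features.append(
                ("HYDROPHOBIC", residues, "toy annotation"))


seq = ToyAnnotatedSequence("toy_protein", "MKTAILLVVFWGSDE")
HydrophobicStretchAnnotator(min_length=5).annotate(seq)
print(seq.features)
```

Because the annotator only reads the sequence and appends to features, several annotators can be applied in turn to the same object.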
Sequence filters
Through the SBL/Sequence_filters.py module, we provide a set of filters which select annotated sequences using criteria on their annotations.
A sequence filter object is a functor containing the filter member function, which takes an annotated sequence as argument and returns true if the given sequence satisfies the defined restrictions.
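The following sketch shows what such a filter could look like for the example given in the introduction: keeping the sequences with a transmembrane region at least 30 residues long. MinFeatureLengthFilter and ToyAnnotatedSequence are hypothetical names, not classes of SBL/Sequence_filters.py, and the annotations are made up; features are assumed to be stored as (key, (first, last), description) triplets.

```python
class ToyAnnotatedSequence:
    """Hypothetical minimal stand-in: a name and a list of feature triplets."""

    def __init__(self, name, features):
        self.name = name
        self.features = features  # list of (key, (first, last), description)


class MinFeatureLengthFilter:
    """Toy filter: accepts sequences with a long enough feature of a given key."""

    def __init__(self, key, min_length):
        self.key = key
        self.min_length = min_length

    def filter(self, annotated_sequence):
        # The range is inclusive, hence the +1 when computing the length.
        return any(
            key == self.key and last - first + 1 >= self.min_length
            for key, (first, last), _description in annotated_sequence.features
        )


# Made-up annotations, for illustration only.
sequences = [
    ToyAnnotatedSequence("long_tm", [("TRANSMEM", (10, 45), "")]),
    ToyAnnotatedSequence("short_tm", [("TRANSMEM", (5, 24), "")]),
    ToyAnnotatedSequence("no_tm", [("SIGNAL", (1, 40), "")]),
]

tm_filter = MinFeatureLengthFilter("TRANSMEM", 30)
kept = [s.name for s in sequences if tm_filter.filter(s)]
print(kept)
```

Keeping the criterion inside an object with a single filter method lets the same list comprehension be reused with any other filter.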