Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
T_Alignment_engine< SequenceOrStructure, AlignerAlgorithm, FT > Class Template Reference

Base engine for making alignments between structures and sequences.

#include <Alignment_engine.hpp>

Classes

class  Is_lower_name_pair
 Predicate for comparing two pairs regardless of the order of the Alignment_unit_rep in each pair.
 
class  Matrix_function
 Functor for the substitution matrix. Two properties are imposed: (i) symmetry, and (ii) undefined values are 0. It is therefore enough to specify the non-null values on a single triangle of the matrix.
 

Public Types

enum  Sequence_length_type
 Enum for selecting the length of the sequence to consider when computing the similarity / identity percentages.
 
typedef SequenceOrStructure Sequence_or_structure
 Type for representing the sequences or the structures to align.
 
typedef AlignerAlgorithm Aligner_algorithm
 Core algorithm performing the alignment.
 
typedef SequenceOrStructure::Alignment_unit Alignment_unit
 Type for a unit (e.g. residue or nucleotide).
 
typedef SequenceOrStructure::Alignment_unit_name Alignment_unit_name
 Representation of the name of a unit.
 
typedef SequenceOrStructure::Alignment_unit_rep Alignment_unit_rep
 Representation of a unit for index purposes.
 
typedef AlignerAlgorithm::Score_type Score_type
 Representation of the score of the algorithm.
 
typedef std::pair< Alignment_unit_rep, Alignment_unit_rep > Aligned_pair
 Representation of two aligned units.
 
typedef std::vector< Aligned_pair > Alignment_type
 Representation of an alignment as a sequence of aligned units.
 
typedef std::pair< Alignment_unit_name, Alignment_unit_name > Name_pair
 Representation of two unit names.
 
typedef std::map< Name_pair, FT, Is_lower_name_pair > Substitution_matrix
 Representation of the storage of values of the substitution matrix.
 

Constructors

 T_Alignment_engine (const SequenceOrStructure &sos_1, const SequenceOrStructure &sos_2, const AlignerAlgorithm &aligner=AlignerAlgorithm(), std::ostream &log=std::cout, unsigned verbose=0)
 Constructor initializing the input sos and possibly the aligner algorithm.
 

Accessors

const AlignerAlgorithm & get_aligner (void) const
 
AlignerAlgorithm & get_aligner (void)
 
const SequenceOrStructure & get_first_sos (void) const
 Returns the first sequence or structure of the alignment.
 
const SequenceOrStructure & get_second_sos (void) const
 Returns the second sequence or structure of the alignment.
 

Algorithm

void align ()
 
void align (unsigned verbose, std::ostream &out)
 

Outputs

const Score_type & get_score (void) const
 
Score_type & get_score (void)
 
Alignment_type & get_alignment (void)
 
const Alignment_type & get_alignment (void) const
 

Analysis

void set_substitution_matrix (const Substitution_matrix &matrix)
 The input is a map from pairs of Alignment_unit_name to the associated score. Note that, by construction, it is enough to specify the non-null values on one half of the matrix, diagonal excluded.
 
void load_substitution_matrix (const std::string &matrix_filename, unsigned nb_units_names)
 Loads the substitution matrix directly from a file in which each entry has the format: Alignment_unit_name Alignment_unit_name Value. If a pair of names appears several times, the last value is kept.
 
void set_blosum_30 (void)
 Set the current matrix as the BLOSUM30 matrix as defined in Seqan 2.0.
 
void set_blosum_45 (void)
 Set the current matrix as the BLOSUM45 matrix as defined in Seqan 2.0.
 
void set_blosum_62 (void)
 Set the current matrix as the BLOSUM62 matrix as defined in Seqan 2.0.
 
void set_blosum_80 (void)
 Set the current matrix as the BLOSUM80 matrix as defined in Seqan 2.0.
 
void set_pam_40 (void)
 Set the current matrix as the PAM40 matrix as defined in Seqan 2.0.
 
void set_pam_120 (void)
 Set the current matrix as the PAM120 matrix as defined in Seqan 2.0.
 
void set_pam_200 (void)
 Set the current matrix as the PAM200 matrix as defined in Seqan 2.0.
 
void set_pam_250 (void)
 Set the current matrix as the PAM250 matrix as defined in Seqan 2.0.
 
void set_vtml_200 (void)
 Set the current matrix as the VTML200 matrix as defined in Seqan 2.0.
 
void set_standard_substitution_matrix (SBL::CSB::Alignment_substitution_matrix_type matrix_type)
 Set the current matrix as one of the standard substitution matrices.
 
Substitution_matrix & get_substitution_matrix (void)
 
FT get_substitution_score (const Alignment_unit_name &p, const Alignment_unit_name &q) const
 
FT get_identity_percentage (Sequence_length_type type=ALIGNMENT_SEQUENCE_LENGTH) const
 
FT get_similarity_percentage (Sequence_length_type type=ALIGNMENT_SEQUENCE_LENGTH) const
 
void statistics (std::ostream &out, Sequence_length_type type) const
 
void print_alignment_txt (std::ostream &out) const
 Print the alignment in txt format.
 
void print_alignment_dot (std::ostream &out) const
 Print the alignment as a dot graph.
 

Detailed Description

template<class SequenceOrStructure, class AlignerAlgorithm, class FT = double>
class SBL::CSB::T_Alignment_engine< SequenceOrStructure, AlignerAlgorithm, FT >

Base engine for making alignments between structures and sequences.

It provides a generic interface for wrapping existing algorithms that align pairs of sequences or structures. It is designed to provide the statistics common to both types of algorithms. For a data structure more specific to structural alignments, see the class T_Alignement_engine_for_structures.

Template Parameters
SequenceOrStructure
 Representation of a sequence or structure (sos). It must define the type Alignment_unit for the base representation of a residue or nucleotide to align, the type Alignment_unit_name for the name of a unit (e.g. ALA), the type Alignment_unit_rep for indexing the units in the sos, the method get_name() returning the name of a unit, the method size() returning the length of the sequence, and the operator [Alignment_unit_rep] for accessing the corresponding unit of the sos.
AlignerAlgorithm
 Base functor for the algorithm that aligns the two input sos. It must define a type Score_type for the score (e.g. a double, or a pair of doubles if two scores are required), take as arguments the two input sos and an output iterator over pairs of Alignment_unit_rep for the output alignment, and return the score of the algorithm.
FT
 Number type used for all statistics except the algorithm score.
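
As an illustration of the requirements on SequenceOrStructure and AlignerAlgorithm, here is a minimal sketch of two toy models. The names Toy_unit, Toy_sequence and Toy_identity_aligner are hypothetical and are not part of the library. Placing get_name() on the unit type, and the exact call signature of the functor, are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical unit type: one residue identified by its name.
struct Toy_unit {
  std::string name;
  const std::string& get_name() const { return name; }
};

// Hypothetical model of SequenceOrStructure.
class Toy_sequence {
public:
  typedef Toy_unit    Alignment_unit;       // base representation of a unit
  typedef std::string Alignment_unit_name;  // e.g. "ALA"
  typedef std::size_t Alignment_unit_rep;   // index of a unit in the sos

  explicit Toy_sequence(const std::vector<std::string>& names) {
    for (const std::string& n : names) units_.push_back(Toy_unit{n});
  }
  std::size_t size() const { return units_.size(); }
  const Alignment_unit& operator[](Alignment_unit_rep i) const { return units_[i]; }

private:
  std::vector<Toy_unit> units_;
};

// Hypothetical model of AlignerAlgorithm: pairs position i with position i
// (no gaps) and returns the number of pairs with identical names as score.
struct Toy_identity_aligner {
  typedef double Score_type;

  template <class OutputIterator>
  Score_type operator()(const Toy_sequence& a, const Toy_sequence& b,
                        OutputIterator out) const {
    Score_type score = 0;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
      *out++ = std::make_pair(i, i);  // one aligned pair of Alignment_unit_rep
      if (a[i].get_name() == b[i].get_name()) score += 1;
    }
    return score;
  }
};
```

Calling the functor with two Toy_sequence objects and a std::back_inserter over a vector of index pairs fills the vector with the aligned pairs and returns the score.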

Member Typedef Documentation

◆ Aligned_pair

typedef std::pair<Alignment_unit_rep, Alignment_unit_rep> Aligned_pair

Representation of two aligned units.

◆ Aligner_algorithm

typedef AlignerAlgorithm Aligner_algorithm

Core algorithm performing the alignment.

◆ Alignment_type

typedef std::vector<Aligned_pair> Alignment_type

Representation of an alignment as a sequence of aligned units.

◆ Alignment_unit

typedef SequenceOrStructure::Alignment_unit Alignment_unit

Type for a unit (e.g. residue or nucleotide).

◆ Alignment_unit_name

typedef SequenceOrStructure::Alignment_unit_name Alignment_unit_name

Representation of the name of a unit.

◆ Alignment_unit_rep

typedef SequenceOrStructure::Alignment_unit_rep Alignment_unit_rep

Representation of a unit for index purposes.

◆ Name_pair

typedef std::pair<Alignment_unit_name, Alignment_unit_name> Name_pair

Representation of two unit names.

◆ Score_type

typedef AlignerAlgorithm::Score_type Score_type

Representation of the score of the algorithm.

◆ Sequence_or_structure

typedef SequenceOrStructure Sequence_or_structure

Type for representing the sequences or the structures to align.

◆ Substitution_matrix

typedef std::map<Name_pair, FT, Is_lower_name_pair> Substitution_matrix

Representation of the storage of values of the substitution matrix.

Member Enumeration Documentation

◆ Sequence_length_type

Enum for selecting the length of the sequence to consider when computing the similarity / identity percentages.
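
To make the role of the length selector concrete, the sketch below computes an identity percentage as 100 times the number of aligned pairs with identical names, divided by the selected length. Only ALIGNMENT_SEQUENCE_LENGTH appears on this page. The two other enumerators, the function identity_percentage and the formula itself are assumptions for illustration, not the library's definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Only ALIGNMENT_SEQUENCE_LENGTH is documented; the two other enumerators
// are hypothetical placeholders.
enum Sequence_length_type {
  ALIGNMENT_SEQUENCE_LENGTH,
  HYPOTHETICAL_FIRST_SEQUENCE_LENGTH,
  HYPOTHETICAL_SECOND_SEQUENCE_LENGTH
};

typedef std::pair<std::size_t, std::size_t> Aligned_pair;
typedef std::vector<Aligned_pair> Alignment_type;

// Percentage of aligned pairs whose two unit names are identical, relative
// to the length selected by `type` (assumed definition).
double identity_percentage(const std::vector<std::string>& first,
                           const std::vector<std::string>& second,
                           const Alignment_type& alignment,
                           Sequence_length_type type) {
  std::size_t identical = 0;
  for (const Aligned_pair& p : alignment)
    if (first[p.first] == second[p.second]) ++identical;

  std::size_t length = alignment.size();
  if (type == HYPOTHETICAL_FIRST_SEQUENCE_LENGTH) length = first.size();
  if (type == HYPOTHETICAL_SECOND_SEQUENCE_LENGTH) length = second.size();
  return length == 0 ? 0.0 : 100.0 * identical / length;
}
```

With the same alignment, the percentage changes only through the denominator, which is why the choice of length has to be stated whenever such a value is reported.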

Constructor & Destructor Documentation

◆ T_Alignment_engine()

T_Alignment_engine ( const SequenceOrStructure &  sos_1,
const SequenceOrStructure &  sos_2,
const AlignerAlgorithm &  aligner = AlignerAlgorithm(),
std::ostream &  log = std::cout,
unsigned  verbose = 0 
)
inline

Constructor initializing the input sos and possibly the aligner algorithm.

Member Function Documentation

◆ get_first_sos()

const SequenceOrStructure & get_first_sos ( void  ) const
inline

Returns the first sequence or structure of the alignment.

◆ get_second_sos()

const SequenceOrStructure & get_second_sos ( void  ) const
inline

Returns the second sequence or structure of the alignment.

◆ load_substitution_matrix()

void load_substitution_matrix ( const std::string &  matrix_filename,
unsigned  nb_units_names 
)
inline

Loads the substitution matrix directly from a file in which each entry has the format: Alignment_unit_name Alignment_unit_name Value. If a pair of names appears several times, the last value is kept.
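
The file format described above can be read with a few lines of standard C++. The following is a minimal sketch, not the library's implementation: the function read_substitution_triples is hypothetical, it reads from a stream rather than from a file name, it assumes whitespace-separated entries, and it does not model the nb_units_names parameter.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> Name_pair;

// Reads whitespace-separated "name name value" triples until the stream ends
// or a malformed triple is met. Assigning through operator[] overwrites any
// earlier value, so the last duplicate wins.
std::map<Name_pair, double> read_substitution_triples(std::istream& in) {
  std::map<Name_pair, double> matrix;
  std::string p, q;
  double value;
  while (in >> p >> q >> value) matrix[Name_pair(p, q)] = value;
  return matrix;
}
```

Passing a std::ifstream opened on the matrix file, or a std::istringstream in a test, works equally well since both derive from std::istream.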

◆ print_alignment_dot()

void print_alignment_dot ( std::ostream &  out) const
inline

Print the alignment as a dot graph.

◆ print_alignment_txt()

void print_alignment_txt ( std::ostream &  out) const
inline

Print the alignment in txt format.

◆ set_blosum_30()

void set_blosum_30 ( void  )
inline

Set the current matrix as the BLOSUM30 matrix as defined in Seqan 2.0.

◆ set_blosum_45()

void set_blosum_45 ( void  )
inline

Set the current matrix as the BLOSUM45 matrix as defined in Seqan 2.0.

◆ set_blosum_62()

void set_blosum_62 ( void  )
inline

Set the current matrix as the BLOSUM62 matrix as defined in Seqan 2.0.

◆ set_blosum_80()

void set_blosum_80 ( void  )
inline

Set the current matrix as the BLOSUM80 matrix as defined in Seqan 2.0.

◆ set_pam_120()

void set_pam_120 ( void  )
inline

Set the current matrix as the PAM120 matrix as defined in Seqan 2.0.

◆ set_pam_200()

void set_pam_200 ( void  )
inline

Set the current matrix as the PAM200 matrix as defined in Seqan 2.0.

◆ set_pam_250()

void set_pam_250 ( void  )
inline

Set the current matrix as the PAM250 matrix as defined in Seqan 2.0.

◆ set_pam_40()

void set_pam_40 ( void  )
inline

Set the current matrix as the PAM40 matrix as defined in Seqan 2.0.

◆ set_standard_substitution_matrix()

void set_standard_substitution_matrix ( SBL::CSB::Alignment_substitution_matrix_type  matrix_type)
inline

Set the current matrix as one of the standard substitution matrices.

◆ set_substitution_matrix()

void set_substitution_matrix ( const Substitution_matrix &  matrix)
inline

The input is a map from pairs of Alignment_unit_name to the associated score. Note that, by construction, it is enough to specify the non-null values on one half of the matrix, diagonal excluded.
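
One way to obtain such a symmetric map is to key a std::map on pairs of names with a comparator that ignores the order inside each pair. The sketch below illustrates the idea under that assumption. Unordered_pair_less is a hypothetical stand-in for Is_lower_name_pair, whose actual implementation is not shown on this page, and substitution_score is a hypothetical helper returning 0 for entries that were never set.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> Name_pair;

// Hypothetical stand-in for Is_lower_name_pair: each pair is put in sorted
// order before the comparison, so (p, q) and (q, p) are equivalent keys.
struct Unordered_pair_less {
  static Name_pair sorted(const Name_pair& x) {
    return x.first <= x.second ? x : Name_pair(x.second, x.first);
  }
  bool operator()(const Name_pair& a, const Name_pair& b) const {
    return sorted(a) < sorted(b);
  }
};

typedef std::map<Name_pair, double, Unordered_pair_less> Substitution_matrix;

// Symmetric lookup: entries that were never set score 0.
double substitution_score(const Substitution_matrix& m,
                          const std::string& p, const std::string& q) {
  Substitution_matrix::const_iterator it = m.find(Name_pair(p, q));
  return it == m.end() ? 0.0 : it->second;
}
```

Because (p, q) and (q, p) compare as equivalent, setting the value for one order also sets it for the other, which is why a single triangle of the matrix is sufficient.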

◆ set_vtml_200()

void set_vtml_200 ( void  )
inline

Set the current matrix as the VTML200 matrix as defined in Seqan 2.0.