Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
User Manual

Molecular_conformation

Authors: F. Cazals and T. Dreyfus and C. Le Breton

Introduction

Molecular conformations

The conformation of a molecule refers to the Cartesian coordinates of its atoms/particles. For an $n$-atom molecule, there are therefore $d=3n$ Cartesian coordinates. These coordinates are provided either as a collection of 3D points or in the Point_d format.
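The correspondence between the two representations can be sketched as follows. This is a minimal illustration, not SBL code: the names Point_3, flatten and coordinate_of_atom are hypothetical, and the ordering x1, y1, z1, x2, y2, z2, ... is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type (not part of the SBL): one atom as a 3D point.
using Point_3 = std::array<double, 3>;

// Flattens n atoms into a single vector of d = 3n Cartesian coordinates,
// assuming the ordering x1, y1, z1, x2, y2, z2, ...
inline std::vector<double> flatten(const std::vector<Point_3>& atoms) {
    std::vector<double> coords;
    coords.reserve(3 * atoms.size());
    for (const Point_3& p : atoms) {
        coords.insert(coords.end(), p.begin(), p.end());
    }
    return coords;
}

// With this ordering, coordinate k (0 = x, 1 = y, 2 = z) of atom i
// sits at index 3 * i + k of the flattened conformation.
inline double coordinate_of_atom(const std::vector<double>& coords,
                                 std::size_t i, std::size_t k) {
    return coords.at(3 * i + k);
}
```

For two atoms, flatten returns six coordinates, and coordinate_of_atom(coords, 1, 2) is the z coordinate of the second atom.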

Classes in this package provide containers / data structures to store the Cartesian coordinates of a collection of atoms/particles. Before reviewing these classes, the following remarks are in order:

  • The mapping between an atom id/particle id and the entry in the Molecular_conformation is handled in the class SBL::CSB::Polypeptide_chain_representation from Protein_representation.
  • The conversion between Cartesian coordinates and internal coordinates is handled in the package Molecular_coordinates.

There are two main routes to obtain molecular conformations. Experimentally, techniques such as X-ray crystallography, NMR, or cryo-electron microscopy deliver (ensembles of) conformations. From the modeling perspective, conformations are usually obtained using techniques such as molecular dynamics or Monte Carlo simulations.


todo: mapping moved into SBL::CSB::Protein_representation


Conformations outside the molecular realm

Conformations can also be used outside the molecular realm. For example, the Landscape_explorer can be used to explore polynomial functions rather than the potential energy of a given molecule. In such a case, conformations carry no biological meaning, and a natural choice is to represent a conformation as a D-dimensional point. Such programs usually offer a restricted set of options; in particular, conformations can only be loaded from a plain text file listing D-dimensional points.

Implementation and functionalities

Overall specification

This package provides a single traits class listing types and static methods, which can be defined for any data structure that could represent a conformation. In this way, any data structure fitting the concept of a conformation can be used simply by implementing this traits class for the target data structure.


A model of a conformation can be virtually any type, yet it has to provide a number of operations. In practice, this compliance is achieved via partial template specialization, applied to the class SBL::CSB::T_Conformation_traits< Conformation >.

The types and operations that must be provided are:

  • FT representing the number type used for the coordinates,
  • Conformation representing the conformation itself,
  • Coordinates_const_iterator, an iterator over the coordinates,
  • dimension(C) returning the number of coordinates of C,
  • begin(C) returning an iterator to the first coordinate of C,
  • end(C) returning the end iterator over the coordinates of C,
  • at(C, i) returning the i-th coordinate of C,
  • build(dim, begin, end, C) building the conformation C of dimension dim from the coordinates in the iterator range [begin, end),
  • build(C, c) building the conformation c from the conformation C.
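The list above can be made concrete with a small sketch. It is written in a hypothetical namespace sketch rather than SBL::CSB, and it is not the SBL's own implementation: exact signatures, const-qualification and return types are assumptions. It shows a partial specialization for std::vector, plus a hypothetical generic function squared_norm that accesses a conformation only through the traits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, deliberately not SBL::CSB

// Primary template: left undefined, so using an unsupported
// conformation type fails at compile time.
template <class Conformation>
struct T_Conformation_traits;

// Partial specialization for std::vector<NT>, for any number type NT.
template <class NT>
struct T_Conformation_traits<std::vector<NT>> {
    typedef NT FT;
    typedef std::vector<NT> Conformation;
    typedef typename Conformation::const_iterator Coordinates_const_iterator;

    static std::size_t dimension(const Conformation& C) { return C.size(); }
    static Coordinates_const_iterator begin(const Conformation& C) { return C.begin(); }
    static Coordinates_const_iterator end(const Conformation& C) { return C.end(); }
    static FT at(const Conformation& C, std::size_t i) { return C[i]; }

    // Builds C from the coordinates in the iterator range [first, last).
    template <class InputIterator>
    static void build(std::size_t dim, InputIterator first, InputIterator last,
                      Conformation& C) {
        C.assign(first, last);
        assert(C.size() == dim);
    }

    // Builds c as a copy of C.
    static void build(const Conformation& C, Conformation& c) { c = C; }
};

// Hypothetical generic algorithm: it only uses the traits, so it works
// for any conformation type for which the traits are specialized.
template <class Conformation>
typename T_Conformation_traits<Conformation>::FT
squared_norm(const Conformation& C) {
    typedef T_Conformation_traits<Conformation> Traits;
    typename Traits::FT sum = 0;
    for (typename Traits::Coordinates_const_iterator it = Traits::begin(C);
         it != Traits::end(C); ++it) {
        sum += (*it) * (*it);
    }
    return sum;
}

}  // namespace sketch
```

The point of the design is that squared_norm never mentions std::vector: switching to another container only requires another specialization of the traits class.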

In the SBL, there are two particular cases where the conformation needs to be enhanced (see details below):

  • when the conformation is elevated: the supplemental static method get_height(C) returns the height of C, and the static method set_height(C, h) sets the height of C to h;
  • when computing the energy of the conformation: the supplemental static method get_covalent_structure(C) returns the covalent structure of C, and the static method set_covalent_structure(C, S) sets the covalent structure of C to S.
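The first case can be sketched as follows. The type Elevated_conformation, the function lowest and the namespace sketch are hypothetical, and the signatures of get_height and set_height are assumptions based on the description above. The coordinate-related members of the traits are left out for brevity.

```cpp
#include <cassert>
#include <vector>

namespace sketch {  // hypothetical namespace, not the SBL's own types

// Hypothetical conformation type: coordinates enhanced with a height.
struct Elevated_conformation {
    std::vector<double> coordinates;
    double height = 0.0;
};

template <class Conformation>
struct T_Conformation_traits;

template <>
struct T_Conformation_traits<Elevated_conformation> {
    typedef double FT;
    typedef Elevated_conformation Conformation;
    // Coordinate-related types and methods (dimension, begin, end, at,
    // build) are omitted in this sketch.

    static FT get_height(const Conformation& C) { return C.height; }
    static void set_height(Conformation& C, FT h) { C.height = h; }
};

// Hypothetical generic helper: returns the conformation with the lower
// height, reading heights through the traits only.
template <class Conformation>
const Conformation& lowest(const Conformation& a, const Conformation& b) {
    typedef T_Conformation_traits<Conformation> Traits;
    return Traits::get_height(b) < Traits::get_height(a) ? b : a;
}

}  // namespace sketch
```

The covalent structure case would follow the same pattern, with get_covalent_structure and set_covalent_structure in place of the height accessors.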

Some parts of the SBL also require I/O operations on conformations, in which case the operators << and >> need to be defined for printing / loading the conformations. This is the case when loading conformations using the package MolecularGeometryLoader. The SBL also relies heavily on the Boost Serialization library for serializing to / deserializing from XML archives: some modules require the global function serialize(ar, C, flags) to be defined, where ar is the archive to which the conformation is serialized or from which it is deserialized, C is the conformation, and flags is a long integer flag for versioning the archive. See Boost Serialization for more details.
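A minimal sketch of the two stream operators, using the standard library only. The type Plain_conformation is hypothetical, and the text layout (the dimension followed by the coordinates, separated by spaces) is an assumption, not necessarily the format expected by the SBL loaders. The Boost serialize function is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

namespace sketch {  // hypothetical namespace, not SBL code

// Hypothetical conformation type wrapping a vector of coordinates.
struct Plain_conformation {
    std::vector<double> coordinates;
};

// Prints the conformation as "d x1 x2 ... xd" (assumed layout).
inline std::ostream& operator<<(std::ostream& out, const Plain_conformation& C) {
    out << C.coordinates.size();
    for (double x : C.coordinates) {
        out << ' ' << x;
    }
    return out;
}

// Loads a conformation written in the layout above. On a read failure,
// the stream is left in a failed state and C is left untouched.
inline std::istream& operator>>(std::istream& in, Plain_conformation& C) {
    std::size_t dim = 0;
    if (!(in >> dim)) {
        return in;
    }
    std::vector<double> coords(dim);
    for (double& x : coords) {
        if (!(in >> x)) {
            return in;
        }
    }
    C.coordinates.swap(coords);
    return in;
}

}  // namespace sketch
```

Writing a conformation to a std::stringstream and reading it back with >> gives a simple round-trip check of the two operators.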


Note that in the cases above, the data structures are minimalistic and can easily be replaced by any other data structure, provided the right traits class is supplied. For example, to use the type std::list<double> as a conformation type (not recommended, since a list offers no constant-time random access), one has to define the class SBL::CSB::T_Conformation_traits<std::list<double>>.
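Such a specialization might look like the sketch below. It is placed in a hypothetical namespace sketch instead of SBL::CSB, and the signatures are assumptions. The at method makes the cost of the missing random access explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

namespace sketch {  // hypothetical namespace standing in for SBL::CSB

template <class Conformation>
struct T_Conformation_traits;  // primary template, specialized per type

// Full specialization for std::list<double>.
template <>
struct T_Conformation_traits<std::list<double>> {
    typedef double FT;
    typedef std::list<double> Conformation;
    typedef Conformation::const_iterator Coordinates_const_iterator;

    static std::size_t dimension(const Conformation& C) { return C.size(); }
    static Coordinates_const_iterator begin(const Conformation& C) { return C.begin(); }
    static Coordinates_const_iterator end(const Conformation& C) { return C.end(); }

    // A list has no random access: reaching coordinate i walks i nodes,
    // so this call is linear in i rather than constant time.
    static FT at(const Conformation& C, std::size_t i) {
        return *std::next(C.begin(), static_cast<std::ptrdiff_t>(i));
    }

    // Builds C from the coordinates in the iterator range [first, last).
    template <class InputIterator>
    static void build(std::size_t dim, InputIterator first, InputIterator last,
                      Conformation& C) {
        C.assign(first, last);
        assert(C.size() == dim);
    }

    // Builds c as a copy of C.
    static void build(const Conformation& C, Conformation& c) { c = C; }
};

}  // namespace sketch
```

An algorithm that calls at(C, i) for every i then costs quadratic time in the dimension on a list, whereas iterating with begin and end stays linear.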


Conformations with coordinates

  • Core/Molecular_conformation/include/SBL/CSB/Conformation_traits_point_d.hpp: class specializing SBL::CSB::T_Conformation_traits by storing the conformation as a CGAL Point_d; that is, the 3n Cartesian coordinates are stored sequentially in the Point_d.
  • Core/Molecular_conformation/include/SBL/CSB/Conformation_traits_vector.hpp: class specializing SBL::CSB::T_Conformation_traits by storing the conformation as a vector of coordinates of a user-specified type.
  • Core/Molecular_conformation/include/SBL/CSB/Conformation_traits_vector_specialized.hpp: provides SBL::CSB::T_Conformation_as_vector.



Enhanced conformations

  • Core/Molecular_conformation/include/SBL/CSB/Conformation_traits_with_implicit_height.hpp: provides the struct SBL::CSB::T_Conformation_with_implicit_height which enhances a conformation with coordinates by adding a height and also takes a functor to apply a function to the height.

Loaders

SBL::IO::T_Conformation_loader is the loader for molecular conformations, whether acquired experimentally or by simulation. It processes files in PDB or mmCIF format, point files (SBL data format), or the GROMACS conformation format (if GROMACS was found at compile time). After successful loading, it gives access to a container of conformations.

Conformations can be encoded in several ways, which are described by traits; see the package Molecular_conformation.