Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
User Manual

ParticleTraits

Authors: F. Cazals and T. Dreyfus

Introduction

In the SBL, a number of low-level algorithms are based upon combinatorial, geometric and topological representations, so that these algorithms do not require any specific biophysical information. As a result, several programs can take the same geometric objects as input while addressing different biological contexts.

The iconic example is the calculation of molecular surfaces and volumes: such a calculation requires a set of balls, which may be atoms or pseudo-atoms. For this reason, the SBL proposes two programs, namely sbl-vorlume-pdb.exe for atoms, and sbl-vorlume-txt.exe for a collection of balls out of any biomolecular context. Such programs make up an application.

Another example is the detection and modeling of interfaces between partners in a complex. Again, this problem is geometric in nature, since one essentially seeks pairs of balls belonging to nearby partners. For example, $\text{\intervorEABW}$ provides such a calculation for proteins, and $\text{\intervorEIGAGW}$ uses the same geometric machinery to provide more accurate information for the case where the partners have a complex hierarchical structure (such as that of an antibody, whose binding site involves six loops, namely the Complementarity Determining Regions).


In fact, the very generic nature of prerequisites on objects makes it possible to use very generic algorithms within the SBL.

A particle is an elementary physical object. It is of geometric nature, but is potentially endowed with other properties. To handle these pieces of information, the ParticleTraits package defines the classes presented in the following sections.

Using existing models in existing Applications : I/O

Guidelines

From the user standpoint, using programs resorting to models of ParticleTraits has two implications:

  • (i) the models used by these programs represent different biophysical particle types (e.g. an atom and a pseudo-atom representing a residue). Thus, the output will typically have different attributes.
  • (ii) the models may refer to the same object, yet, with different attributes. For example, an atom may be accompanied by the SSE it belongs to in one case, but not the other. Thus, the level of detail of pieces of information provided may vary.

To ease things, when an application is instantiated with several models of ParticleTraits, the names of the corresponding programs contain a keyword identifying the type of the particle (see section Example: Running a Program using Particles).

Example: Running a Program using Particles

The following example shows the differences between two programs of an application using two different models of the concept ParticleTraits. Consider the programs of the application Space_filling_model_surface_volume, which computes the volume and the surface area of an input molecule. The input molecule may be stored in files with different formats, such as the PDB file format, or as a simple list of 3D spheres in a plain text file. Depending on the format, the way to load and to represent a particle differs, so that two programs are provided:

  • sbl-vorlume-pdb.exe uses the model SBL::Models::T_Atom_with_flat_info_and_annotations_traits, which defines an atom with "flat" information, an annotated name, an annotated radius and a list of dynamic annotations; it loads a PDB file, builds the particles from the loaded atoms, and annotates the particles with default or loaded names and radii (and possibly optional annotations);
sbl-vorlume-pdb.exe -v -c -f demos/data/1vfb.pdb
  • sbl-vorlume-txt.exe uses the model SBL::Models::T_Geometric_particle_traits, which defines an atom as a simple 3D sphere; it loads a plain text file and builds the particles from the 3D spheres listed in the input file.
sbl-vorlume-txt.exe -v -c -f demos/data/spheres.txt

The output files of sbl-vorlume-pdb.exe are naturally decorated with the information contained in the input PDB file. For sbl-vorlume-txt.exe, the output files are minimal, since the input plain text file contains only geometric information.

Using existing models to develop novel applications

Geometric Representation

The class SBL::Models::T_Geometric_particle_traits< GeometricKernel , GeometricRepresentation > defines a type of particle that is purely geometric and has no attached biophysical information.

The geometric part of the SBL is based upon the CGAL library: the template parameter GeometricKernel determines the geometric primitives used for representing a particle (number type, point type, etc.), as explained in the documentation of the CGAL library. The second template parameter GeometricRepresentation is the geometric representation of the particle, which may be a CGAL object such as Weighted_point, Point_3 or Sphere_3. The default geometric representation is the CGAL Weighted_point type.
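
As a rough illustration of the role of the two template parameters, the following self-contained sketch mimics the structure of a purely geometric traits class without depending on CGAL or on the SBL. The names Toy_kernel, Toy_weighted_point, Toy_geometric_particle_traits and make_ball_at_origin are hypothetical stand-ins introduced for this sketch only; they are not SBL or CGAL types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a geometric kernel: it fixes the number type
// and the point type, which is the role a CGAL kernel plays in the SBL.
struct Toy_kernel {
    typedef double FT;
    struct Point_3 { FT x, y, z; };
};

// Hypothetical stand-in for a weighted point: a center and a weight
// (for a ball, the weight is taken here to be the squared radius).
template <class Kernel>
struct Toy_weighted_point {
    typename Kernel::Point_3 center;
    typename Kernel::FT weight;
};

// Sketch of a purely geometric traits class: the particle is its own
// geometric representation, with no biophysical information attached.
template <class GeometricKernel = Toy_kernel,
          class GeometricRepresentation = Toy_weighted_point<GeometricKernel> >
struct Toy_geometric_particle_traits {
    typedef GeometricKernel Geometric_kernel;
    typedef GeometricRepresentation Particle_type;
};

// Default instantiation: particles are weighted points.
typedef Toy_geometric_particle_traits<> Default_traits;
// Alternative instantiation: particles are bare 3D points.
typedef Toy_geometric_particle_traits<Toy_kernel, Toy_kernel::Point_3> Point_traits;

// Builds a ball of radius r centered at the origin as a default particle.
inline Default_traits::Particle_type make_ball_at_origin(double r) {
    Default_traits::Particle_type p = { {0.0, 0.0, 0.0}, r * r };
    return p;
}
```

In this sketch, switching from weighted points to bare points only requires changing the second template argument; code written against Particle_type is unaffected.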

Atoms

The SBL provides two main data structures for handling atoms, that are presented just below:

ESBTL Molecular Atom as Particle Type

The ESBTL provides the data structure ESBTL::Molecular_atom to represent an atom in a hierarchical manner: the atom belongs to a residue, which is itself in a chain, which itself belongs to a molecular model. The SBL provides the model SBL::Models::T_Atom_with_hierarchical_info_traits< GeometricKernel , SystemItems > for using the class ESBTL::Molecular_atom within the SBL algorithms. The template parameter GeometricKernel is a class defining the number type FT used to represent the coordinates of the atoms, and the type Point_3 used to represent the atoms themselves as geometric entities. It has a default type for defining geometric objects with floating-point coordinates and calculations. The template parameter SystemItems is a class defining the base classes used to represent atoms, residues, chains and models in a molecule (see the ESBTL documentation). It has the default type from the ESBTL library.

Note that this data structure is only partially serializable, see section Partial Serialization.


SBL Atom as Particle Type

Because an ESBTL atom refers to the hierarchy containing it, serializing an atom requires serializing the entire molecule containing it. For this reason, in the SBL, the ESBTL Molecular Atom can only be partially saved in a file, and cannot be loaded from a file. In order to have a completely serializable data structure, the SBL provides a "flat" data structure, in the sense that all the information on the atom is contained in the atom itself. The SBL provides the model SBL::Models::T_Atom_with_flat_info_traits< GeometricKernel , SystemItems >. The two template parameters are the same as those explained in section ESBTL Molecular Atom as Particle Type.
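
The difference between the hierarchical and the flat layouts can be sketched as follows. This is a minimal illustration under simplifying assumptions: the types Toy_residue, Toy_hierarchical_atom and Toy_flat_atom, as well as the functions flatten, save and load, are hypothetical and unrelated to the actual ESBTL and SBL classes, and the text format used is arbitrary.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hierarchical layout (hypothetical): the atom stores only its own data
// and reaches the rest of the information through its parent residue.
struct Toy_residue {
    std::string name;
    int sequence_number;
    char chain_id;
};
struct Toy_hierarchical_atom {
    std::string name;
    double x, y, z;
    const Toy_residue* residue; // a pointer is meaningless once saved on its own
};

// Flat layout (hypothetical): all the information is copied into the atom,
// so the atom can be saved and loaded independently of any molecule.
struct Toy_flat_atom {
    std::string atom_name;
    std::string residue_name;
    int residue_sequence_number;
    char chain_id;
    double x, y, z;
};

// Copies the information scattered in the hierarchy into a flat atom.
inline Toy_flat_atom flatten(const Toy_hierarchical_atom& a) {
    Toy_flat_atom f;
    f.atom_name = a.name;
    f.residue_name = a.residue->name;
    f.residue_sequence_number = a.residue->sequence_number;
    f.chain_id = a.residue->chain_id;
    f.x = a.x; f.y = a.y; f.z = a.z;
    return f;
}

// Writes a flat atom as a single whitespace-separated record.
inline std::string save(const Toy_flat_atom& a) {
    std::ostringstream os;
    os << a.atom_name << ' ' << a.residue_name << ' '
       << a.residue_sequence_number << ' ' << a.chain_id << ' '
       << a.x << ' ' << a.y << ' ' << a.z;
    return os.str();
}

// Reads back a record written by save().
inline Toy_flat_atom load(const std::string& s) {
    std::istringstream is(s);
    Toy_flat_atom a;
    is >> a.atom_name >> a.residue_name >> a.residue_sequence_number
       >> a.chain_id >> a.x >> a.y >> a.z;
    return a;
}
```

The flat atom duplicates residue and chain information in every atom, which costs memory but makes each particle self-contained.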

Pseudo-atoms

The ESBTL provides a pseudo-atom data structure ESBTL::Coarse_atom to represent residues with a fixed number of pseudo-atoms. As for the ESBTL atom data structure, the ESBTL pseudo-atom data structure is not fully savable: the SBL provides a "flat" pseudo-atom data structure SBL::Models::T_Pseudo_atom_per_residue_spec_with_flat_info and a model of ParticleTraits for this pseudo-atom class: SBL::Models::T_Pseudo_atom_per_residue_spec_with_flat_info_traits< GeometricKernel , SystemItems >.

Particles with Annotations

As explained in the user manual of the package ParticleAnnotator, a particle may be decorated with annotations: these annotations may be compulsory or optional, depending on the context.

The class SBL::Models::T_Particle_with_annotations_traits< ParticleTraitsBase , AnnotationsType > inherits from the base traits class ParticleTraitsBase and re-implements the class Particle_type by simply adding an annotations attribute of type AnnotationsType.
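
The mechanism of inheriting from a base traits class and re-implementing Particle_type may be sketched as follows. This is only an illustration of the design: the names Toy_ball, Toy_base_traits, Toy_annotations and Toy_particle_with_annotations_traits are hypothetical, and the accessor annotations() is an assumption of this sketch rather than the documented SBL interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical base traits: the particle is a bare ball.
struct Toy_ball { double x, y, z, radius; };
struct Toy_base_traits {
    typedef Toy_ball Particle_type;
};

// Hypothetical annotations attached to a particle.
struct Toy_annotations {
    std::string name;
};

// Sketch of a traits class enriching the particle of a base traits class.
template <class ParticleTraitsBase, class AnnotationsType>
struct Toy_particle_with_annotations_traits : public ParticleTraitsBase {
    typedef typename ParticleTraitsBase::Particle_type Base_particle;

    // Re-implemented particle type: the base particle plus its annotations.
    // It hides the Particle_type inherited from ParticleTraitsBase.
    class Particle_type : public Base_particle {
    public:
        Particle_type() : Base_particle(), m_annotations() {}
        Particle_type(const Base_particle& p) : Base_particle(p), m_annotations() {}
        AnnotationsType& annotations() { return m_annotations; }
        const AnnotationsType& annotations() const { return m_annotations; }
    private:
        AnnotationsType m_annotations;
    };
};
```

Since the enriched particle derives from the base particle, code that only needs the base particle can still be given an annotated one.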

Each of the predefined particle traits class without annotation has its own version with annotations, using the class SBL::Models::T_Particle_with_annotations_traits:

In any case, such classes have fixed annotations inheriting from the template parameter Annotations (if not void) such that:

Note that all the corresponding traits classes also define a Default_annotator type, which is the annotator type for the default annotations (used when the template parameter Annotations is void).

Particles with System's Label

As explained in the user manual of the package MolecularSystemLabelsTraits, when particles are part of a molecular structure, each particle is associated with one of the partners of the structure, with interfacial water, or even with extra particles. The class SBL::Models::T_Particle_with_system_label_traits< ParticleTraitsBase , PartnerLabelsTraits , MediatorLabelsTraits , ExtraLabelsTraits , IsSerializedLabel > provides a type of particle enriched with a system's label. The template parameters are, in their order of appearance:

  • the base traits class used to derive the enriched particle type,
  • a boolean tag to save the system's label when serializing the particle (default is false).
[Advanced] Suppose that the tag IsSerializedLabel is false. If the particle type is serializable, saved particles may be loaded by another application using particles devoid of labels. For example, the application Space_filling_model_shelling_diagram_surface_encoding requires particles belonging to partners, so as to compute binding patches on these partners. Those patches can then be loaded by a program which does not use the notion of partner. Examples of such programs can be found in the application Space_filling_model_shelling_diagram_comparison.
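
The effect of such a boolean tag can be sketched as follows. This is a simplified illustration, not the SBL implementation: the types Toy_particle, Toy_system_label and Toy_particle_with_system_label, the functions save and load_plain, and the text format are all hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical base particle and hypothetical system labels.
struct Toy_particle { double x, y, z; };
enum Toy_system_label { PARTNER_A, PARTNER_B, MEDIATOR, EXTRA };

// Sketch of a particle enriched with a label; the boolean template
// parameter decides at compile time whether the label is saved.
template <class BaseParticle, bool IsSerializedLabel = false>
struct Toy_particle_with_system_label : public BaseParticle {
    Toy_system_label label;

    Toy_particle_with_system_label(const BaseParticle& p, Toy_system_label l)
        : BaseParticle(p), label(l) {}

    // Writes the coordinates, followed by the label only if requested.
    std::string save() const {
        std::ostringstream os;
        os << this->x << ' ' << this->y << ' ' << this->z;
        if (IsSerializedLabel)
            os << ' ' << static_cast<int>(label);
        return os.str();
    }
};

// A program that knows nothing about labels loads plain particles.
inline Toy_particle load_plain(const std::string& s) {
    std::istringstream is(s);
    Toy_particle p;
    is >> p.x >> p.y >> p.z;
    return p;
}
```

With the tag set to false, the saved record has exactly the layout of a plain particle, which is what allows a label-agnostic program to read it.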


Example: Instantiating particle Traits

This example shows the source file of the program sbl-vorlume-pdb.exe, and explains how to use different models of ParticleTraits for this application:

#include <fstream>
#include <set>
#include <sstream>
#include <string>
#include <vector>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/serialization/set.hpp>
#include <SBL/Models/Atom_with_flat_info_and_annotations_traits.hpp>
#include <SBL/Models/PDB_file_loader.hpp>
#include "Space_filling_model_surface_volume_traits.hpp"
#include "Space_filling_model_surface_volume_workflow.hpp"

// Container holding the particles handled by the application
typedef std::vector<Particle_traits::Particle_type> Particles_container;

// Functor building the particles from the atoms found by the loader
class Particles_builder
{
public:
    template <class OutputIterator>
    OutputIterator operator()(Molecular_geometry_loader& loader, OutputIterator out) const
    {
        return typename Particle_traits::Atom_with_flat_infos_builder()(loader.get_geometric_model(), out);
    }
};

typedef T_Space_filling_model_surface_volume_traits<Particle_traits, Molecular_geometry_loader,
                                                    Particle_annotator, Particles_container,
                                                    Particles_builder> Traits;

//multiple archives used for skipping useless info on particles in the xml file
#ifdef WINDOWS_PLATFORM
#else
#endif

// Per-residue statistics, ordered by chain identifier, then by residue number
struct Residue_stats
{
    int resid;
    char chain_id;
    std::string name;
    double volume;
    double area;

    friend bool operator<(const Residue_stats& r1, const Residue_stats& r2)
    {
        if(r1.chain_id == r2.chain_id)
            return r1.resid < r2.resid;
        else
            return r1.chain_id < r2.chain_id;
    }

    friend class boost::serialization::access;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int& i)
    {
        ar & boost::serialization::make_nvp("residue_sequence_number", resid);
        ar & boost::serialization::make_nvp("residue_name", name);
        // the chain identifier is written as a string rather than as a char
        std::ostringstream oss;
        oss << chain_id;
        std::string str = oss.str();
        ar & boost::serialization::make_nvp("chain_identifier", str);
        ar & boost::serialization::make_nvp("volume", volume);
        ar & boost::serialization::make_nvp("area", area);
    }
};

// Sums the per-vertex volumes and areas over each residue, then writes the result into the archive
template<class Archive, class VolumeEngine>
void print_volumes_for_residue(Archive& ar, VolumeEngine& volume)
{
    typedef typename VolumeEngine::Alpha_complex_3::Finite_vertices_iterator Vertex_iterator;

    std::set<Residue_stats> residues;
    for(Vertex_iterator it = volume.get_alpha_complex().finite_vertices_begin();
        it != volume.get_alpha_complex().finite_vertices_end(); ++it)
    {
        Residue_stats stats;
        stats.resid = it->get_particle().residue_sequence_number();
        stats.chain_id = it->get_particle().chain_identifier();
        if(residues.find(stats) == residues.end())
        {
            // first atom of this residue: create the entry
            stats.name = it->get_particle().residue_name();
            stats.volume = CGAL::to_double(volume.volume(it));
            stats.area = CGAL::to_double(volume.area(it));
            residues.insert(stats);
        }
        else
        {
            // elements of a std::set cannot be modified in place:
            // the updated statistics replace the stored ones
            stats = *residues.find(stats);
            stats.volume += CGAL::to_double(volume.volume(it));
            stats.area += CGAL::to_double(volume.area(it));
            residues.erase(stats);
            residues.insert(stats);
        }
    }
    ar & boost::serialization::make_nvp("residues", residues);
}

int main(int argc, char *argv[])
{
    Workflow workflow("sbl-vorlume-pdb", "This program takes as input a PDB file.");
    workflow.start(argc, argv);

    Workflow::Surface_volume_module& module = workflow.get_volume_module();
    std::string f_name = workflow.get_output_prefix() + "_residues_volume.xml";
    std::ofstream out_xml(f_name.c_str());
    out_xml << std::fixed;

    // the archive is deleted before the stream is closed
    boost::archive::xml_oarchive* xml = new boost::archive::xml_oarchive(out_xml);
    if(module.get_surface_volume().exact != NULL)
        print_volumes_for_residue(*xml, *module.get_surface_volume().exact);
    else
        print_volumes_for_residue(*xml, *module.get_surface_volume().inexact);
    delete xml;
    out_xml.close();
    return 0;
}

This source file is organized as follows:

  • (i) first, a number of definitions occur: the model of ParticleTraits to be used and its particle type, the way to load those particles, and the way they are built.
  • (ii) second, the program uses the Workflow framework for instantiating an Application Traits class and an Application Workflow class.
  • (iii) third, the main function creates a Workflow and starts it.

Note that it is sufficient to change the models used in (i) for making a new program with a new type of particle. When particles are atoms, one may use the model SBL::Models::T_Atom_with_flat_info_and_annotations_traits as shown:

typedef std::vector<Particle_traits::Particle_type> Particles_container;
class Particles_builder
{
public:
    template <class OutputIterator>
    OutputIterator operator()(Molecular_geometry_loader& loader, OutputIterator out) const
    {
        return typename Particle_traits::Atom_with_flat_infos_builder()(loader.get_geometric_model(), out);
    }
};

When particles are simple 3D spheres, one may use the model SBL::Models::T_Geometric_particle_traits as shown:

typedef CGAL::Exact_predicates_inexact_constructions_kernel K;
typedef std::vector<Particle_traits::Particle_type> Particles_container;
class Particles_builder
{
public:
    template <class OutputIterator>
    OutputIterator operator()(Molecular_geometry_loader& loader, OutputIterator out) const
    {
        return std::copy(loader.get_geometric_model().begin(), loader.get_geometric_model().end(), out);
    }
};
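
The reason why changing the model is sufficient can be illustrated by a self-contained sketch in which a single generic function is instantiated with two different traits classes. All names below (Toy_sphere, Toy_atom, their traits classes and sum_of_ball_volumes) are hypothetical, and the function simply adds the volumes of the individual balls, ignoring overlaps; it is a placeholder for a generic algorithm, not the volume calculation performed by the application.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const double TOY_PI = 3.14159265358979323846;

// Model 1 (hypothetical): a purely geometric particle.
struct Toy_sphere { double x, y, z, radius; };
struct Toy_sphere_traits {
    typedef Toy_sphere Particle_type;
    struct Get_geometric_representation {
        const Toy_sphere& operator()(const Particle_type& p) const { return p; }
    };
};

// Model 2 (hypothetical): an atom carrying biophysical information.
struct Toy_atom {
    std::string name;
    std::string residue_name;
    Toy_sphere ball;
};
struct Toy_atom_traits {
    typedef Toy_atom Particle_type;
    struct Get_geometric_representation {
        const Toy_sphere& operator()(const Particle_type& p) const { return p.ball; }
    };
};

// Generic code: it only relies on the types provided by the traits class,
// so it works unchanged for both models. Overlaps between balls are
// deliberately ignored in this placeholder computation.
template <class ParticleTraits>
double sum_of_ball_volumes(
    const std::vector<typename ParticleTraits::Particle_type>& particles)
{
    typename ParticleTraits::Get_geometric_representation get_geometry;
    double total = 0.0;
    for (std::size_t i = 0; i < particles.size(); ++i) {
        const double r = get_geometry(particles[i]).radius;
        total += 4.0 / 3.0 * TOY_PI * r * r * r;
    }
    return total;
}
```

Calling sum_of_ball_volumes<Toy_sphere_traits> or sum_of_ball_volumes<Toy_atom_traits> selects the particle type; the body of the algorithm is written once.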

Developing new models of ParticleTraits concept

The C++ concept ParticleTraits has five requirements:

  • the Geometric_kernel type: defines a number of geometric primitives for dealing with 3D geometric objects (in particular, the 3D points and the number type of the coordinates of the 3D points),
  • the Particle_type type: a serializable data structure representing the particle; it can be either a new data structure, or a simple typedef of an existing data structure,
  • the Get_geometric_representation type: a functor returning the geometric representation of a particle (e.g. a 3D point, or a 3D weighted point),
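
When writing a new model, the presence of the nested types listed above can be checked at compile time. The following sketch is one possible way to do so and is not part of the SBL: the names Toy_make_void, Toy_void_t, Has_listed_particle_traits_types and the toy traits classes are hypothetical. It only tests that the three nested types exist; it does not test, for instance, that Particle_type is serializable.

```cpp
#include <cassert>
#include <type_traits>

// Helper mapping any list of valid types to void (C++11 compatible).
template <class...> struct Toy_make_void { typedef void type; };
template <class... Ts> using Toy_void_t = typename Toy_make_void<Ts...>::type;

// Primary template: by default, the nested types are considered missing.
template <class T, class = void>
struct Has_listed_particle_traits_types : std::false_type {};

// Specialization selected only if the three nested types exist.
template <class T>
struct Has_listed_particle_traits_types<T, Toy_void_t<
    typename T::Geometric_kernel,
    typename T::Particle_type,
    typename T::Get_geometric_representation> > : std::true_type {};

// Hypothetical kernel and particle used by the two toy traits classes.
struct Toy_kernel { typedef double FT; };
struct Toy_point { double x, y, z; };

// Toy traits class defining the three nested types.
struct Toy_complete_traits {
    typedef Toy_kernel Geometric_kernel;
    typedef Toy_point Particle_type;
    struct Get_geometric_representation {
        const Toy_point& operator()(const Particle_type& p) const { return p; }
    };
};

// Toy traits class lacking the functor type.
struct Toy_incomplete_traits {
    typedef Toy_kernel Geometric_kernel;
    typedef Toy_point Particle_type;
};
```

A static_assert on Has_listed_particle_traits_types<My_traits>::value placed next to a new traits class reports a missing type at its definition rather than deep inside a template instantiation.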

Example: Defining Custom Particle Traits

This example shows the model SBL::Models::T_Geometric_particle_traits of the concept ParticleTraits representing a particle as a simple 3D geometric object (a weighted point by default):

template <class GeometricKernel = CGAL::Exact_predicates_inexact_constructions_kernel,
#ifdef CGAL_VERSION_GEQ_4_10
          class GeometricRepresentation = typename GeometricKernel::Weighted_point_3 >
#else
          class GeometricRepresentation = CGAL::Weighted_point<typename GeometricKernel::Point_3, typename GeometricKernel::FT> >
#endif
class T_Geometric_particle_traits
{
public:
    typedef T_Geometric_particle_traits<GeometricKernel, GeometricRepresentation> Self;
    typedef GeometricKernel Geometric_kernel;
    typedef GeometricRepresentation Particle_type;

    class Get_geometric_representation
    {
    public:
        template <class ParticleType>
        inline const GeometricRepresentation& operator()(const ParticleType& A) const
        {
            return A;
        }
    };//end class Get_geometric_representation
};//end class T_Geometric_particle_traits

While the three main types discussed in section Developing new models of ParticleTraits concept are defined in this header file, the serialization of the default type (the weighted point) is included from another header file (SBL/IO/Weighted_point_serialize.hpp), and weighted points are drawable by default using the package Molecular_viewers.

There is an additional type SBL::Models::T_Geometric_particle_traits::Annotator_default that is not a requirement but that is used for simplifying the code when instantiating an application with several models of ParticleTraits.