Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
User Manual

Authors: F. Cazals and T. Dreyfus and S. Loriot

Space_filling_model_surface_volume

Goals: Computing Surfaces, Volumes, and Packing Properties

Surface areas and volumes of molecular models are fundamental quantities in structural bioinformatics. Two widely used models are the van der Waals model and the solvent accessible model. The programs of Space_filling_model_surface_volume compute the surface area and the volume of a model defined by a union of balls, following these two models. The underlying algorithm partitions the volume into the restrictions of the balls to their respective Voronoi cells [29], exploiting properties of the $ \alpha $-complex of the balls. Two points need to be stressed.

Surface area and volume computations are numerically challenging, due to the geometric constructions involved. Unlike all previous approaches, which have systematically used fixed precision floating point (FP) calculations, the programs of Space_filling_model_surface_volume use advanced numerics (interval arithmetic and calculations on degree two algebraic numbers) to certify the surface areas and volumes reported. That is, all results are returned in the form of an interval of FP numbers, which is certified to contain the true result. This strategy has been used to show that (i) calculations based solely on FP numbers systematically incur errors, typically of 20%, and (ii) the running time overhead of the applications of Space_filling_model_surface_volume is a factor of about two.

The computation of restrictions being based on the $ \alpha $-complex of the balls processed, it is possible to compute along the way information on cavities of the model.

To recap, the main features of Space_filling_model_surface_volume are as follows:

  • Calculation of surface areas and volumes for a whole molecular model, or on a per ball basis – see package Union_of_balls_surface_volume_3. For the particular case of an atomic model of a protein, surface areas and volumes are also reported on a per residue level.

  • Calculation of the Betti numbers $ \beta_0, \beta_1, \beta_2 $ of the union of balls, which respectively characterize the number of connected components, the number of tunnels, and the number of cavities (i.e. pockets inside the molecule) – see package Betti_numbers.

  • Dissection of the surface area into its connected components. In particular, for a connected molecule, the surface areas of the cavities are returned – see package Union_of_balls_surface_volume_3.

  • Computation of the volume of the cavities, if any – see package Union_of_balls_surface_volume_3.
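
As a rough illustration of what $ \beta_0 $ measures, the sketch below counts the connected components of a union of balls with a union-find over pairs of intersecting balls. This is not the method of the package Betti_numbers, which works on the $ \alpha $-complex; the function name `count_components` and the tuple layout `(x, y, z, r)` are assumptions made for this example.

```python
import math

def count_components(balls):
    """Return beta_0 of a union of balls given as (x, y, z, r) tuples.

    Two balls lie in the same component when they intersect, i.e. when the
    distance between their centers is at most the sum of their radii.
    """
    parent = list(range(len(balls)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(balls)):
        for j in range(i + 1, len(balls)):
            xi, yi, zi, ri = balls[i]
            xj, yj, zj, rj = balls[j]
            if math.dist((xi, yi, zi), (xj, yj, zj)) <= ri + rj:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(balls))})
```

For instance, two overlapping balls together with one isolated ball form two connected components.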

    A union of five balls in 3D.

    This union decomposes into five restrictions, each bounded by spherical caps and planar faces. The 1-skeleton of the restrictions is represented by green curves.

    The software Vorlume used to be distributed as a stand-alone program. It is now embedded in the Space_filling_model_surface_volume application of the SBL.


Using Vorlume: Computing Surface Area and Volume of a Family of 3D Balls

This section presents the program sbl-vorlume-txt.exe.

Pre-requisites

We consider a collection of $ n $ balls, namely $ \{B_i\}_{i=1,\dots,n} $. We assume that the coordinates of the center and the radius of each ball are rational numbers. We wish to compute the volume of the domain $ \mathcal{F} = \bigcup_{i=1}^{n} B_i $, as well as the surface area of its boundary $ \partial \mathcal{F} $.

We denote by $ V_i $ the Voronoi region of ball $ B_i $ in the Voronoi (power) diagram of the balls, and we define the restriction $ R_i $ of the ball as its intersection with its Voronoi region, that is $ R_i = V_i \cap B_i $. A restriction is termed exposed if it contributes to the boundary of the union.

As proved in [29] and illustrated in Fig. Restrictions, it is sufficient to compute the restrictions to report surface areas and volumes:

  • The volume of the union of balls is the sum of the volumes of the restrictions.
  • The boundary of the union is the union of the boundaries contributed by the restrictions.
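
The partition into restrictions can be illustrated numerically. The sketch below assigns a point to the restriction of the ball minimizing the power distance $ \|p - c_i\|^2 - r_i^2 $, and estimates volumes by counting the cells of a regular grid. It is a coarse, uncertified approximation meant only to illustrate the two facts above, not the algorithm of [29]; the names `restriction_index` and `grid_volumes`, the `(center, radius)` layout, and the grid step are assumptions of this example.

```python
import math

def restriction_index(p, balls):
    """Index i of the restriction R_i containing p, or None outside the union.

    balls is a list of (center, radius) pairs, center being an (x, y, z) tuple.
    """
    powers = [sum((a - b) ** 2 for a, b in zip(p, c)) - r * r for c, r in balls]
    i = min(range(len(balls)), key=powers.__getitem__)
    # p lies in the union iff its smallest power distance is non-positive.
    return i if powers[i] <= 0 else None

def grid_volumes(balls, h=0.05):
    """Estimate the volume of each restriction from the cell centers of a grid of step h."""
    lo = [min(c[k] - r for c, r in balls) for k in range(3)]
    hi = [max(c[k] + r for c, r in balls) for k in range(3)]
    n = [math.ceil((hi[k] - lo[k]) / h) for k in range(3)]
    volumes = [0.0] * len(balls)
    for ix in range(n[0]):
        for iy in range(n[1]):
            for iz in range(n[2]):
                p = (lo[0] + (ix + 0.5) * h,
                     lo[1] + (iy + 0.5) * h,
                     lo[2] + (iz + 0.5) * h)
                i = restriction_index(p, balls)
                if i is not None:
                    volumes[i] += h ** 3
    return volumes
```

For two unit balls whose centers are at distance one, the exact volume of the union is $ 9\pi/4 $; the sum of the two estimated restriction volumes approaches this value as the grid step decreases.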

These facts also account for the difficulty of computing surfaces and volumes, from a numerical standpoint. To understand why, it should be observed that two types of points are found on the boundary of a restriction:

  • points found at the intersection of three Voronoi facets. Such points are weighted circumcenters of the Delaunay triangulation of the input balls. These points have rational coordinates if one assumes that the input balls have a rational specification.
  • points on the boundary of the union of balls, $ \partial \mathcal{F} $, found at the intersection of three spheres. The coordinates of these points are degree two algebraic numbers.
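
To see why the second type of point involves degree two algebraic numbers, one can subtract the sphere equations pairwise: this gives two planes with rational coefficients, whose common line is then intersected with one sphere, yielding a quadratic equation with rational coefficients. The sketch below carries out these steps with exact rationals; the function name `three_sphere_quadratic` and its return convention are assumptions made for illustration, not part of the SBL.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def three_sphere_quadratic(s1, s2, s3):
    """Spheres are (center, radius) pairs with rational data and non-collinear centers.

    Returns (a, u, A, B, C): the intersection points are a + t*u, where t is a
    root of A*t^2 + B*t + C = 0. All returned quantities are rational, so the
    coordinates of the points lie in Q(sqrt(B^2 - 4*A*C)).
    """
    spheres = [(tuple(Fraction(x) for x in c), Fraction(r)) for c, r in (s1, s2, s3)]
    (c1, r1), (c2, r2), (c3, r3) = spheres

    def plane(c, r):
        # Subtracting the equations of sphere 1 and sphere (c, r) cancels the
        # quadratic terms and leaves the plane n . p = d.
        n = tuple(2 * (x - y) for x, y in zip(c1, c))
        d = dot(c1, c1) - r1 * r1 - dot(c, c) + r * r
        return n, d

    (n1, d1), (n2, d2) = plane(c2, r2), plane(c3, r3)
    u = cross(n1, n2)                 # direction of the line common to both planes
    uu = dot(u, u)
    w1, w2 = cross(n2, u), cross(u, n1)
    a = tuple((d1 * x + d2 * y) / uu for x, y in zip(w1, w2))  # a point on that line
    ac = tuple(x - y for x, y in zip(a, c1))
    # Substituting p = a + t*u into the equation of sphere 1.
    return a, u, uu, 2 * dot(u, ac), dot(ac, ac) - r1 * r1
```

For three unit spheres centered at (0,0,0), (1,0,0) and (0,1,0), the line is $ x = y = 1/2 $ and the discriminant is 32, so the two intersection points are $ (1/2, 1/2, \pm\sqrt{2}/2) $: rational coordinates except for a single square root.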

The coordinates of these points appear in the formulae for the volume and surface area of a restriction, whence the need to compute them accurately. In a nutshell, these points can be handled in two ways, namely using interval arithmetic or using exact number types (rational numbers or algebraic numbers). The former is a fast strategy providing coarse approximations of the coordinates, while the latter provides more precise approximations, at the cost of more intensive calculations. Combining these strategies, we define the following three levels of precision:

  • 1: approximating boundary points and weighted circumcenters using interval arithmetic;
  • 2: approximating boundary points from their exact counterparts and weighted circumcenters using interval arithmetic;
  • 3: approximating boundary points and weighted circumcenters from their exact counterparts.

In any case, a surface area or a volume is returned as an interval certified to contain the exact unknown value. See [29] for the typical width of these intervals.
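
The idea of returning a certified interval can be sketched with a toy interval type. The class below widens every result by one floating point step in each direction, which is enough to enclose the rounding error of a single addition or multiplication (overflow aside). It is a simplified stand-in, not the number types actually used by the SBL, and the class name `Interval` is hypothetical. It assumes Python 3.9 or later for `math.nextafter`.

```python
import math
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] of floats meant to enclose an exact real value."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @classmethod
    def from_rational(cls, q):
        # float(q) is the nearest double, so q lies within one step of it.
        x = float(q)
        return cls(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    @staticmethod
    def _widen(lo, hi):
        # Round-to-nearest is off by at most half a step: one step outward is safe.
        return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

    def __add__(self, other):
        return self._widen(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return self._widen(min(products), max(products))

    def contains(self, q):
        # Fraction(float) is exact, so this comparison involves no rounding.
        return Fraction(self.lo) <= q <= Fraction(self.hi)

    def width(self):
        return self.hi - self.lo
```

For example, the sum of the intervals built from 1/10 and 2/10 is guaranteed to contain 3/10, although the plain floating point sum 0.1 + 0.2 differs from 0.3.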

Input : Specifications and File Types

The main input of sbl-vorlume-txt.exe is a collection of balls, from which the (power) Voronoi diagram is computed. The input is a text file listing the balls. A basic calculation is launched as follows:

> sbl-vorlume-txt.exe -f data/spheres.txt --directory results --verbose --output-prefix --log --boundary-viewer vmd
The main options of the program sbl-vorlume-txt.exe are:
-f string: file listing the input family of 3D balls
--boundary-viewer string(=none): create a file for visualizing the boundary arcs with vmd or pymol
-E: switch to more exact but longer calculations


The option -E switches to a computation mode using an exact representation for the intersection points on the boundary of the union of balls, instead of a representation based on interval arithmetic. This yields more precise surface areas and volumes, at the cost of a longer computation; see package Union_of_balls_surface_volume_3 for more details.


File Name | Description
3D Spheres text file | A list of 3D spheres (a line contains the coordinates and the radius of one sphere)

Input files for the run described in section Input : Specifications and File Types.
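
A spheres file can be produced or checked with a few lines of scripting. The sketch below assumes that each line holds four whitespace-separated numbers, the three coordinates of the center followed by the radius; this ordering and the helper names `write_spheres` and `read_spheres` are assumptions, so the exact layout expected by the loader should be checked against data/spheres.txt.

```python
def write_spheres(path, spheres):
    """Write (x, y, z, r) tuples, one sphere per line (assumed layout: x y z r)."""
    with open(path, "w") as out:
        for x, y, z, r in spheres:
            out.write(f"{x} {y} {z} {r}\n")

def read_spheres(path):
    """Read a spheres file back, skipping blank lines."""
    spheres = []
    with open(path) as src:
        for number, line in enumerate(src, start=1):
            fields = line.split()
            if not fields:
                continue
            if len(fields) != 4:
                raise ValueError(f"line {number}: expected 4 values, got {len(fields)}")
            x, y, z, r = map(float, fields)
            if r < 0:
                raise ValueError(f"line {number}: negative radius {r}")
            spheres.append((x, y, z, r))
    return spheres
```

Reading a file back right after writing it is a cheap way to catch malformed lines before launching a calculation.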

Output : Volume and Surface Areas

The main output is the list of volumes and surface areas per restriction of ball. When a ball is associated with an atom, information on the corresponding atom is also attached. By default, the output volumes and surface areas are represented as floating point numbers with ten digits; one can switch to the interval representation with the option -I, and change the number of digits using -D <value>. All these data are stored in an XML file for easy parsing using PALSE. An example of serialization is available in the user manual of Multiple_archives_serialization.

The sums of all volumes and areas are also recorded in the log, together with a number of statistics (the number of simplices in the underlying triangulation, the number of faces on the boundary of the union of balls, the Betti numbers, etc.).

Module | File Name | Description
General | Log file | Log file containing high level information on the run of sbl-vorlume-txt.exe
Boundary | Boundary vmd file | Visualization state file of VMD for the halfedges of the boundary of the input 3D balls
Surface / volume | Surfaces-Volumes XML file | Description of the volumes and surface areas of the input txt file

Output files for the runs described in section Input : Specifications and File Types, classified by modules (see Fig. fig-vorlume-workflow).

For visualizing the halfedges of the boundary, we recommend installing VMD (see the VMD web site): first load the input PDB file, then load the output visualization state files. Loading visualization state files may take a while, but is much faster with the fastload VMD script delivered with the library (in the scripts/vmd directory).

Using Vorlume: Computing Surface Area and Volume of a Molecule

This section presents the program sbl-vorlume-pdb.exe. The pre-requisites being identical to those of sbl-vorlume-txt.exe, the reader is referred to section Pre-requisites.

We just note that for the particular case of an atomic model of a protein or protein complex, surface areas and volumes are reported on a per atom basis, and also for individual residues, in two different files.

Input : Specifications and File Types

The main input of sbl-vorlume-pdb.exe is a molecular structure loaded from a PDB file. In this case, the ESBTL library is used to deduce the set of balls corresponding to the molecule. Various options are available for tuning the set of balls representing the molecule (see the class SBL::Models::T_PDB_file_loader). Note that a default value of $ 1.4 \AA $ is added to the radius of all atoms to define the Solvent Accessible Model of the input molecule. A basic calculation is launched as follows:

> sbl-vorlume-pdb.exe -f data/1vfb.pdb --directory results --verbose --output-prefix --log --boundary-viewer vmd
The main options of the program sbl-vorlume-pdb.exe are:
-f string: PDB file of the molecule to coarse grain
--boundary-viewer string(=none): create a file for visualizing the boundary arcs with vmd or pymol
-E: switch to more exact but longer calculations
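
The solvent accessible model is obtained from the van der Waals model by adding the same value to every radius. A minimal sketch, assuming balls stored as `(x, y, z, r)` tuples with lengths in Angstroms; the helper name `expand_radii` and the constant name are hypothetical:

```python
DEFAULT_EXPANSION = 1.4  # in Angstroms, the default value quoted in this section

def expand_radii(balls, expansion=DEFAULT_EXPANSION):
    """Return the balls of the solvent accessible model: same centers, radii + expansion."""
    return [(x, y, z, r + expansion) for x, y, z, r in balls]
```

Passing `expansion=0` leaves the van der Waals radii unchanged.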


File Name | Description
1vfb PDB file | PDB file of an antibody-antigen complex

Input files for the run described in section Input : Specifications and File Types.

Output : Volume and Surface Areas per Particle

The output is identical to that of sbl-vorlume-txt.exe, except that the volumes are classified by particle.

Module | File Name | Description
General | Log file | Log file containing high level information on the run of sbl-vorlume-pdb.exe
Boundary | Boundary vmd file | Visualization state file of VMD for the halfedges of the boundary of the input 3D particles
Surface / volume | Surface-Volumes XML file | Description of the volumes and surface areas of the input PDB file
Surface / volume | Surface-Volumes per Residue XML file | Description of the volumes and surface areas per residue in the input PDB file

Output files for the runs described in section Input : Specifications and File Types.

Visualization, Plugins, GUIs

The SBL provides VMD and PyMOL plugins to use the programs of Space_filling_model_surface_volume. The plugins are accessible in the Extensions menu of VMD or in the Plugin menu of PyMOL. Upon termination of a calculation launched by the plugin, the following visualizations are available:

  • (for VMD only) the edges (circle arcs) found on the boundary of the union of balls representing the atoms.

The new command "sbl_vorlume" is also defined: given a selection of atoms, it returns the volume and the solvent accessible surface area of this selection. For the sake of illustration, let us assume that the selection consists of all atoms. For VMD, the syntax is:

> sbl_vorlume [atomselect top "all"]

For PyMOL, the syntax is:

> sbl_vorlume("all")

Programmer's Workflow

The programs of Space_filling_model_surface_volume described above are based on generic C++ classes, so that additional versions can easily be developed.

In order to derive such versions, there are two important ingredients: the workflow class and the traits class.

The Traits Class

T_Space_filling_model_surface_volume_traits:

This class defines the main types used in the modules of the workflow. It is templated by the classes of the concepts required by these modules. This design makes it possible to use the same workflow within different (biophysical) contexts to make new programs. To use the workflow T_Space_filling_model_surface_volume_workflow, one needs to define:

  • what is a particle, and how to represent it,
  • how to load particles from an input file,
  • how to annotate the particles,
  • how to iterate over the set of particles,
  • how to build a particle if non-trivial steps are required (e.g., building pseudo-atoms from residues in a protein).

Template Parameters

  • ParticleTraits: Model of the concept ParticleTraits.
  • MolecularGeometryLoader: Follows the concept MolecularGeometryLoader; it can be either SBL::Models::T_PDB_file_loader when the input is a PDB file, or SBL::Models::T_Spheres_3_file_loader when the input is a file listing spheres.
  • ParticleAnnotator: Follows the concept ParticleAnnotator.
  • ParticlesContainer: Type of container for particles used in the $ \alpha $-complex module. This is used in particular for iterating either over particles inserted in this container, or over an existing container that needs to be wrapped by this data structure. The type ParticlesContainer just needs to define the methods begin() and end(), and the method push_back(const_reference).
  • ParticlesBuilder: Functor building the input particles, as defined in the concept ParticleTraits.

The Workflow Class

T_Space_filling_model_surface_volume_workflow: