Structural Bioinformatics Library
Template C++ / Python API for developping structural bioinformatics applications.
User Manual

Authors: F. Cazals and T. Dreyfus

Space_filling_model_interface

Goal: Dissecting Macro-molecular Interfaces

In the sequel, we consider a molecular structure which can be decomposed into molecular systems, called systems for short. In the simplest case, each system corresponds to one partner in a complex. In a more general setting, as we shall see, a system provides a hierarchical decomposition of a partner. Note that the notion of system is oblivious to the precise geometric representation: in short, a system is defined by a collection of balls, which may be atoms or pseudo-atoms. For the sake of generality, these balls are called particles. At this stage, we just notice that when the particles represent atoms, the geometric model used is the solvent accessible model.

In this context, Space_filling_model_interface is a package of applications computing a parameter free representation of macro-molecular interfaces between systems, based on the $ \alpha $-complex of the particles. The benefits of such a model, over classical ones such as the distance footprint (the sets of particles within a prescribed distance threshold), or the particles losing solvent accessibility have been discussed in detail in the literature (see [28] , [32] ) and can be summarized as follows:

  • The interface model is parameter free;
  • It detects pairs of interface particles, either from the partners forming a complex, or from mediators (e.g. water molecules squeezed in-between the partners);
  • It defines a cell complex (a union of surfaces) separating the partners, and amenable to a precise description in terms of patches (connected components), surface area, curvature, particle composition;
  • It qualifies interface particles and Voronoi tiles as a function of their depth at the interface, thus generalizing the classical core-rim models.

These features are bound to two geometric constructions, namely the Voronoi diagram of the particles, and the associated $ \alpha $-complex. In particular, the rich mathematical structure of these constructions allows deriving all the aforementioned parameters without any third party algorithm or construction. These parameters proved instrumental to unveil novel insights on the location of conserved residues at interfaces, on the polarity of interface residues, or on interfacial water dynamics that were obtained in [18] , [25] , [28] .

Space_filling_model_interface is based on generic c++ code, so that new applications are easily obtained for specific biophysical models, e.g. for specific atomic or coarse-grain models. By default, the following atomic resolution applications of Space_filling_model_interface are provided:

  • $ \text{\intervorEABW} $ , to model atomic resolution binary complexes which may involve interfacial water molecules.
  • $ \text{\intervorEIGAGW} $ , to model atomic resolution immunoglobulin - antigen complexes (IG-Ag), the IG being decomposed hierarchically into its heavy and light chains.

Applications of Space_filling_model_interface for coarse-grain models with one or two pseudo-atoms per residue is also provided.

Using Intervor: Dissecting the Interface of a Binary Complex

In this section, we introduce the basic application of Space_filling_model_interface using the program $ \text{\intervorEABW} $, which is used to model the interaction between two partners $ A $ and $ B $, mediated by interfacial water molecules $ W $. The input is assumed to be a high resolution crystal structure of the complex, so that the particles represent atoms, using their atomic group radii.

We represent this atomic structure by its solvent accessible model, that is, the radius of each particle is enlarged by the radius of a water probe, typically $ r_w=1.4\AA $.

These constructions are illustrated for the case of an IG (partner $ A $) - Ag (partner $ B $) complex whose interaction is mediated by water molecules ( $ W $).

Pre-requisites

Molecular System's Labels

To specify a binary complex, we need to identify the partners and qualify their geometry: these two aspects correspond to the specification of molecular systems and of a molecular geometry.

The molecular systems' decomposition refers to the assignment of labels to the particles. Each molecular system actually has a label falling on one of the three following categories:

  • The partners' labels: they correspond to the particles of the molecules whose interaction is under scrutiny. For a binary complex, there are two labels, which we call $ A $ and $ B $ by default.
  • The mediators' labels: they correspond to particles sandwiched in-between the partners. The most common mediator is the water molecule, generally called mediator $ W $. Since, as noticed above, we work with group radii to account for the absence of hydrogen atoms, a water molecule reduces to its oxygen atom.
  • The extras' labels: they correspond to the remaining particles (e.g. ions, metabolites). If any, all extra particles may be labeled $ X $.

Geometric Model

The molecular geometry refers to the geometric representation of the complex studied. In this case, since we take as input a high resolution crystal structure, we use a solvent accessible model. This model is used to build two geometric structures:

  • the regular triangulation of the particles, from which the dual Voronoi diagram can be obtained;
  • the $ \alpha $-complex associated to this regular triangulation, for $ \alpha = 0 $.

For a thorough description of these construtions and of the software used, the reader is referred to the manuals of the CGAL library , and in particular to the Regular_triangulation_3 and Alpha_shape_3 packages.

Voronoi Interfaces

In this section, we sketch the interface model, which is precisely exposed in [32] , and which is illustrated on Fig. Voronoi interfaces.

Assume that the particles from the molecular complex studied have been assigned labels. In the sequel, we focus on edges of the $ 0 $-complex such that the two particles found at the endpoints of each such edge have different labels.

An interface mediator particle is a particle whose label is of the mediator type, and which is sandwiched in-between the partners, in the following sense: this particle makes at least one edge with each partner in the $ 0 $-complex.


The typical case is that of water molecules: while water molecules sandwiched in-between the partners at the interface are of interest to study the interaction, those only bound to one partner are not — see Fig. Voronoi interfaces (b). Thus, in the sequel, mediator only refers to oxygen atom of a sandwiched water molecule, as just defined.

An interface edge is an edge of the $ 0 $-complex such that the labels of its two particles are different, and are either both partner labels, or partner label and mediator label.


An interface Voronoi facet is the dual in the Voronoi diagram of an interface edge of two particles.


The Voronoi interface associated to two labels is the set of Voronoi facets dual of the interface edges. Our focus is actually on three types of interfaces:

  • the bicolor interface(s), corresponding to Voronoi interfaces between the partners. On our example, there is one such interface, the $ AB $ interface.
  • the mediated interfaces, corresponding to Voronoi interfaces between any partner and the mediator. On our example, there are two such interfaces, namely $ AW $ and $ BW $. Gluing mediated interfaces $ AW $ and $ BW $ results on the mediated interface $ AW-BW $.
  • the tricolor interfaces, corresponding to the merge of bicolor and mediated interfaces of the partners and mediator. On our example, there is one such interface, namely $ ABW=AB\cup (AW-BW) $.
These notions are illustrated on Fig. Voronoi interfaces. Note in particular that on the toy system of Fig. Voronoi interfaces (a), there is only one mediator particle (in grey) at the interface, namely $ w_1 $. Note also that on the real IG-Ag complex of Fig. Voronoi interfaces (b c d), the interface $ AB $ has three connected components, which get sealed by the interfaces involving the water molecules.


Voronoi Interfaces

(a) 2D illustration (b c d) Illustration for an IG-Ag complex, whose interaction is mediated by water molecules.

(a) The Voronoi diagram of four balls (one red particle, one blue particle, two gray water molecules is shown in solid lines). Each ball has been clipped to its Voronoi region, defining its Voronoi restriction . The $ \alpha $-complex, for $ \alpha=0 $, identifies pairs of restrictions which intersect, here three pairs: $ (a_1,b_1) $ defines the $ AB $ interface between the two partners, $ (a_1,w_1) $ and $ (b_1,w_1) $ respectively define the $ AW $ and $ BW $ interfaces. (b) Side view of the IG-Ag complex, PDB code 1vfb, with the interface particles identified. Note the water molecules (gray) squeezed in-between the two partners. (c) The bicolor interface $ AB $ features one Voronoi facet for each pair of particles labelled $ A $ and $ B $ found in the $ 0 $-complex. Each such Vononoi facet corresponds to a direct interaction between the partners. Note also the interfacial water molecules punching holes at this interface. (d) The tricolor interface, defined as the union of the $ AB $ and $ AW-BW $ interfaces. The Voronoi facets have been color coded as a function of their distance to the boundary of the Voronoi interface.

Input: Specifications and File Types

The IG found in file 1vfb.pdb consists of two chains tagged as A and B in the PDB file, while the antigen (lysozyme) is chain C. The main workflow of the $ \text{\intervorEABW} $ program is shown on Fig. Workflow. Note that a default radius of $ 1.4 \AA $ is added to all atoms to define the Solvent Accessible Model of the input molecule. Thus, the calculation is launched as follows:

> sbl-intervor-ABW-atomic.exe -f data/1vfb.pdb -P AB -P C --directory results --verbose --output-prefix --log --interface-viewer vmd
The main options of the program $ \text{\intervorEABW} $ are:
-f string: PDB file of the input molecule
-P string: specifiation of the chains of one partner (use it twice for both partners)
–interface-viewer string(=none): create a file for visualizing the interfaces with vmd or pymol


File Name

Description

1vfb pdb file

Immunoglobulin-antigen complex

Input files for the run described in section Input: Specifications and File Types .

Output: Interface Particles and Voronoi Interfaces

PreviewFile Name

Description

General: log file

Log file

Log file containing high level information on the run of $ \text{\intervorEABW} $

Module structure classifier: molecular system's labels and geometric model

AB structure classifier xml file

XML archive listing particles per partner and contacts between partners

Module molecular interfaces builder: Voronoi interfaces

AB Voronoi Interfaces xml file

XML archive containing the serialized Voronoi interfaces between A and B

Click it AB Voronoi Interfaces VMD file

Visualization state file of the Voronoi interfaces between A and B for the VMD software

Output files for the runs described in section Input: Specifications and File Types, classified by modules – see Fig. fig-intervor-workflow .

For visualizing the Voronoi Interfaces, we recommend you to install VMD (see the VMD web site): first load the input PDB file, then load the output visualization state files. The loading of visualization state files may be long, but is a lot faster using the fastload VMD script delivered with the library (in the scripts/vmd directory).

Analysis: Interface Statistics with PALSE

Bicolor and mediated interfaces come with a number of built-in geometric and topological statistics, which are provided per connected components — see [32] for details:

  • Topology: number of patches (i.e connected components or c.c.), holes in the patches;
  • Geometry: number of interface edges, surface area and curvature of the Voronoi facets, buried surface area of the interface atoms, length of the perimeter of the Voronoi interface.
  • Biochemistry: number of particles per patch, tagging of these particles as a function of their types and properties–the PDB atom names and user defined annotations as explained in ParticleAnnotator.

Tricolor interfaces also have specific built-in statistics:

  • Topology: variation of the number of patches upon merging the bicolor and mediated interfaces, as the latter typically bridge gaps between connected components of bicolor interfaces.
  • Shelling order: an interface can be shelled into concentric shells, so as to define a depth of the particles at the interface, thus going beyond the traditional dissection of interface particles into a core and a rim.

Assuming that we run intervor over a number of binary complexes, one would want to know statistics about the number of interface particles, or the interface surface area across the complexes. The following python script uses PALSE over the generated xml files to get these statistics:

from PALSE import *

data_base = PALSE_xml_DB()
data_base.load_from_directory(".")
number_of_atoms_at_AB = data_base.get_leftmost_data_values_from_database("AB_interface_output/AB_interface_statistics/number_of_particles", int)
PALSE_statistic_handle.Rhist2d(number_of_atoms_at_AB, "hist-interface-atoms-AB.eps")

surface_areas_at_AB_with_A = data_base.get_leftmost_data_values_from_database("AB_interface_output/total_area_with_A", float)
surface_areas_at_AB_with_B = data_base.get_leftmost_data_values_from_database("AB_interface_output/total_area_with_B", float)

surface_areas_at_AB = []
for i in range(len(surface_areas_at_AB_with_A)):
    surface_areas_at_AB.append(surface_areas_at_AB_with_A[i] + surface_areas_at_AB_with_B[i])
PALSE_statistic_handle.Rhist2d(surface_areas_at_AB, "hist-interface-areas-AB.eps")

Using Intervor: Dissecting the Interface of a Complex whose Partners have a Hierarchical Structure

This section presents the program $ \text{\intervorEIGAGW} $.

Motivation

The case of binary interactions mediated by water molecules easily generalizes to the case where each of the partners has a hierarchical structure i.e. can be partitioned into sub-systems.

Consider e.g. an immunoglobulin whose isotype is IgD or IgE or IgG (the IG involves one monomer of two heavy and two light chains, see e.g. the Wikipedia Antibody or the ImMunoGeneTics information system (IMGT) pages), and assume this IG makes up an IG-Ag complex. Such an IG may be considered in three guises:

  • it may be decomposed into its heavy and light chains;
  • or it may be decomposed even further by partitioning each variable domain into three Complementary Determining Regions (CDR) and four framework regions.

Thus, if this IG is involved in an IG-Ag complex, this interaction may be studied at three different levels. As a pre-requisite to do so, we introduce hierarchical decompositions of molecular systems.

Pre-requisites

Molecular System's Labels Forest

To study interfaces, the complex under scrutiny admits a hierarchical decomposition into molecular systems, each of them being identified by a label. That is, we assume that the trees encoding these labels correspond to the partners, the mediators, and the extras. Recall that the molecular system associated with a label $ l $ is denoted system( $ l $).

Note that each category of labels (the partners, the mediators and the extras) yields a forest of labels whose leaves are called primitive labels. The forest is possibly reduced to a tree, itself possibly reduced to a singleton.

In the following, we call $ p $ the number of primitive partners and $ m $ the number of primitive mediators, and we let $ X $ stand for the extra particles. By default, these extra particles are identified but not included in the geometric model of the complex. Note also that in this example, there is a single mediator, namely water.

Primitive and Hierarchical Interfaces

The Voronoi interfaces involving only primitive labels are called primitive interfaces , while all other Voronoi interfaces are called hierarchical interfaces . As illustrated on Fig. Hierarchical interfaces, the two goals are:

  • To provide information on the $ \binom{p}{2} $ primitive bicolor interfaces. Note that for a pair of partners' labels having a common ancestor $ l $, the corresponding information corresponds to contacts within the system( $ l $).
  • To provide multi-scale information, exploiting the hierarchical structure of each hierarchical label.

Assuming that the partners' forest and the mediators' forest are given, we first compute all the primitive interfaces in three steps, illustrated on Fig. Hierarchical interfaces for the case of an IG-Ag complex:

  • First, each particle is assigned its primitive label.
  • Second, a primitive bicolor interface is computed for each couple out of the $ \binom{p}{2} $ pairs involving two primitive partners, and a primitive mediated interface is computed for each triple out of the $ m\binom{p}{2} $ triples involving two primitive partners and one primitive mediator.
  • Third, $ m\binom{p}{2} $ primitive tricolor interfaces are created, by merging the appropriate primitive bicolor and mediated interfaces.
    The three kind of primitive interfaces (bicolor, mediated, tricolor) can also be defined for the hierarchical labels, exploiting the hierarchy induced by the partners' forest and the mediators' forest. We illustrate this process for our IG-Ag complex.
The previous mechanism is illustrated on the IG-Ag complex, which involves three primitive partner labels, namely $ H, L, Ag $. These labels result in three primitive bicolor and three primitive mediated interfaces, as represented on the matrices of Fig. Hierarchical interfaces.


Note that the interfaces provide information on intra-molecular contacts thanks to the interface $ LH $. Note also that the hierarchical interfaces involving $ IG $ and $ Ag $ are decomposed as follows:
Hierarchical bicolor interface: $ IGAg = (L \cup H)Ag = LAg \cup HAg $
Hierarchical mediated interface: $ IGW-AgW =(L \cup H)W-AgW = (LW-AgW) \cup (HW-AgW) $
Hierarchical tricolor interface: $ IGAgW = IGAg\cup (IGW-AgW) $"

Hierarchical interfaces

Decomposition of the IG-Ag complex into molecular systems and hierarchical interfaces (Top Left) Schematic representation of an IG-Ag complex with the different labels of the molecular systems involved. (Top Right) Corresponding partners' and mediators' forests. Note that the partners' forest has two trees. (Bottom) Representation of all possible primitive interfaces (bicolor and mediated on the matrix, tricolor under the matrix).

Input: Specifications and File Types

We use again the 1vfb immunoglobulin, but now, in addition to the previous results, we want to investigate the interfaces between the heavy and light chains of 1vfb. The syntax is almost identical to the one we saw before, except that the heavy and light chains have to be specified separately, namely: -igL A -igH B -ag C).

> sbl-intervor-IGAgW-atomic.exe -f data/1vfb.pdb --igL A --igH B --ag C  --directory results --verbose --output-prefix --log --interface-viewer vmd
The main options of the program $ \text{\intervorEIGAGW} $ are:
-f string: PDB file of the input molecule
–igL string: specifiation of the light chains of the immunoglobulin
–igH string: specifiation of the heavy chains of the immunoglobulin
–ag string: specifiation of the chains of the antigen
–interface-viewer string(=none): create a file for visualizing the interfaces with vmd or pymol


File Name

Description

1vfb pdb file

Immunoglobulin-antigen complex

Input files for the run described in section Input: Specifications and File Types .

Output: Primitive and Hierarchical Interfaces

PreviewFile Name

Description

General

Log file

Log file containing high level information on the run of $ \text{\intervorEIGAGW} $

Structure classifier

IGAgW structure classifier xml file

XML archive listing particles per partner and contacts between partners

Molecular interfaces builder

AgIG Voronoi Interfaces xml file

XML archive containing the serialized Voronoi interfaces between Ag and IG

Click it AgIG Voronoi Interfaces VMD file

Visualization state file of the Voronoi interfaces between Ag and IG for the VMD software

AgH Voronoi Interfaces xml file

XML archive containing the serialized Voronoi interfaces between the heavy chain of IG and Ag

Click it AgH Voronoi Interfaces VMD file

Visualization state file of the Voronoi interfaces between the heavy chain of IG and Ag for the VMD software

AgL Voronoi Interfaces xml file

XML archive containing the serialized Voronoi interfaces between the light chain of IG and Ag

Click it AgL Voronoi Interfaces VMD file

Visualization state file of the Voronoi interfaces between the light chain of IG and Ag for the VMD software

LH Voronoi Interfaces xml file

XML archive containing the serialized Voronoi interfaces between the light and heavy chains of IG

Click it LH Voronoi Interfaces VMD file

Visualization state file of the Voronoi interfaces between the light and heavy chains of IG for the VMD software

Output files for the runs described in section Input: Specifications and File Types.

The generated log is identical to that generated by $ \text{\intervorEABW} $ , except that there are statistics for each non-empty (bicolor, mediated or tricolor) interface. Furthermore, for each couple of partners with a non-empty bicolor interface, one triple of files (XML, VMD and PDB) is generated.

Visualization, Plugins, GUIs

The SBL provides VMD and PyMOL plugins to use the programs of Space_filling_model_interface . The plugins are accessible in the Extensions menu of VMD or in the Plugin menu of PyMOL . Upon termination of a calculation launched by the plugin, the following visualizations are available:

  • the Voronoi (bicolor, mediated and tricolor) interfaces,
  • the selection of atoms involved in the different interfaces,
  • (optionally) the Voronoi tricolor interface colored by shelling order.

Algorithms and Methods

The algorithm used in the applications of Space_filling_model_interface are standard ones, which we only briefly sketch in the sequel — see [32] for a detailed treatment.

Contacts

The classification of contacts between particles relies on the $ \alpha $-complex of the particles for $ \alpha $=0. More precisely, the only information required is the status (singular, regular, interior) of an edge of the regular triangulation of the particles in the $ 0 $-complex. See the Alpha_shape_3 package from CGAL for further details.

Interfaces

From a topological standpoint, the bicolor and mediated interfaces are so-called manifold interfaces [87] . In a nutshell, this means that a triangle from the regular triangulation contributes at most two edges to this interface. By the Voronoi - Delaunay duality, this also means that the topological neighborhood of a point located in the relative interior of a Voronoi edge bounding an interface Voronoi facet is either a topological disk or half a topological disk. Note that a tricolor interface may not be a manifold interface.

From an algorithmic standpoint, the computation of primitive interfaces relies on two algorithms:

  • The basic construction is that of a manifold interface. To enumerate the connected components of a manifold interface, we perform a breadth first exploration of the Voronoi facets of that interface. Using the duality between a Voronoi diagram and the associated Delaunay (regular) triangulation, this only requires collecting the interface Delaunay edges contributed by a triangle from the $ \alpha $-complex.

In particular, bicolor and mediated interfaces are manifold interfaces. See [87] for the details.

  • To build tricolor interfaces from bicolor ones, bicolor interfaces are build first, and are then merged using the Union-Find algorithm.
  • The treatment of hierarchical interfaces is also straightforward. For example, a hierarchical bicolor interface remains manifold: the primitive manifold interface of $ AB $ obtained by considering $ A $ as a primitive partner is exactly the same than the hierarchical manifold interface of $ AB $. This is also true for the hierarchical mediated interface.

Curvatures

The curvature of each connected component of all Voronoi interfaces (hierarchical or primitive, bicolor, mediated or tricolor) is computed twice, one calculation for each partner. First, all the Voronoi edges in the interior of the connected component involving the selected partner are gathered. We orientate the Delaunay edges dual of the Voronoi faces incident to the Voronoi edges such that the particles in the selected partner are the sources of these Delaunay edges. Then, we compute the curvature associated to a Voronoi edge using:

  • the orientated angle between the two Delaunay edges,
  • the length of the Voronoi edge (finite since it is in the interior of the interface).

The sum of the curvatures of all the interior Voronoi edges of a connected component defines its curvature, and the sum of the curvature of all connected components of an interface defines the curvature of the interface. Note that an unsigned version of the curvature is also computed (when summing the curvatures of interior Voronoi edges, the absolute value of the curvature is considered).

Programmer's Workflow

The programs of Space_filling_model_interface described above are based upon generic C++ classes, so that additional versions can easily be developed for protein or nucleic acids, both at atomic or coarse-grain resolution.

In order to derive such versions, there are two important ingredients, that are the workflow class, and its traits class.

The Traits Class

T_Space_filling_model_interface_traits:

This class defines the main types used in the modules of the workflow. It is templated by the classes of the concepts required by these modules. This design makes it possible to use the same workflow within different(biophysical) contexts to make new programs. To use the workflow T_Space_filling_model_interface_workflow , one needs to define:

  • what is a particle (atoms or pseudo-atoms),
  • what are the partners (e.g a binary complex, an IG/Ag complex, etc.),
  • what are the mediators (e.g water molecules),
  • how to annotate the particles,
  • how to build a particle if non-trivial steps are required (e.g., building pseudo-atoms from residues in a protein).
Template Parameters
ParticleTraitsBaseTraits class defining the type of particles (atoms or pseudo-atoms) following the concept ParticleTraits. It is a base for the class SBL::Models::T_Particle_with_system_label_traits that adds a system's label to each particle.
PartnerLabelsTraitsTraits class defining the type of system's labels for the partners following the concept MolecularSystemLabelsTraits.
MediatorLabelsTraitsTraits class defining the type of system's labels for the mediators following the concept MolecularSystemLabelsTraits.
ParticleAnnotatorFollow the concept ParticleAnnotator.
ParticlesBuilderFunctor building the input particles, as defined in the concept ParticleTraits.
IsSerializedLabelTag for serializing the system labels with the particles.

The Workflow Class

T_Space_filling_model_interface_workflow: