Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
Tutorial guiding the installation of the library.
There are four main modes to use the SBL:
We briefly present the conda channel for the SBL.
Delivery. Using conda makes it possible to install the SBL, and in particular all executables, on all Linux environments and macOS. We now review the required knowledge.
Conda and miniconda. Conda is a cross-platform multi-language environment management system, see Conda.
There are several distributions, including Anaconda and Miniconda. The latter just ships the repository management system, which is sufficient for our purposes. Thus, visit the Miniconda distribution per OS page and install miniconda for your operating system. In the sequel, we assume that the Conda directory created is named miniconda2.
Conda channels, the sbl channel, conda environment, conda packages. We briefly mention the main conda concepts used thereafter:
A local environment has all required resources (libraries and dependencies in particular) to compile and run executables from a library. It may be seen as a virtual environment, except that all files are located in one's Conda directory. Two nice features are the following ones:
The SBL conda channel.
To distribute an environment, one creates a Conda channel, from which packages are distributed.
Note that the web page of a package informs on which operating systems are supported.
In the following, we explain how to install the SBL via its conda channel. See also Conda package management commands.
Installing / updating the SBL package.
conda create --name my-sbl-conda-env
conda init (OR) source /etc/profile.d/conda.{csh,zsh,etc}
conda activate my-sbl-conda-env
conda deactivate
conda install sbl -c sbl -c conda-forge -c biobuilds
conda update -c sbl sbl
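The commands above can be gathered into a small script. The following is a minimal sketch, not an official installer: the function name sbl_conda_setup, the run helper and the DRY_RUN switch are illustrative assumptions. By default the script only prints the commands; set DRY_RUN=0 to execute them. It also assumes that conda has been initialized in the current shell, as described above.

```shell
# Hypothetical helper: print a command when DRY_RUN=1 (default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create an environment, activate it, and install the SBL from its channel.
# The environment name defaults to the one used in the commands above.
sbl_conda_setup() {
    env_name="${1:-my-sbl-conda-env}"
    run conda create --name "$env_name" &&
    run conda activate "$env_name" &&
    run conda install sbl -c sbl -c conda-forge -c biobuilds
}

sbl_conda_setup
```

Chaining the steps with && stops the script at the first failing command, so the install step is never attempted in an environment that could not be created or activated.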
Singularity. Singularity is a containerization paradigm allowing for fast, isolated, secure, and portable execution of any application on all Linux systems. It's more agile and reliable than a virtual machine, while being extremely easy to use and deploy.
Singularity image files are read-only files containing one or more applications plus the whole environment (libraries, environment, etc.) needed for those applications to run. A priori, the container's filesystem is completely isolated from the one of the machine hosting it. Nonetheless, there are specific locations that are bound by default to the host filesystem: $HOME, /proc, /dev, /tmp, /sys, and $PWD paths and all their subdirectories will be mapped to the corresponding ones on the host machine, in order to allow for communication between the container and the machine. The Singularity User Guide is helpful in explaining how other locations on the host machine can be mapped into the container.
In order to browse the content of a Singularity image file, a shell can be instantiated with the command
singularity shell <container>.simg
where <container>.simg is the name of the image file.
Nonetheless, it is not necessary to enter the container's shell to use the container's executables: any resource included in a container can be used via the command
singularity exec <container>.simg <command>
where <command> is any command the container shell can parse.
A Singularity image file containing the whole installation of SBL is maintained by ABS. In order to use it:
singularity exec SBL.simg <sbl-command>
You can browse SBL commands and library files by entering the container shell with
singularity shell SBL.simg
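To avoid retyping the image name, the exec command can be wrapped in a shell function. This is only a sketch: the function name sbl_exec and the variables SBL_IMAGE, SBL_BIND and DRY_RUN are hypothetical and are not defined by the SBL. The optional --bind argument is the Singularity option for mapping an additional host directory into the container, as covered in the Singularity User Guide. By default the function only prints the command; set DRY_RUN=0 to run it.

```shell
# Path to the image file; SBL_IMAGE is an illustrative variable name.
SBL_IMAGE="${SBL_IMAGE:-SBL.simg}"

# Run a command inside the SBL container.
# If SBL_BIND is set (format host_dir:container_dir), it is passed to --bind.
sbl_exec() {
    if [ -n "${SBL_BIND:-}" ]; then
        set -- --bind "$SBL_BIND" "$SBL_IMAGE" "$@"
    else
        set -- "$SBL_IMAGE" "$@"
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "singularity exec $*"
    else
        singularity exec "$@"
    fi
}

# Example: list the root of the container filesystem.
sbl_exec ls /
```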
This section presents the dependencies required to install and compile the SBL. The following are two important remarks:
set(SBL_USE_LIBS ESBTL CGAL MPFR GMP Boost Eigen)
These libraries provide classes implementing number types and the accompanying operations, allowing the development of algorithms with specific arithmetic requirements:
Note that GMP and MPFR are mandatory to use CGAL. If one of these libraries is installed in a non standard location, the <LIBRARY_NAME>_DIR environment variable needs to point to the root directory of that library.
In the SBL, interval arithmetic is managed using the Boost library. However, in order to use multi-precision interval arithmetic, the library MPFI has to be installed. It is used in particular to compute the volume of unions of balls in the SBL. When MPFI is used for the compilation of a program, the C++ macro SBL_WITH_MPFI is automatically defined, enforcing the use of MPFI rather than Boost for managing multi-precision interval arithmetic. If MPFI is installed in a non standard location, the MPFI_DIR environment variable needs to point to the root directory of the library.
The reference C++ Boost libraries provide various tools used throughout the library — see Boost home page for more details.
While the generic components of Boost are directly integrated in Core, the following non generic Boost packages are used in the applications:
If Boost is installed in a non standard location, the BOOST_ROOT environment variable needs to point to the root directory of Boost.
The Computational Geometry Algorithms Library (CGAL) provides various core geometric constructions. In particular, CGAL is used in a large number of packages in Core. A number of libraries are provided with the CGAL library, such as the GMP and MPFR libraries. A detailed explanation of how to install the CGAL library is provided in the CGAL installation guide.
Note that if the CGAL library is installed in a non standard location, the CGAL_DIR environment variable must point to the root directory of CGAL.
The Easy Structural Biology Template Library (ESBTL) is a generic, header-only C++ library for parsing and managing data in a PDB file. It also provides geometric representations of molecules using CGAL.
The ESBTL library is provided with the SBL as a third-party library; no installation is required. To use one's own version of ESBTL, just set the environment variable ESBTL_DIR to the root directory of the custom ESBTL.
The Eigen library is used for linear algebra, in particular to represent matrices and compute eigenvalues.
If Eigen is installed in a non standard location, the EIGEN_DIR environment variable needs to point to the root directory of the Eigen library.
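When several of these libraries live under a common non standard prefix, the variables mentioned above can be set in one place, for instance in a shell startup file. In this sketch, the prefix $HOME/local and the per-library sub-directories are purely hypothetical and must be adapted to the actual layout; GMP_DIR and MPFR_DIR follow the <LIBRARY_NAME>_DIR convention stated above.

```shell
# Hypothetical install prefix; adapt to the actual location of each library.
DEPS_PREFIX="${DEPS_PREFIX:-$HOME/local}"

# Number types
export GMP_DIR="$DEPS_PREFIX/gmp"
export MPFR_DIR="$DEPS_PREFIX/mpfr"
export MPFI_DIR="$DEPS_PREFIX/mpfi"

# Boost, CGAL and Eigen
export BOOST_ROOT="$DEPS_PREFIX/boost"
export CGAL_DIR="$DEPS_PREFIX/cgal"
export EIGEN_DIR="$DEPS_PREFIX/eigen"
```

Only the variables of libraries that are actually installed in a non standard location need to be set.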
Ipopt (Interior Point OPTimizer; a.k.a. coin-or-Ipopt) is a library for non-linear optimization. It is used in the package Real_value_function_minimizer for finding local minima of real value functions. If the library is installed in a non-standard location, the IPOPT_ROOT environment variable needs to point to the root directory of the library.
LBFGS++ (Limited-memory BFGS) is a header-only C++ library that implements the eponymous algorithm; see the Download or clone page. It is used in the package Real_value_function_minimizer for finding local minima of real value functions. Since it is header-only, it is easy to install and to use, making it an easy alternative to the Ipopt library. If the library is installed in a non-standard location, the LBFGSPP_DIR environment variable needs to point to the include directory of the library.
The binary program lp_solve is used to solve the linear programs arising when comparing energy landscapes. The environment variable LP_SOLVE must be set to indicate the location of the binary.
Seqan is a C++ header-only library providing a collection of sequence alignment algorithms. It is used in the package Alignment_engines. If the library is installed in a non-standard location, the SEQAN_DIR environment variable needs to point to the include directory of the library.
FLANN (Fast Library for Approximate Nearest Neighbors) is a library offering a collection of approximate nearest neighbor algorithms, and methods for picking the best algorithm depending on the input dataset. It is wrapped in the class SBL::GT::T_ANN_FLANN_wrapper of the package Spatial_search for comparing / replacing the other algorithms implemented in the package. If the library is installed in a non-standard location, the FLANN_DIR environment variable needs to point to the root directory of the library.
Gromacs is used for loading trajectories of conformations from XTC files in the package MolecularGeometryLoader. If the library is installed in a non-standard location, the GROMACS_DIR environment variable needs to point to the root directory of the library.
rapidxml is a header-only C++ XML parser used in the SBL for loading force field parameters from XML files in the package Molecular_potential_energy. If the library is installed in a non-standard location, the RAPIDXML_DIR environment variable needs to point to the root directory of the library.
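Before configuring the build, it can help to check which of the variables listed above are set and whether they point to an existing path. The following sketch only uses the variable names given in the text; the function name sbl_check_env is hypothetical. A variable reported as not set is not an error when the corresponding library is installed in a standard location or is not needed.

```shell
# Report the state of the environment variables of the optional dependencies.
sbl_check_env() {
    for var in IPOPT_ROOT LBFGSPP_DIR LP_SOLVE SEQAN_DIR FLANN_DIR GROMACS_DIR RAPIDXML_DIR; do
        # Indirect lookup: read the value of the variable whose name is in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var: not set"
        elif [ -e "$value" ]; then
            echo "$var: ok ($value)"
        else
            echo "$var: set, but $value does not exist"
        fi
    done
}

sbl_check_env
```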
OpenMP is used to parallelize loops in the SBL. It is used in particular to run collections of modules in parallel within workflows (see Modules when using workflows). When OpenMP is used for the compilation of a program, the C++ macro SBL_WITH_OPENMP is automatically defined, enforcing the use of OpenMP for parallelizing the for loops.
The source code is available from the following tarball.
It may also be obtained by cloning the read-only git repository as follows:
> git clone https://gitlab.inria.fr/abs/sbl.git
In the sequel, we assume that the environment variable SBL_DIR points to the directory containing the source code.
To compile and install the library from this source code, CMake is used. CMake version 2.6 or later is recommended. Note that the following installation requires root privileges: if you do not have them, refer to section Non standard installation directory.
The installation runs through four steps:
> mkdir build_sbl; cd build_sbl
> cmake <path/to/your/sbl/git/directory> -DSBL_APPLICATIONS=ON
> make; make install
This last step will compile the programs (if SBL_APPLICATIONS is set to ON). It will also copy files around, into the standard locations indicated below (or into the directory pointed at by the CMAKE_INSTALL_PREFIX, see below):
Note that if a new version of the library is available, the installation must be carried out again upon updating the git repository.
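The clone, configure, build and install steps described above can be chained in a single script. This is a sketch under assumptions: the run helper, the DRY_RUN switch, the function name sbl_build and the default source path $HOME/sbl are illustrative. By default the commands are only printed; set DRY_RUN=0 to execute them (the install step then requires root privileges, as noted above).

```shell
# Hypothetical helper: print a command when DRY_RUN=1 (default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Directory receiving the source code; the default value is only an example.
SBL_DIR="${SBL_DIR:-$HOME/sbl}"

# Clone the repository, configure an out-of-source build, compile, install.
sbl_build() {
    run git clone https://gitlab.inria.fr/abs/sbl.git "$SBL_DIR" &&
    run mkdir -p build_sbl &&
    run cd build_sbl &&
    run cmake "$SBL_DIR" -DSBL_APPLICATIONS=ON &&
    run make &&
    run make install
}

sbl_build
```

When updating an existing clone, the first step would be replaced by a git pull in the source directory, followed by the same build and install steps.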
> make; make uninstall
In this section, we show the various options for compiling the different parts of the library, and installing it.
When installing the SBL library, one may not have root privileges, or may simply want to install the SBL into a local directory. Doing so merely requires setting the cmake variable CMAKE_INSTALL_PREFIX to the local install directory when running cmake:
> cmake </path/to/sbl/directory> -DCMAKE_INSTALL_PREFIX=</path/to/local/install/directory>
> make
> make install
Given this target directory, executables are installed in the bin sub-folder of the target install directory while Python modules are installed in the python sub-folder.
> export PATH=$PATH:</path/to/local/install/directory>/bin
> export PYTHONPATH=$PYTHONPATH:</path/to/local/install/directory>/python
With zsh, for example:
export PYTHONPATH=${PYTHONPATH}:${SBL_DIR}/python/SBL
export PYTHONPATH=${PYTHONPATH}:${SBL_DIR}/python
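When such export lines are placed in a shell startup file, the same directory may be appended several times. A minimal sketch of an idempotent variant follows; the function name path_append and the install prefix $HOME/sbl-install are hypothetical.

```shell
# Append a directory to a colon-separated path variable unless already present.
# Usage: path_append VARIABLE_NAME DIRECTORY
path_append() {
    var_name="$1"
    dir="$2"
    eval "current=\${$var_name:-}"
    case ":$current:" in
        *":$dir:"*) ;;  # already listed: nothing to do
        *) eval "export $var_name=\"\${current:+\$current:}\$dir\"" ;;
    esac
}

# Hypothetical local install directory (the CMAKE_INSTALL_PREFIX used above).
SBL_PREFIX="${SBL_PREFIX:-$HOME/sbl-install}"
path_append PATH "$SBL_PREFIX/bin"
path_append PYTHONPATH "$SBL_PREFIX/python"
```

The ${current:+...} expansion avoids a leading colon when the variable is initially empty, which matters for PATH, where an empty entry stands for the current directory.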
Within a package from Core, examples are short programs showing the basic functionality provided in that package.
In addition, tests can be used in such packages, by compiling and running short test programs checking various functionalities of the packages. For compiling all the examples and the tests of the SBL library while installing it, just turn ON the tags SBL_EXAMPLES and SBL_TESTS when running cmake:
> cmake </path/to/sbl/directory> -DSBL_EXAMPLES=ON -DSBL_TESTS=ON
> make
Then, to test all the packages from Core, just run the tests with the following command:
> make test
Note that the previous command does not require any installation step, since the examples and tests are only compiled locally and only the sources of the examples are installed for documentation (in /usr/share/doc).
It is possible to compile the programs in Debug or in Release mode using the cmake variable CMAKE_BUILD_TYPE. However, since the SBL library is only made of headers and programs, we recommend the Debug mode only for the developers. For compiling in Debug mode, one can run the following command:
> cmake </path/to/sbl/directory> -DCMAKE_BUILD_TYPE=Debug
> make
Note that by default, the Release mode is used. Note also that to debug symbols from other libraries, if these are not header-only, compiling them in Debug mode is mandatory.
It is possible to create static versions of the programs by setting to ON the cmake variable BUILD_STATIC_SBL:
> cmake </path/to/sbl/directory> -DBUILD_STATIC_SBL=ON
> make
Note also that only the programs are compiled in static mode since the examples and tests are not installed. If one wants to compile static examples or static tests, such compilations should be done locally.
The SBL library provides VMD plugins for visualizing the output of the programs (details in section VMD (Visual Molecular Dynamics)).
It should be recalled that VMD plugins involve three ingredients:
To automatically install the VMD plugins, i.e. create or update the sblvmdplugins directory and create or update the .vmdrc file, proceed as follows with cmake:
> cmake </path/to/sbl/directory> -DSBL_VMD_PLUGINS=ON
> make ; make install
The SBL library also provides PyMOL plugins (details in section PyMOL (Python Based Molecular Visualization System)). The installation works as for the VMD plugins, using instead the cmake variable SBL_PYMOL_PLUGINS:
> cmake </path/to/sbl/directory> -DSBL_PYMOL_PLUGINS=ON
> make ; make install
The previous command looks for a folder .pymol in one's home directory, creates it if necessary, and installs the plugins. It manages the directory architecture used by PyMOL and creates or updates initialization files if necessary.
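The cmake variables presented in this section can be assembled from a few shell variables. The sketch below is illustrative only: the function name sbl_cmake_flags and the shell variables SBL_BUILD_TYPE and SBL_PREFIX are hypothetical, and the assumption that all these options can be combined in a single cmake invocation has not been checked against the SBL build system. Paths containing spaces are not handled.

```shell
# Print the -D options corresponding to the shell variables that are set to ON.
sbl_cmake_flags() {
    # Release is the default build type, as stated above.
    flags="-DCMAKE_BUILD_TYPE=${SBL_BUILD_TYPE:-Release}"
    if [ -n "${SBL_PREFIX:-}" ]; then
        flags="$flags -DCMAKE_INSTALL_PREFIX=$SBL_PREFIX"
    fi
    for opt in SBL_APPLICATIONS SBL_EXAMPLES SBL_TESTS BUILD_STATIC_SBL SBL_VMD_PLUGINS SBL_PYMOL_PLUGINS; do
        eval "state=\${$opt:-OFF}"
        if [ "$state" = "ON" ]; then
            flags="$flags -D$opt=ON"
        fi
    done
    echo "$flags"
}

# Example: applications and tests, installed into a local directory.
SBL_APPLICATIONS=ON
SBL_TESTS=ON
SBL_PREFIX="$HOME/sbl-install"
echo "cmake </path/to/sbl/directory> $(sbl_cmake_flags)"
```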
The documentation is written in Doxygen format, and can be compiled as follows using the script sbl-doc-manager.py from the scripts directory at the root of the project. This script produces the documentation and prints out the path to the index.html file, to be opened with a web browser:
> sbl-doc-manager.py -w <path/to/sbl/directory> -d <path/to/output/directory>
Note that the option -w can be omitted if the environment variable $SBL_DIR is set.
> cd </path/to/installed/sbl/directory>/share/doc/SBL; doxygen
In doing so, the files are generated in the current directory. The benefit of using the aforementioned script is that it also performs a number of housekeeping tasks (moving pictures, creating symbolic links, etc.).