Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
Authors: F. Cazals and T. Dreyfus
This package uses the serialization paradigm of the Boost library to save data to, and reconstruct it from, several files:
– the main archive, an XML file containing the main data structure,
– the secondary archive, a plain text or binary file containing the secondary information required to reconstruct the data from the main archive.
For a complete description of the Serialization package of the Boost library, we refer the reader to the Boost user manual. Serialization is done in two steps:
– describing how the data is serialized,
– serializing the data into an archive.
For the first step, there are two ways to describe the serialization of a datum:
– intrusive: a serialize method is added to the class of the data,
– non-intrusive: a serialize function is defined outside the class of the data.
Furthermore, the serialization can be split into two methods: one for loading (method load) and one for saving (method save). Examples of split serialization are given in the section Splitted Serialization.
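The split between saving and loading can be sketched without Boost. The following minimal example is an illustration under stated assumptions: the type Point and the function round_trip are hypothetical, plain standard streams stand in for the archives, and none of this is the SBL or Boost API. With Boost itself, a class defining save and load members declares the split with the macro BOOST_SERIALIZATION_SPLIT_MEMBER().

```cpp
#include <cassert>
#include <sstream>

// Hypothetical data type, for illustration only (not part of SBL or Boost).
struct Point {
    double x = 0.0;
    double y = 0.0;

    // Saving: writes the two coordinates separated by a space.
    // The default stream precision is used, so values needing more than
    // 6 significant digits would be rounded.
    void save(std::ostream& out) const { out << x << ' ' << y; }

    // Loading: reads the coordinates back in the order they were written.
    void load(std::istream& in) { in >> x >> y; }
};

// Saves p to a text buffer, then loads a fresh object from that buffer.
inline Point round_trip(const Point& p) {
    std::stringstream buffer;
    p.save(buffer);
    Point q;
    q.load(buffer);
    assert(!buffer.fail());  // both coordinates must have been parsed
    return q;
}
```

Keeping save const and load non-const mirrors the asymmetry of the two operations: saving only reads the object, while loading overwrites it.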
For the second step, Boost provides several archives corresponding to different file formats (binary, plain text, XML). Each format comes with two types of archives: one for loading (i.e. input archives) and one for saving (i.e. output archives). The archives can be used in the same way as a stream, with the operators << and >>. However, for the XML archives, a name has to be associated with the data to serialize: this name is used as the tag that encloses the serialized data in the XML file. To associate a name with a datum, Boost provides the function boost::serialization::make_nvp.
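To make the role of the name concrete, here is a small standard-library-only sketch of a stream-like XML writer. Everything in it (Named_value, named, Toy_xml_oarchive, write_example) is hypothetical and far simpler than the Boost XML archives. It only shows why each value needs a name: the name becomes the enclosing tag. With a Boost XML archive, the corresponding call would be of the form ar << boost::serialization::make_nvp("n_atoms", n_atoms).

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical name-value pair: the tag name plus a reference to the value.
template <class T>
struct Named_value {
    const char* name;
    const T& value;
};

// Helper building a Named_value with the value type deduced.
template <class T>
Named_value<T> named(const char* name, const T& value) {
    return Named_value<T>{name, value};
}

// Toy XML output archive: each named value is written as <name>value</name>.
class Toy_xml_oarchive {
public:
    explicit Toy_xml_oarchive(std::ostream& out) : m_out(out) {}

    template <class T>
    Toy_xml_oarchive& operator<<(const Named_value<T>& nv) {
        m_out << '<' << nv.name << '>' << nv.value
              << "</" << nv.name << ">\n";
        return *this;
    }

private:
    std::ostream& m_out;
};

// Writes two named values and returns the produced text.
inline std::string write_example() {
    std::ostringstream out;
    Toy_xml_oarchive ar(out);
    const int n_atoms = 3;
    const std::string chain = "A";
    ar << named("n_atoms", n_atoms) << named("chain", chain);
    return out.str();
}
```

The toy writer performs no escaping and handles no nesting. It is meant to show the stream-like syntax and the tag naming, nothing more.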
The two main classes are SBL::IO::T_Multiple_archives_serialization_xml_oarchive for output archives and SBL::IO::T_Multiple_archives_serialization_xml_iarchive for input archives. These two classes are described in the following.
To save a data structure in several archives, this package provides the class SBL::IO::T_Multiple_archives_serialization_xml_oarchive< DataType , OutputArchive , IsLessData >.
An example of use of SBL::IO::T_Multiple_archives_serialization_xml_oarchive is given in section Output XML Archive .
There are three ways to create an object of type SBL::IO::T_Multiple_archives_serialization_xml_oarchive< DataType , OutputArchive >:
– by giving output streams for both the XML output archive and the secondary archive,
– by giving an output stream only for the XML output archive,
– by giving another archive whose secondary archive already contains serialized data.
When only the XML output stream is given, no secondary archive is created, and any information about the data that is not stored in the XML output archive is lost. This is useful when the data need not be reconstructed from the XML archive and saving the full information for every datum in the XML archive would be too heavy.
During the serialization of a data structure, each datum is stored in a map together with its serialization index, i.e. an index identifying the datum in the map. If a secondary archive exists, the datum is also stored in this archive. Note that each datum receives a unique index and is stored only once, even if it occurs several times in the data structure.
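The indexing scheme described above can be sketched as follows. This is a minimal illustration, not the SBL implementation: the class Index_registry and its method store are hypothetical, and keying the map on the value of the datum with a comparison functor is an assumption made here for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry mapping each distinct datum to a serialization index.
template <class Data, class IsLess = std::less<Data> >
class Index_registry {
public:
    // Returns the index of d; a new index is assigned only the first time
    // an equivalent datum is seen.
    std::size_t store(const Data& d) {
        // insert() leaves the map unchanged when an equivalent key exists,
        // and in both cases returns an iterator to the entry for d.
        return m_index.insert(std::make_pair(d, m_index.size())).first->second;
    }

    // Number of distinct data stored so far.
    std::size_t size() const { return m_index.size(); }

private:
    std::map<Data, std::size_t, IsLess> m_index;
};
```

Because indices are handed out in order of first insertion, they run from 0 to n-1 for n distinct data, which is what allows the loading side to use a plain vector.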
It is also possible to call the method SBL::IO::T_Multiple_archives_serialization_xml_oarchive::store_data for storing the data.
This step works exactly as with a standard Boost archive, except that the index of each serialized object of type DataType is also saved. Since all the data must already have been stored, pointers and references to objects of type DataType are treated in exactly the same way (which is not the case for the Boost archives). Note that if one wants to serialize part of an object of type DataType in the XML archive (and not only its index), a mechanism exists for selecting the information to put in the XML archive: the method SBL::IO::is_main_archive determines whether a given archive is the main one. In the serialization method of the class DataType, one should test the type of the archive with this method in order to select which information goes into the main archive.
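The selection mechanism can be illustrated by the following sketch. It does not reproduce the signature of SBL::IO::is_main_archive, which is not given here: the trait Is_main, the two toy archive types and the type Atom are all hypothetical stand-ins, used only to show a serialization function that writes less information into the main archive than into the secondary one.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

// Toy archive types (hypothetical stand-ins, not SBL or Boost classes).
struct Toy_main_archive      { std::ostringstream out; };
struct Toy_secondary_archive { std::ostringstream out; };

// Hypothetical compile-time test: is this archive type the main one?
template <class Archive> struct Is_main : std::false_type {};
template <> struct Is_main<Toy_main_archive> : std::true_type {};

// Hypothetical data type.
struct Atom {
    std::string name;
    double radius;
};

// One serialization function for both archives; the test on the archive
// type decides what is written.
template <class Archive>
void serialize(Archive& ar, const Atom& a) {
    ar.out << a.name;                 // written to every archive
    if (!Is_main<Archive>::value)     // skipped for the main archive
        ar.out << ' ' << a.radius;
}
```

Calling serialize on a Toy_main_archive writes only the name, while the same call on a Toy_secondary_archive writes the name and the radius.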
To load a data structure from several archives, this package provides the class SBL::IO::T_Multiple_archives_serialization_xml_iarchive< DataType , OutputArchive >. It works in the same way as SBL::IO::T_Multiple_archives_serialization_xml_oarchive, with the following differences:
– the class uses a map from the indices to the objects of type DataType. Since the indices are the natural numbers from 0 to n-1, n being the number of stored data, the map is a std::vector. When the data is loaded from the secondary archive, new objects are created and pointers to them are stored in this vector, at the position corresponding to their index. The set of all data can be accessed with the method SBL::IO::T_Multiple_archives_serialization_xml_iarchive::get_data, which fills a container of DataType through an output iterator.
– the class SBL::IO::T_Multiple_archives_serialization_xml_iarchive cannot be used without a secondary archive.
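The loading side can be sketched in the same spirit. The class Toy_data_loader below is hypothetical and is not the SBL input archive: it assumes a text secondary archive holding one datum per line, so that the line number plays the role of the serialization index. The member get_data only mimics the idea of filling a container through an output iterator; its signature is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical loader: rebuilds the data from a text stream with one datum
// per line; the position of a datum in the vector is its index.
class Toy_data_loader {
public:
    explicit Toy_data_loader(std::istream& secondary) {
        std::string line;
        while (std::getline(secondary, line))
            m_data.push_back(std::make_shared<std::string>(line));
    }

    // Datum with the given serialization index (throws if out of range).
    const std::string& at(std::size_t index) const {
        return *m_data.at(index);
    }

    // Copies every loaded datum through an output iterator, in index order.
    template <class OutputIterator>
    OutputIterator get_data(OutputIterator out) const {
        for (const auto& p : m_data) *out++ = *p;
        return out;
    }

private:
    std::vector<std::shared_ptr<std::string> > m_data;
};
```

When the main data structure is read, an index found in the main archive is then resolved by a simple lookup such as loader.at(index).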
An example of use of SBL::IO::T_Multiple_archives_serialization_xml_iarchive is given in section Input XML Archive.
This example shows how to serialize a simple data structure. First, it defines the serialization of the data structure in a non-intrusive way. Then, in the main method, it saves the data in an XML archive and loads the same data back from this XML archive.
This example shows how to split the serialization of a simple data structure. First, it defines the save and load methods in the data structure. Then, in the main method, it saves the data in an XML archive and loads the same data back from this XML archive.
This example shows how to save a simple data structure in two archives: a text archive contains the list of serialized data to retain, and an XML archive contains the main data structure, where each serialized datum is replaced by its index.
This example shows how to load a simple data structure from two archives: a text archive contains the list of serialized data, and an XML archive contains the main data structure, where each serialized datum is replaced by its index.