Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
User Manual

Multiple_archives_serialization

Authors: F. Cazals and T. Dreyfus

Introduction

This package uses the serialization paradigm of the Boost library to save and reconstruct data across several files:

– the main archive, an xml file containing the main data structure;

– the secondary archive, a plain text or binary file containing the secondary information required to reconstruct the data from the main archive.

Serialization from the Boost library

For a complete description of the Serialization package of the Boost library, we refer the reader to the Boost user manual. Serialization is done in two steps:

  • first, the description of the information of a data structure to save into / load from an archive;
  • second, the description of the archive file that contains or will contain the information related to a data structure.

For the first step, there are two ways of describing the serialization of a data structure:

  • in the first version, a free function serialize is defined outside the class: this is called the non intrusive serialization (see section Non Intrusive Serialization);
  • in the second version, a method serialize has to be declared in the class itself: this is called the intrusive serialization (see section Splitted Serialization).

Furthermore, it is possible to split the serialization into two methods: one for loading (method load), and one for saving (method save). An example of split serialization is given in the section Splitted Serialization.

For the second step, Boost provides several archives corresponding to different file formats (binary, plain text, xml). Each format comes with two types of archives: one for loading (i.e. input archives) and one for saving (i.e. output archives). Archives are used in the same way as streams, with the operators << and >>. However, for xml archives, a name has to be associated with the data to serialize: this name is used as the tag enclosing the serialized data in the xml file. To associate a name with a piece of data, Boost provides the function boost::serialization::make_nvp.

Multiple Archives Serialization

The two main classes are SBL::IO::T_Multiple_archives_serialization_xml_oarchive for output archives and SBL::IO::T_Multiple_archives_serialization_xml_iarchive for input archives. These two classes are described in the following.

Output

In order to save a data structure in several archives, this package provides the class SBL::IO::T_Multiple_archives_serialization_xml_oarchive< DataType , OutputArchive , IsLessData > :

  • the first template argument is the type of data that will be stored in the secondary archive. When a piece of data is saved, an index is created and associated with it in a map from the data to its index.
  • the second template argument is the type of the secondary archive that will contain the data. It is by default a text output archive, but can be any Boost output archive.
  • the third template argument is an ordering over the data, since they are stored in a map. By default, it is the natural ordering over the data.

An example of use of SBL::IO::T_Multiple_archives_serialization_xml_oarchive is given in section Output XML Archive.

Constructing the Archive

There are three ways of creating an object of type SBL::IO::T_Multiple_archives_serialization_xml_oarchive< DataType , OutputArchive >:

– by giving output streams for both the xml output archive and the secondary archive,

– by giving an output stream for the xml output archive only,

– by giving another archive with already serialized data in the secondary archive.

When only the stream of the xml output archive is given, no secondary archive is created, and all information on the data that is not stored in the xml output archive is lost: this is useful when the data does not need to be reconstructed from the xml archive, and saving all the information of all data in the xml archive would be too heavy.

Saving the Data

During the serialization of a data structure, each piece of data is stored in a map together with its serialization index, i.e. an index identifying the data in the map. If a secondary archive exists, the data is also stored in this archive. Note that each piece of data gets a unique index and is stored only once, even if it occurs several times in the data structure.

It is also possible to store the data explicitly by calling the method SBL::IO::T_Multiple_archives_serialization_xml_oarchive::store_data.

Serializing the Data Structure

This step is the same as with a standard Boost archive, except that the index of each serialized object of type DataType is also saved. Since all the data has already been stored, pointers and references to objects of type DataType are treated in exactly the same manner (which is not the case for the Boost archives). Note that if one wants to partially serialize an object of type DataType in the xml archive (and not only its index), a mechanism exists for selecting the information to put in the xml archive: the method SBL::IO::is_main_archive determines whether a given archive is the main one or not. In the serialization method of the class DataType, one should test the archive with this method in order to select which information goes into the main archive.

Input

In order to load a data structure from several archives, this package provides the class SBL::IO::T_Multiple_archives_serialization_xml_iarchive< DataType , InputArchive >. It works in the same manner as SBL::IO::T_Multiple_archives_serialization_xml_oarchive, with the following differences:

– the class uses a map from the indices to the objects of type DataType. Since the indices are the natural numbers from 0 to n-1, n being the number of stored data, the map is a std::vector. When loading the data from the secondary archive, new objects are created and pointers to them are stored in this vector, at the position corresponding to their index. The set of all data can be accessed with the method SBL::IO::T_Multiple_archives_serialization_xml_iarchive::get_data, which fills a container of DataType using an output iterator.

– the class SBL::IO::T_Multiple_archives_serialization_xml_iarchive cannot be used without a secondary archive.

An example of use of SBL::IO::T_Multiple_archives_serialization_xml_iarchive is given in section Input XML Archive.

Examples

Non Intrusive Serialization

This example shows how to serialize a simple data structure. First, it defines the serialization of the data structure in a non intrusive way. Then, in the main method, it saves the data in an xml archive, and loads the same data back from this xml archive.

#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/list.hpp>
#include <iostream>
#include <fstream>
#include <list>
struct Item
{
  int index;
  std::list<double> weights;
};//end struct Item
namespace boost
{
namespace serialization
{
  //Non intrusive way to serialize the data structure.
  template <class Archive>
  void serialize(Archive& ar, Item& a, const unsigned int version)
  {
    ar & make_nvp("index", a.index);
    ar & make_nvp("weights", a.weights);
  }
}//end namespace serialization
}//end namespace boost
int main()
{
  Item a;
  a.index = 1;
  a.weights.push_back(2);
  //Open first the file, then the archive.
  std::ofstream out("archive.xml");
  boost::archive::xml_oarchive* ar_oxml = new boost::archive::xml_oarchive(out);
  *ar_oxml << boost::serialization::make_nvp("Item", a);
  //Close first the archive, then the file.
  delete ar_oxml;
  out.close();
  Item b;
  //Open first the file, then the archive.
  std::ifstream in("archive.xml");
  boost::archive::xml_iarchive* ar_ixml = new boost::archive::xml_iarchive(in);
  *ar_ixml >> boost::serialization::make_nvp("Item", b);
  //Close first the archive, then the file.
  delete ar_ixml;
  in.close();
  std::cout << "index: " << b.index << " and weight: " << b.weights.front() << std::endl;
  return 0;
}

Splitted Serialization

This example shows how to split the serialization of a simple data structure. First, it defines the save and load methods in the data structure. Then, in the main method, it saves the data in an xml archive, and loads the same data back from this xml archive.

#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/list.hpp>
#include <boost/serialization/split_member.hpp>
#include <iostream>
#include <fstream>
#include <list>
class Item
{
public:
  int index;
  std::list<double> weights;
private:
  //Allows boost to access the methods for serializing the data structure.
  friend class boost::serialization::access;
  //How to save this data structure in an archive.
  template <class Archive>
  void save(Archive& ar, const unsigned int version = 0) const
  {
    ar & boost::serialization::make_nvp("index", index);
    ar & boost::serialization::make_nvp("weights", weights);
  }
  //How to load this data structure from an archive.
  template <class Archive>
  void load(Archive& ar, const unsigned int version = 0)
  {
    ar & boost::serialization::make_nvp("index", index);
    ar & boost::serialization::make_nvp("weights", weights);
  }
  //Boost macro generating the code for splitting the serialization
  BOOST_SERIALIZATION_SPLIT_MEMBER()
};//end class Item
int main()
{
  Item a;
  a.index = 1;
  a.weights.push_back(2);
  //Open first the file, then the archive.
  std::ofstream out("archive.xml");
  boost::archive::xml_oarchive* ar_oxml = new boost::archive::xml_oarchive(out);
  *ar_oxml << boost::serialization::make_nvp("Item", a);
  //Close first the archive, then the file.
  delete ar_oxml;
  out.close();
  Item b;
  //Open first the file, then the archive.
  std::ifstream in("archive.xml");
  boost::archive::xml_iarchive* ar_ixml = new boost::archive::xml_iarchive(in);
  *ar_ixml >> boost::serialization::make_nvp("Item", b);
  //Close first the archive, then the file.
  delete ar_ixml;
  in.close();
  std::cout << "index: " << b.index << " and weight: " << b.weights.front() << std::endl;
  return 0;
}

Output XML Archive

This example shows how to save a simple data structure in two archives: a text archive will contain the list of serialized data to retain, and an xml archive will contain the main data structure, where the serialized data is replaced by an index.

#include <SBL/IO/Multiple_archives_serialization_xml_oarchive.hpp>
#include <boost/serialization/list.hpp>
#include <iostream>
#include <fstream>
#include <list>
//Simple structure with an index.
class Item
{
public:
  int index;
  static int count;
  Item():index(count++) {}
  //Necessary ordering over the items for sorting them in the main output xml archive.
  friend inline bool operator<(const Item& u, const Item& v)
  {
    return u.index < v.index;
  }
};//end class Item
int Item::count = 0;
//Wrapper taking a list of items.
struct Items
{
  std::list<Item> list;
};//end struct Items
namespace boost {
namespace serialization {
  // Serialization of Item: if the archive is the main one, we do not serialize the index.
  // Assumption: the test below is reconstructed from the comment above and from
  // the description of SBL::IO::is_main_archive in this manual.
  template<class Archive>
  void serialize(Archive& ar, Item& item, const unsigned int version)
  {
    if(!SBL::IO::is_main_archive(ar))
      ar & boost::serialization::make_nvp("index", item.index);
  }
  // Serialization of Items.
  template<class Archive>
  void serialize(Archive& ar, Items& items, const unsigned int version)
  {
    ar & boost::serialization::make_nvp("list", items.list);
  }
} // namespace serialization
} // namespace boost
int main()
{
  // Items to serialize
  Items items;
  items.list.push_back(Item());
  items.list.push_back(Item());
  items.list.push_back(Item());
  // Open the files for the archive and the secondary archive (secondary is not mandatory).
  std::ofstream out("archive.xml");
  std::ofstream sec_out("secondary_archive.txt");
  // Open the main archive.
  // Assumption: constructor taking the xml stream, then the secondary stream.
  SBL::IO::T_Multiple_archives_serialization_xml_oarchive<Item> xml(out, sec_out);
  // Register the items in the internal map (and in the secondary archive, if any).
  // for(std::list<Item>::const_iterator it = items.list.begin(); it != items.list.end(); it++)
  //   xml.store_data(*it);
  // Serialize the items in the xml.
  xml << boost::serialization::make_nvp("items", items);
  return 0;
}

Input XML Archive

This example shows how to load a simple data structure from two archives: a text archive contains the list of serialized data, and an xml archive contains the main data structure, where the serialized data is replaced by an index.

#include <SBL/IO/Multiple_archives_serialization_xml_iarchive.hpp>
#include <iostream>
#include <fstream>
#include <list>
#include <boost/serialization/list.hpp>
//Simple structure with an index.
struct Item
{
  int index;
  static int count;
  Item():index(count++) {}
};//end struct Item
int Item::count = 0;
//Wrapper taking a list of items.
struct Items
{
  std::list<Item> list;
};//end struct Items
namespace boost {
namespace serialization {
  // Serialization of Item.
  template<class Archive>
  void serialize(Archive& ar, Item& item, const unsigned int version)
  {
    ar & boost::serialization::make_nvp("index", item.index);
  }
  // Serialization of Items.
  template<class Archive>
  void serialize(Archive& ar, Items& items, const unsigned int version)
  {
    ar & boost::serialization::make_nvp("list", items.list);
  }
} // namespace serialization
} // namespace boost
int main()
{
  //Items to load from the xml.
  Items items;
  // Open the files for the archive and the secondary archive (mandatory for loading).
  std::ifstream in("archive.xml");
  std::ifstream sec_in("secondary_archive.txt");
  // Open the main archive.
  // Assumption: constructor taking the xml stream, then the secondary stream.
  SBL::IO::T_Multiple_archives_serialization_xml_iarchive<Item>* xml = new SBL::IO::T_Multiple_archives_serialization_xml_iarchive<Item>(in, sec_in);
  // Load the items from the xml.
  *xml >> boost::serialization::make_nvp("items", items);
  for(std::list<Item>::const_iterator it = items.list.begin(); it != items.list.end(); it++)
    std::cout << "Item : " << (*it).index << std::endl;
  // Close the archives and the files.
  delete xml;
  in.close();
  sec_in.close();
  return 0;
}