Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
Bibliography
[1]

D. Agarwal, J. Araujo, C. Caillouet, F. Cazals, D. Coudert, and S. Pérennes. Connectivity inference in mass spectrometry based structure determination. In H.L. Bodlaender and G.F. Italiano, editors, European Symposium on Algorithms (Springer LNCS 8125), pages 289–300, Sophia Antipolis, France, 2013. Springer.

[2]

D. Agarwal, C. Caillouet, D. Coudert, and F. Cazals. Unveiling contacts within macro-molecular assemblies by solving minimum weight connectivity inference problems. Molecular and Cellular Proteomics, 14:2274–2282, 2015.

[3]

N. Akkiraju and H. Edelsbrunner. Triangulating the surface of a molecule. Discrete Applied Mathematics, 71(1):5–22, 1996.

[4]

N. Akkiraju and H. Edelsbrunner. Triangulating the surface of a molecule. Discrete Appl. Math., 71:5–22, 1996.

[5]

F. Alber, F. Förster, D. Korkin, M. Topf, and A. Sali. Integrating diverse data for structure determination of macromolecular assemblies. Ann. Rev. Biochem., 77:11.1–11.35, 2008.

[6]

N. Amenta, S. Choi, and R. Kolluri. The power crust, unions of balls, and the medial axis transform. Computational Geometry: Theory and Applications, 19(2):127–153, 2001.

[7]

R. Andonov, N. Yanev, and N. Malod-Dognin. An efficient Lagrangian relaxation for the contact map overlap problem. WABI, pages 162–173, 2008.

[8]

R. Andonov, N. Malod-Dognin, and N. Yanev. Maximum contact map overlap revisited. J. of Computational Biology, 18:27–41, 2011.

[9]

D. Arthur and S. Vassilvitskii. k-means++: The advantages of careful seeding. In ACM-SODA, page 1035. Society for Industrial and Applied Mathematics, 2007.

[10]

J. Baker, A. Kessi, and B. Delley. The generation and use of delocalized internal coordinates in geometry optimization. The Journal of chemical physics, 105(1):192–212, 1996.

[11]

A. Banyaga and D. Hurtubise. Lectures on Morse Homology. Kluwer, 2004.

[12]

David J Barlow and JM Thornton. Ion-pairs in proteins. Journal of molecular biology, 168(4):867–885, 1983.

[13]

G. Biau, F. Chazal, D. Cohen-Steiner, L. Devroye, and C. Rodriguez. A weighted k-nearest neighbors density estimate for geometric inference. Electronic Journal of Statistics, 5:204–237, 2011.

[14]

P. Bille. A survey on tree edit distance and related problems. Theoretical computer science, 337(1-3):217–239, 2005.

[15]

J.-D. Boissonnat and M. Yvinec. Algorithmic geometry. Cambridge University Press, UK, 1998. Translated by H. Brönnimann.

[16]

J-D. Boissonnat, C. Wormser, and M. Yvinec. Curved Voronoi diagrams. In J.-D. Boissonnat and M. Teillaud, editors, Effective Computational Geometry for curves and surfaces. Springer-Verlag, Mathematics and Visualization, 2006.

[17]

R. Bott. Morse theory indomitable. Publications Mathématiques de l'IHÉS, 68(1):99–114, 1988.

[18]

B. Bouvier, R. Grunberg, M. Nilgès, and F. Cazals. Shelling the Voronoi interface of protein-protein complexes reveals patterns of residue conservation, dynamics and composition. Proteins: structure, function, and bioinformatics, 76(3):677–692,

[19]

P. Braun, M. Tasan, M. Dreze, M. Barrios-Rodiles, I. Lemmens, H. Yu, J. Sahalie, R. Murray, L. Roncari, A-S. De Smet, K. Venketesan, J-F. Rual, J. Vandenhaute, M.E. Cusick, T. Pawson, D.E. Hill, J. Tavernier, J.L. Wrana, F.P. Roth, and M. Vidal. An experimentally derived confidence score for binary protein-protein interactions. Nature methods, 6(1):91–97, 2008.

[20]

Scott Brown, Nicolas J. Fawzi, and Teresa Head-Gordon. Coarse-grained sequences for protein folding and design. Proc Natl Acad Sci U S A, 100(19):10712–10717, Sep 2003.

[21]

P. M. M. De Castro, F. Cazals, S. Loriot, and M. Teillaud. Design of the CGAL spherical kernel and application to arrangements of circles on a sphere. Computational Geometry: Theory and Applications, 42(6-7):536–550,

[22]

F. Cazals and D. Cohen-Steiner. Reconstructing 3D compact sets. Computational Geometry Theory and Applications, 45(1-2):1–13, 2011.

[23]

F. Cazals and T. Dreyfus. Multi-scale geometric modeling of ambiguous shapes with toleranced balls and compoundly weighted α-shapes. In B. Levy and O. Sorkine, editors, Symposium on Geometry Processing, pages 1713–1722, Lyon, 2010.

[24]

F. Cazals and A. Lhéritier. Beyond two-sample-tests: Localizing data discrepancies in high-dimensional spaces. In P. Gallinari, J. Kwok, G. Pasi, and O. Zaiane, editors, IEEE/ACM International Conference on Data Science and Advanced Analytics, Paris, 2015. Preprint: Inria tech report 8734.

[25]

F. Cazals and N. Malod-Dognin. Shape matching by localized calculations of quasi-isometric subsets, with applications to the comparison of protein binding patches. In L. Wessels and M. Loog, editors, International Conference on Pattern Recognition in Bioinformatics, Delft, the Netherlands, 2011. Lecture Notes in Computer Science 7036.

[26]

F. Cazals and D. Mazauric. Mass transportation problems with connectivity constraints, with applications to energy landscape comparison. Submitted, 2016. Preprint: Inria tech report 8611.

[27]

F. Cazals, D. Mazauric, R. Tetley, and R. Watrigant. Comparing clusterings using matchings between clusters of clusters. Inria tech report 9063.

[28]

F. Cazals, F. Proust, R. Bahadur, and J. Janin. Revisiting the Voronoi description of protein-protein interfaces. Protein Science, 15(9):2082–2092, 2006.

[29]

F. Cazals, H. Kanhere, and S. Loriot. Computing the volume of union of balls: a certified algorithm. ACM Transactions on Mathematical Software, 38(1):1–20, 2011.

[30]

F. Cazals, T. Dreyfus, S. Sachdeva, and N. Shah. Greedy geometric algorithms for collections of balls, with applications to geometric approximation and molecular coarse-graining. Computer Graphics Forum, 33(6):1–17, 2014.

[31]

F. Cazals, T. Dreyfus, D. Mazauric, A. Roth, and C.H. Robert. Conformational ensembles and sampled energy landscapes: Analysis and comparison. J. of Computational Chemistry, 36(16):1213–1231, 2015.

[32]

F. Cazals. Revisiting the Voronoi description of protein-protein interfaces: Algorithms. In T. Dijkstra, E. Tsivtsivadze, E. Marchiori, and T. Heskes, editors, International Conference on Pattern Recognition in Bioinformatics, pages 419–430, Nijmegen, the Netherlands, 2010. Lecture Notes in Bioinformatics 6282.

[33]

CGAL, Computational Geometry Algorithms Library. http://www.cgal.org.

[34]

F. Chazal, L.J. Guibas, S.Y. Oudot, and P. Skraba. Persistence-based clustering in Riemannian manifolds. In ACM SoCG, pages 97–106, 2011.

[35]

F. Chazal, L. Guibas, S. Oudot, and P. Skraba. Persistence-based clustering in Riemannian manifolds. J. ACM, 60(6):1–38, 2013.

[36]

Y. Cheng. Mean shift, mode seeking, and clustering. IEEE PAMI, 17(8):790–799, 1995.

[37]

John D Chodera and David L Mobley. Entropy-enthalpy compensation: role and ramifications in biomolecular ligand recognition and design. Biophysics, 42:121–142, 2013.

[38]

C. Chothia and J. Janin. Principles of protein-protein recognition. Nature, 256:705–708, 1975.

[39]

C. Chothia and others. Structural invariants in protein folding. Nature, 254(5498):304–308, 1975.

[40]

Michael B Cohen, Yin Tat Lee, Gary Miller, Jakub Pachocki, and Aaron Sidford. Geometric median in nearly linear time. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, pages 9–21. ACM, 2016.

[41]

T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, 1990.

[42]

T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009 (3rd edition).

[43]

S. Dasgupta and K. Sinha. Randomized partition trees for exact nearest neighbor search. JMLR: Workshop and Conference Proceedings, 30:1–21, 2013.

[44]

Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997.

[45]

C. J. A. Delfinado and H. Edelsbrunner. An incremental algorithm for Betti numbers of simplicial complexes. In Proc. 9th Annu. Sympos. Comput. Geom., pages 232–239, 1993.

[46]

F. Despa, D.J. Wales, and S.R. Berry. Archetypal energy landscapes: Dynamical diagnosis. The Journal of chemical physics, 122(2):024103, 2005.

[47]

D. Devaurs, T. Siméon, and J. Cortés. Enhancing the transition-based RRT to deal with complex cost spaces. In IEEE International Conference on Robotics and Automation (ICRA), pages 4120–4125. IEEE, 2013.

[48]

D. Devaurs, M. Vaisset, T. Siméon, and J. Cortés. A multi-tree approach to compute transition paths on energy landscapes. In Workshop on Artificial Intelligence and Robotics Methods in Computational Biology, AAAI'13, pages pp–8, 2013.

[49]

M. do Carmo. Differential Geometry of Curves and Surfaces. Prentice-Hall, 1976.

[50]

Jason E Donald, Daniel W Kulp, and William F DeGrado. Salt bridges: geometrically specific, designable interactions. Proteins: Structure, Function, and Bioinformatics, 79(3):898–915, 2011.

[51]

S. Dongen. Performance criteria for graph clustering and Markov cluster experiments. 2000.

[52]

Andreas Döring, David Weese, Tobias Rausch, and Knut Reinert. SeqAn: an efficient, generic C++ library for sequence analysis. BMC bioinformatics, 9(1):11, 2008.

[53]

Jonathan PK Doye and Claire P Massen. Characterizing the network topology of the energy landscapes of atomic clusters. The Journal of chemical physics, 122(8):084105, 2005.

[54]

T. Dreyfus, V. Doye, and F. Cazals. Assessing the reconstruction of macro-molecular assemblies with toleranced models. Proteins: structure, function, and bioinformatics, 80(9):2125–2136,

[55]

T. Dreyfus, V. Doye, and F. Cazals. Probing a continuum of macro-molecular assembly models with graph templates of sub-complexes. Proteins: structure, function, and bioinformatics, 81(11):2034–2044,

[56]

J. Dunitz. Win some, lose some: enthalpy-entropy compensation in weak intermolecular interactions. Chemistry & biology, 2(11):709–712, 1995.

[57]

H. Edelsbrunner and J. Harer. Computational topology: an introduction. AMS, 2010.

[58]

H. Edelsbrunner. Geometry and topology for mesh generation. Cambridge Univ. press., 2001.

[59]

D. Eisenberg and A.D. McLachlan. Solvation energy in protein folding and binding. Nature, 319:199–203, 1986.

[60]

David Eisenberg, Morgan Wesson, and Mason Yamashita. Interpretation of protein folding and binding with atomic solvation parameters. Chem. Scr. A, 29:217–221, 1989.

[61]

Geza Fogarasi, Xuefeng Zhou, Patterson W Taylor, and Peter Pulay. The calculation of ab initio molecular geometries: efficient optimization by natural internal coordinates and empirical correction by offset forces. Journal of the American Chemical Society, 114(21):8191–8201, 1992.

[62]

R. Forman. Morse theory for cell complexes. Advances in Mathematics, 134:90–145, 1998.

[63]

Damien Francois, Vincent Wertz, and Michel Verleysen. The concentration of fractional distances. IEEE Trans. on Knowledge and Data Engineering, 19(7):873–886, 2007.

[64]

A. Gavezzotti. Molecular aggregation. Oxford, 2007.

[65]

M. Gerstein and F.M. Richards. Protein geometry: volumes, areas, and distances. In M. G. Rossmann and E. Arnold, editors, The international tables for crystallography (Vol F, Chap. 22), pages 531–539. Springer, 2001.

[66]

D. Goldman, S. Istrail, and C. Papadimitriou. Algorithmic aspects of protein structure similarity. In Foundations of Computer Science, 1999. 40th Annual Symposium on, pages 512–521. IEEE, 1999.

[67]

L. Györfi and A. Krzyzak. A distribution-free theory of nonparametric regression. Springer, 2002.

[68]

F. Hamprecht, C. Peter, X. Daura, W. Thiel, and W.F. van Gunsteren. A strategy for analysis of (molecular) equilibrium simulations: Configuration space density estimation, clustering, and visualization. The Journal of Chemical Physics, 114(5):2079–2089, 2001.

[69]

Sariel Har-Peled. Geometric approximation algorithms. Number 173. American Mathematical Soc., 2011.

[70]

L. Hascoet and V. Pascual. The Tapenade automatic differentiation tool: principles, model, and specification. ACM Transactions on Mathematical Software (TOMS), 39(3):20, 2013.

[71]

H.P. Hratchian and H.B. Schlegel. Finding minima, transition states, and following reaction pathways on ab initio potential energy surfaces. Theory and applications of computational chemistry: the first forty years, 4:195–249, 2005.

[72]

L. Jaillet, F.J. Corcho, J-J. Pérez, and J. Cortés. Randomized tree construction algorithm to explore energy landscapes. Journal of computational chemistry, 32(16):3464–3474, 2011.

[73]

J. Janin, R. P. Bahadur, and P. Chakrabarti. Protein-protein interaction and quaternary structure. Quarterly reviews of biophysics, 41(2):133–180, 2008.

[74]

N. Karmarkar. A new polynomial-time algorithm for linear programming. In Proceedings of the sixteenth annual ACM symposium on Theory of computing, pages 302–311. ACM, 1984.

[75]

R. M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, pages 85–103, 1972.

[76]

P.L. Kastritis, J.P.G.L.M. Rodrigues, G.E. Folkers, R. Boelens, and A.M.J.J. Bonvin. Proteins feel more than they see: Fine-tuning of binding affinity by properties of the non-interacting surface. J.M.B., 426:2632–2652, 2014.

[77]

James J Kuffner and Steven M LaValle. RRT-connect: An efficient approach to single-query path planning. In IEEE International Conference on Robotics and Automation, volume 2, pages 995–1001. IEEE, 2000.

[78]

Sandeep Kumar and Ruth Nussinov. Close-range electrostatic interactions in proteins. ChemBioChem, 3(7):604–617, 2002.

[79]

D. Landau and K. Binder. A guide to Monte Carlo simulations in statistical physics. Cambridge university press, 2014.

[80]

B. Larsen and C. Aone. Fast and effective text mining using linear-time document clustering. In ACM SIGKDD, pages 16–22. ACM, 1999.

[81]

Y. LeCun and C. Cortes. The MNIST database of handwritten digits, 1998.

[82]

J.A. Lee and M. Verleysen. Nonlinear dimensionality reduction. Springer Verlag, 2007.

[83]

E. Levina and P. Bickel. The earth mover's distance is the mallows distance: Some insights from statistics. In IEEE ICCV, volume 2, pages 251–256. IEEE, 2001.

[84]

Z. Li and H.A. Scheraga. Monte Carlo-minimization approach to the multiple-minima problem in protein folding. PNAS, 84(19):6611–6615, 1987.

[85]

J. Lin. Divergence measures based on the Shannon entropy. Information Theory, IEEE Transactions on, 37(1):145–151, 1991.

[86]

L. Lo Conte, C. Chothia, and J. Janin. The atomic structure of protein-protein recognition sites. JMB, 285(5):2177–2198, 1999.

[87]

S. Loriot and F. Cazals. Modeling macro-molecular interfaces with Intervor. Bioinformatics, 26(7):964–965, 2010.

[88]

H. Mahmoud. Evolution of random search trees. Wiley-Interscience, 1992.

[89]

N. Malod-Dognin, A. Bansal, and F. Cazals. Characterizing the morphology of protein binding patches. Proteins: structure, function, and bioinformatics, 80(12):2652–2665,

[90]

S. Marillet, P. Boudinot, and F. Cazals. High resolution crystal structures leverage protein binding affinity predictions. Proteins: structure, function, and bioinformatics, 84(1):9–20, 2015.

[91]

Juliette Martin, Guillaume Letellier, Antoine Marin, Jean-François Taly, Alexandre G de Brevern, and Jean-François Gibrat. Protein secondary structure assignment revisited: a detailed analysis of different assignment methods. BMC structural biology, 5(1):1, 2005.

[92]

M. Meila. Comparing clusterings. 2002.

[93]

John W. Milnor. Morse Theory. Princeton University Press, Princeton, NJ, 1963.

[94]

J.W. Milnor. Morse Theory. Princeton University Press, Princeton, NJ, 1963.

[95]

M. Oakley, D.J. Wales, and R. Johnston. Energy landscape and global optimization for a frustrated model protein. The Journal of Physical Chemistry B, 115(39):11525–11529, 2011.

[96]

G. Monge. Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris, pages 666–704, 1781.

[97]

Francisco Moreno-Seco, Luisa Micó, and Jose Oncina. A modification of the laesa algorithm for approximated k-nn classification. Pattern Recognition Letters, 24(1):47–53, 2003.

[98]

M. Muja and D.G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In Int'l Conf. on Comp. vision: Theory and Applications, Lisboa, 2009. Springer.

[99]

Naoko Nakagawa and Michel Peyrard. The inherent structure landscape of a protein. Proc Natl Acad Sci U S A, 103(14):5279–5284, Apr 2006.

[100]

Jaroslav Nesetril, Eva Milková, and Helena Nesetrilová. Otakar Borůvka on minimum spanning tree problem (Translation of both the 1926 papers, comments, history). Discrete Mathematics, 233(1):3–36, 2001.

[101]

S. O'Hara and B.A. Draper. Are you using the right approximate nearest neighbor algorithm? In Applications of Computer Vision (WACV), 2013 IEEE Workshop on, pages 9–14. IEEE, 2013.

[102]

Patric R.J. Ostergard. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics, 120:197–207, 2002.

[103]

J. Parsons, J.B. Holmes, J.M. Rojas, J. Tsai, and C.E.M. Strauss. Practical conversion from torsion space to cartesian space for in silico protein synthesis. Journal of computational chemistry, 26(10):1063–1068, 2005.

[104]

J. Pérez-Vargas, T. Krey, C. Valansi, Ori O. Avinoam, A. Haouz, M. Jamin, H. Raveh-Barak, B. Podbilewicz, and F. Rey. Structural basis of eukaryotic cell-cell fusion. Cell, 157(2):407–419, 2014.

[105]

B. Phipson and G.K. Smyth. Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn. Statistical Applications in Genetics and Molecular Biology, 9(1), 2010.

[106]

Peter Pulay, Geza Fogarasi, Frank Pang, and James E Boggs. Systematic ab initio gradient calculation of molecular geometries, force constants, and dipole moment derivatives. Journal of the American Chemical Society, 101(10):2550–2560, 1979.

[107]

F. M. Richards. Areas, volumes, packing and protein structure. Ann. Rev. Biophys. Bioeng., 6:151–176, 1977.

[108]

C. Robert and G. Casella. Monte Carlo statistical methods. Springer Science & Business Media, 2013.

[109]

A. Roth, T. Dreyfus, C.H. Robert, and F. Cazals. Hybridizing rapidly growing random trees and basin hopping yields an improved exploration of energy landscapes. J. of Computational Chemistry, 37(8):739–752, 2016.

[110]

Y. Rubner, C. Tomasi, and L.J. Guibas. The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision, 40(2):99–121, 2000.

[111]

H. Samet. Foundations of multidimensional and metric data structures. Morgan Kaufmann, 2006.

[112]

H Bernhard Schlegel. Estimating the hessian for gradient-type geometry optimizations. Theoretica chimica acta, 66(5):333–340, 1984.

[113]

H.B. Schlegel. Exploring potential energy surfaces for chemical reactions: an overview of some practical methods. Journal of computational chemistry, 24(12):1514–1527, 2003.

[114]

G. Shakhnarovich, T. Darrell, and P. Indyk (Eds). Nearest-Neighbors Methods in Learning and Vision. Theory and Practice. MIT press, 2005.

[115]

J. Solomon, F. De Goes, G. Peyré, M. Cuturi, A. Butscher, A. Nguyen, T. Du, and L. Guibas. Convolutional wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), 34(4):66, 2015.

[116]

William Stein. Sage for Power Users. Sage, 2012.

[117]

Frank H. Stillinger and Thomas A. Weber. Hidden structure in liquids. Phys. Rev. A, 25:978–989, 1982.

[118]

A. Strehl and J. Ghosh. Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of machine learning research, 3(Dec):583–617, 2002.

[119]

T. Taverner, H. Hernández, M. Sharon, B.T. Ruotolo, D. Matak-Vinkovic, D. Devos, R.B. Russell, and C.V. Robinson. Subunit architecture of intact protein complexes from mass spectrometry and homology modeling. Accounts of chemical research, 41(5):617–627, 2008.

[120]

Gareth A Tribello, Michele Ceriotti, and Michele Parrinello. Using sketch-map coordinates to analyze and bias molecular dynamics simulations. Proceedings of the National Academy of Sciences, 109(14):5196–5201, 2012.

[121]

J. Tsai, R. Taylor, C. Chothia, and M. Gerstein. The packing density in proteins: standard radii and volumes. JMB, 290:253–266, 1999.

[122]

C. Villani. Topics in optimal transportation. Number 58. AMS, 2003.

[123]

D. Wales and J.P.K. Doye. Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. The Journal of Physical Chemistry A, 101(28):5111–5116, 1997.

[124]

D. J. Wales, M.A. Miller, and T.R. Walsh. Archetypal energy landscapes. Nature, 394(6695):758–760, 1998.

[125]

D. J. Wales. Energy Landscapes. Cambridge University Press, 2003.

[126]

Peter N Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In ACM SODA, volume 93, pages 311–321, 1993.