{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Conformational_ensemble_comparison" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example\n", "\n", "### Options\n", "\n", "The options of the `compare` function defined in the next cell are:\n", "- metric: `euclid` or `lrmsd`; selects the executable that is run\n", "- samplingFile1: plain-text file listing the conformations of the first ensemble as D-dimensional points\n", "- samplingFile2: plain-text file listing the conformations of the second ensemble, to compare against the first\n", "- identityThreshold: distance threshold below which two conformations are considered identical (default 0.01)\n", "- Hausdorff: run the Hausdorff-distance-based comparison\n", "- symmetricDifference: run the comparison that computes the intersection and the symmetric difference of the two ensembles\n", "- exactSearch: use a brute-force algorithm to search for nearest neighbors" ] },
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import shutil  # shutil.which requires Python 3.3+\n", "\n", "def compare(metric, samplingFile1, samplingFile2, identityThreshold = 0.01, \\\n", "            Hausdorff = True, symmetricDifference = True, exactSearch = True):\n", "\n", "    # start from an empty output directory\n", "    odir = \"tmp-results-%s\" % metric\n", "    if os.path.exists(odir):\n", "        shutil.rmtree(odir)\n", "    os.mkdir(odir)\n", "\n", "    # check executable exists and is visible\n", "    exe = shutil.which(\"sbl-conf-ensemble-comparison-%s.exe\" % metric)\n", "    if exe:\n", "        print((\"Using executable %s\\n\" % exe))\n", "        # format the threshold with %s so that fractional values such as 0.01 are not truncated\n", "        cmd = \"sbl-conf-ensemble-comparison-%s.exe --points-file %s --points-file %s \\\n", "                --identity-threshold %s --directory %s --verbose --output-prefix --log\"\\\n", "              % (metric, samplingFile1, samplingFile2, identityThreshold, odir)\n", "        # append the optional analyses\n", "        if Hausdorff:\n", "            cmd += \" --Hausdorff\"\n", "        if symmetricDifference:\n", "            cmd += \" --symmetric-difference\"\n", "        if exactSearch:\n", "            cmd += \" --exact-search\"\n", "        os.system(cmd)\n", "\n", "        # list the files written by the executable\n", "        cmd = \"ls %s\" % odir\n", "        ofnames = os.popen(cmd).readlines()\n", "        print(\"All output files:\", ofnames)\n", "\n", "        # find the log file and display it; the pattern is quoted so the shell does not expand it\n", "        cmd = \"find %s -name '*log.txt'\" % odir\n", "        lines = os.popen(cmd).readlines()\n", "        if len(lines) > 0:\n", "            lfname = lines[0].rstrip()\n", "            print(\"Log file is:\", lfname)\n", "            with open(lfname) as log:\n", "                for line in log:\n", "                    print(line.rstrip())\n", "\n", "    else:\n", "        print(\"Executable not found\")" ] },
{ "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Marker : Calculation Started\n", "Using executable /home/redantlabs/projects/sbl/bin/sbl-conf-ensemble-comparison-euclid.exe\n", "\n", "All output files: ['sbl-conf-ensemble-comparison-euclid__points_file_himmelblau_grid_points__points_file_himmelblau_rand_points__common_conformations.txt\\n', 'sbl-conf-ensemble-comparison-euclid__points_file_himmelblau_grid_points__points_file_himmelblau_rand_points__log.txt\\n', 'sbl-conf-ensemble-comparison-euclid__points_file_himmelblau_grid_points__points_file_himmelblau_rand_points__msf.txt\\n']\n", "Log file is: tmp-results-euclid/sbl-conf-ensemble-comparison-euclid__points_file_himmelblau_grid_points__points_file_himmelblau_rand_points__log.txt\n", "Running: sbl-conf-ensemble-comparison-euclid.exe --points-file data/himmelblau_grid_points.txt --points-file data/himmelblau_rand_points.txt --identity-threshold 0 --directory tmp-results-euclid --verbose --output-prefix --log --Hausdorff --symmetric-difference\n", "\n", "Conformations Loader\n", "Statistics:\n", "Conformations File Loader statistics:\n", "Number of loaded conformations ensembles: 2\n", "Details for each ensemble:\n", "-- ensemble 1:\n", "-- -- Dimension: 2\n", "-- -- Number of conformations: 10000\n", "-- ensemble 2:\n", "-- -- Dimension: 2\n", "-- -- Number of conformations: 10000\n", "\n", "Target Spatial Search\n", "Creating database of size 10000 ...\n", 
"\n", "0% 10 20 30 40 50 60 70 80 90 100%\n", "|----|----|----|----|----|----|----|----|----|----|\n", "***************************************************\n", "Statistics:\n", "\n", "Source Spatial Search\n", "Creating database of size 10000 ...\n", "\n", "0% 10 20 30 40 50 60 70 80 90 100%\n", "|----|----|----|----|----|----|----|----|----|----|\n", "***************************************************\n", "Statistics:\n", "\n", "MSF analysis\n", "Computing nearest neighbors of sources...\n", "\n", "0% 10 20 30 40 50 60 70 80 90 100%\n", "|----|----|----|----|----|----|----|----|----|----|\n", "***************************************************\n", "Computing nearest neighbors of targets...\n", "\n", "0% 10 20 30 40 50 60 70 80 90 100%\n", "|----|----|----|----|----|----|----|----|----|----|\n", "***************************************************\n", "Statistics:\n", "-- Number of source vertices with n ingoing edges (#ingoing edges, #vertices): (1, 3621) (2, 1840) (3, 633) (4, 150) (5, 30) (6, 6) (7, 2)\n", "-- Number of target vertices with n ingoing edges (#ingoing edges, #vertices): (1, 5355) (2, 1802) (3, 276) (4, 48) (5, 3) (6, 1)\n", "-- Statistics on distances from source (sum, mean, min, max): (50.237510, 0.005024, 0.000020, 0.019628)\n", "-- Statistics on distances from target (sum, mean, min, max): (38.603903, 0.003860, 0.000000, 0.010505)\n", "\n", "Report...\n", "\n", "Symmetric difference analysis\n", "Statistics:\n", "-- 0 common conformations.\n", "-- 0 conformations in first found in second ensemble.\n", "-- 0 conformations in second found in first ensemble.\n", "\n", "Report...\n", "\n", "Hausdorff Distance Comparison\n", "Statistics:\n", "Haudorff Distance statistics...\n", "-- One way :\n", " min / mean / max: 0.000020 0.005024 0.019628\n", " one sided Hausdorff distance: 0.019628\n", "-- Opposite way :\n", " min / mean / max: 0.000020 0.005024 0.010505\n", " one sided Hausdorff distance: 0.010505\n", "-- Hausdorff distance: 0.010505\n", 
"\n", "Report...\n", "\n", "End Run\n", "\n", "General Statistics:\n", "\n", "Times elapsed for computations (in seconds):\n", "-- Source Spatial Search: 0.204825\n", "-- Target Spatial Search: 0.083980\n", "-- MSF analysis: 0.376795\n", "-- Hausdorff Distance Comparison: 0.001748\n", "-- Symmetric difference analysis: 0.000306\n", "Total: 0.667654\n", "\n", "Marker : Calculation Ended\n" ] } ], "source": [ "print(\"Marker : Calculation Started\")\n", "compare(\"euclid\", \"data/himmelblau_grid_points.txt\", \"data/himmelblau_rand_points.txt\")\n", "print(\"Marker : Calculation Ended\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Marker : Calculation Started\")\n", "compare(\"lrmsd\", \"data/bln69_sampling.txt\", \"data/bln69_10_lowest_minima.txt\")\n", "#compare(\"euclid\", \"data/bln69_sampling.txt\", \"data/bln69_10_lowest_minima.txt\")\n", "print(\"Marker : Calculation Ended\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Example of processed output: comparison between a set of selected precomputed minima of BLN69 and minima computed with the Landscape_explorer application. Only matched samples are shown and drawn." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.4" } }, "nbformat": 4, "nbformat_minor": 2 }