{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multiple_interface_string_alignments (MISA)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " \n", "This notebook uses files provided in the following directories: ```pdb```, ```misa-RBD-ACE2-cmp``` and ```misa-RBD-IG```." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 1: comparing the RBD of SARS-CoV-1 and SARS-CoV-2\n", "\n", "This first example provides a step-by-step comparison of the two RBDs. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: generating all colored MISA with sbl-misa.py\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, the MISA is computed for each of the chains specified in ```ifile-misa.txt```.\n", "\n", "Content of ```./misa-RBD-ACE2-cmp/ifile-misa.txt``` :\n", "\n", "```\n", "# Windows for ACE2-bound-to-SARS-CoV-1\n", "[ACE2-bound-to-SARS-CoV-1_0 (19, 83) (321,393)]\n", "\n", "./pdb/2ajf.pdb (A, E, SARS-CoV-1-RBD, bound) (B, A, ACE2-bound-to-SARS-CoV-1, bound)\n", "./pdb/2ajf.pdb (A, F, SARS-CoV-1-RBD, bound) (B, B, ACE2-bound-to-SARS-CoV-1, bound)\n", "./pdb/5x58.pdb (A, A, SARS-CoV-1-RBD, unbound-closed) \n", "./pdb/6crz.pdb (A, C, SARS-CoV-1-RBD, unbound-closed)\n", "\n", "# Specification for SARS-CoV-2\n", "./pdb/6m0j.pdb (C, E, SARS-CoV-2-RBD, bound) (D, A, ACE2-bound-to-SARS-CoV-2, bound)\n", "./pdb/6lzg.pdb (C, B, SARS-CoV-2-RBD, bound) (D, A, ACE2-bound-to-SARS-CoV-2, bound)\n", "./pdb/6vxx.pdb (C, A, SARS-CoV-2-RBD, unbound-closed)\n", "./pdb/6vyb.pdb (C, A, SARS-CoV-2-RBD, unbound-closed) \n", "```\n", "\n", "Each line corresponds to a complex, or to an unbound structure if the structure is alone on its line.\n", "For each complex, we provide one or two bound structures, as well as two unbound structures, so that the $\\Delta\\_ASA$ induced by the conformational change can be computed. 
Without unbound structures, the $\\Delta\\_ASA$ is not computed.\n", "\n", "The window specification at the top of the file restricts the displayed residue ranges for the chain ACE2-bound-to-SARS-CoV-1 (residues 19-83 and 321-393), in order to compact the output.\n", "\n", "The details of the specification are given in the paper." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "#!/usr/bin/python3\n", "import os\n", "import subprocess\n", "import re\n", "import shutil\n", "from IPython.display import display, HTML\n", "from collections import defaultdict\n", "from IPython.display import IFrame\n", "from SBL import SBL_pytools\n", "from SBL_pytools import SBL_pytools as sblpyt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running /home/stephane/StageM1/dev2/sbl/Applications/Multiple_interface_string_alignment/python/sbl-misa.py -ifile ./misa-RBD-ACE2-cmp/ifile-misa.txt -prefix_dir ./misa-RBD-ACE2-cmp -prefix demo-misa-1 --verbose 0 -normalize_b_factor 2\n", "\n", "Done\n" ] } ], "source": [ "exe = shutil.which('sbl-misa.py')\n", "if not exe: # shutil.which returns None when the executable is not found\n", "    raise FileNotFoundError('sbl-misa.py not in your PATH')\n", "ifile = './misa-RBD-ACE2-cmp/ifile-misa.txt'\n", "prefix_dir = './misa-RBD-ACE2-cmp' # Prepended to every input and output directory\n", "# This shortens the specification of the output sub-directories.\n", "\n", "prefix = 'demo-misa-1' # Prepended to the names of the output files\n", "verbose = '0'\n", "normalize_b_factor = '2' # Normalize with respect to the displayed residues only\n", "cmd = [exe, \"-ifile\", ifile, \"-prefix_dir\", prefix_dir, '-prefix', prefix, '--verbose', verbose, '-normalize_b_factor', normalize_b_factor]\n", "print('Running %s -ifile %s -prefix_dir %s -prefix %s --verbose %s -normalize_b_factor %s' % (exe, ifile, prefix_dir, prefix, verbose, normalize_b_factor))\n", "s = subprocess.check_output(cmd, 
encoding='UTF-8')\n", "print('\\nDone')\n", "#print(s)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```sbl-misa.py``` displays the MISA with several colorings showing complementary data. \n", "\n", "One summary figure is produced for each chain specified in ```ifile-misa.txt```. For example, here are the figures generated for SARS-CoV-2-RBD and for ACE2-bound-to-SARS-CoV-1 (for which the effect of the window specification, which restricts the range of displayed residues, can be observed):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n",
       "\n",
       "Legend for the amino acids (aa) encoding :\n",
       " \n",
       " For aa not at the interface :\n",
       "   _  if aa is missing\n",
       "   - if aa is present\n",
       " \n",
       " For aa at the interface :\n",
       "   *  if aa is missing\n",
       "   X if consensus aa (= most frequent among the bound structures, and in case of tie the first by alphabetical order)\n",
       "   x  otherwise \n",
       "   (For bound structure files only) : \n",
       "   x or X  if the aa is part of the consensus interface but not part of the interface of this file \n",
       " \n",
       "\n",
       "
MISA SSE for MISA-ID SARS-CoV-2-RBD_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ\n", "\n", "\n", "\n", "
MISA BSA for MISA-ID SARS-CoV-2-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Buried Surface Area (BSA) (in Å2) \n", " \n", "\n", "| bound-6m0j-E : total bsa = 887.29 Å2 | bound-6lzg-B : total bsa = 1120.20 Å2 \n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "\n", "\n", "
MISA Delta_ASA for MISA-ID SARS-CoV-2-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Bound structures : In light grey residues in bound structure for which miss corresponding ASA values in the unbound structures\n", " Per residue i, delta_ASA = ASA[i] - mean(ASA[i]) (mean is computed using the unbound structures) (in Å2)\n", " \n", " \n", "Unbound structures : Accessible Surface Area (ASA) (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ\n", "\n", "\n", "\n", "
MISA B_factor for MISA-ID SARS-CoV-2-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " B-Factor (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ\n", "\n", "\n", "\n", "\n", "
\n", " \n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#IFrame(src='./misa-RBD-ACE2-cmp/MISA/SARS-CoV-2-RBD_0-demo-misa-1.html', width=\"100%\", height=600)\n", "display(HTML('./misa-RBD-ACE2-cmp/MISA/SARS-CoV-2-RBD_0-demo-misa-1.html'))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n",
       "\n",
       "Legend for the amino acids (aa) encoding :\n",
       " \n",
       " For aa not at the interface :\n",
       "   _  if aa is missing\n",
       "   - if aa is present\n",
       " \n",
       " For aa at the interface :\n",
       "   *  if aa is missing\n",
       "   X if consensus aa (= most frequent among the bound structures, and in case of tie the first by alphabetical order)\n",
       "   x  otherwise \n",
       "   (For bound structure files only) : \n",
       "   x or X  if the aa is part of the consensus interface but not part of the interface of this file \n",
       " \n",
       "\n",
       "
MISA SSE for MISA-ID ACE2-bound-to-SARS-CoV-1_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "\n", "Residue Index -20--------30--------40--------50--------60--------70--------80-- 321------330-------340-------350-------360-------370-------380-------390-\n", " | | | | | | | | | | | | | | | \n", "bound-2ajf-A, res :2.9 Å, 33 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", "bound-2ajf-B, res :2.9 Å, 27 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", " \n", " \n", "
MISA BSA for MISA-ID ACE2-bound-to-SARS-CoV-1_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Buried Surface Area (BSA) (in Å2) \n", " \n", "\n", "| bound-2ajf-A : total bsa = 888.38 Å2 | bound-2ajf-B : total bsa = 817.21 Å2 \n", "\n", "Residue Index -20--------30--------40--------50--------60--------70--------80-- 321------330-------340-------350-------360-------370-------380-------390-\n", " | | | | | | | | | | | | | | | \n", "bound-2ajf-A, res :2.9 Å, 33 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", "bound-2ajf-B, res :2.9 Å, 27 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", " \n", " \n", "
MISA Delta_ASA for MISA-ID ACE2-bound-to-SARS-CoV-1_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Bound structures : In light grey residues in bound structure for which miss corresponding ASA values in the unbound structures\n", " Per residue i, delta_ASA = ASA[i] - mean(ASA[i]) (mean is computed using the unbound structures) (in Å2)\n", " \n", " \n", "Unbound structures : Accessible Surface Area (ASA) (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index -20--------30--------40--------50--------60--------70--------80-- 321------330-------340-------350-------360-------370-------380-------390-\n", " | | | | | | | | | | | | | | | \n", "bound-2ajf-A, res :2.9 Å, 33 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", "bound-2ajf-B, res :2.9 Å, 27 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", " \n", " \n", "
MISA B_factor for MISA-ID ACE2-bound-to-SARS-CoV-1_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " B-Factor (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index -20--------30--------40--------50--------60--------70--------80-- 321------330-------340-------350-------360-------370-------380-------390-\n", " | | | | | | | | | | | | | | | \n", "bound-2ajf-A, res :2.9 Å, 33 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", "bound-2ajf-B, res :2.9 Å, 27 interf res S---EQ-KTF-DK--H--ED--YQ--L---------------------------------L--MY ---TQGF-EN----------------------KGDFR----------------------------AAQP---R\n", " \n", " \n", " \n", "\n", "
\n", " \n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#IFrame(src='./misa-RBD-ACE2-cmp/MISA/ACE2-bound-to-SARS-CoV-1_0-demo-misa-1.html', width=\"100%\", height=600)\n", "display(HTML('misa-RBD-ACE2-cmp/MISA/ACE2-bound-to-SARS-CoV-1_0-demo-misa-1.html'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: mixing selected MISA with sbl-misa-mix.py\n", "\n", "Once these individual figures have been generated, a call to ```sbl-misa-mix.py``` displays several MISA_id side by side in the same figure. In the sequel, we focus on the following three colored MISA: SSE, BSA and Delta_ASA.\n", "\n", "```sbl-misa-mix.py``` parses the specification file ```ifile-misa-mix.txt```, which gives the location of the directory containing the input data, the MISA_id to be displayed, and the colorings to be displayed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Content of ```./misa-RBD-ACE2-cmp/ifile-misa-mix.txt``` :\n", "\n", "```\n", "# List of input directories\n", "localisation (./misa-RBD-ACE2-cmp/MISA)\n", "\n", "# List of MISA_chain_ids\n", "misa_chain_id (SARS-CoV-1-RBD_0, SARS-CoV-2-RBD_0)\n", "\n", "# List of colorings of interest\n", "coloring (SSE, BSA, Delta_ASA)\n", "```" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running /home/stephane/StageM1/dev2/sbl/Applications/Multiple_interface_string_alignment/python/sbl-misa-mix.py -mix_ifile ./misa-RBD-ACE2-cmp/ifile-misa-mix.txt -prefix demo-mix-1 -odir ./misa-RBD-ACE2-cmp --verbose 0\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-mix.py')\n", "if not exe: # shutil.which returns None when the executable is not found\n", "    raise FileNotFoundError('sbl-misa-mix.py not in your PATH')\n", "prefix = 'demo-mix-1' # Prepended to the names of the output files\n", "mix_ifile = './misa-RBD-ACE2-cmp/ifile-misa-mix.txt' # Specification file\n", "odir = './misa-RBD-ACE2-cmp' # Output 
directory\n", "verbose = '0'\n", "cmd = [exe, \"-mix_ifile\", mix_ifile, '-prefix', prefix, '-odir', odir, '--verbose', verbose]\n", "print('Running %s -mix_ifile %s -prefix %s -odir %s --verbose %s' % (exe, mix_ifile, prefix, odir, verbose))\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "\n", "#print(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first figure of the article corresponds to the output of ```sbl-misa-mix.py``` , run with the ```ifile-misa-mix.txt``` presented above :" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n",
       "Legend for the amino acids (aa) encoding :\n",
       "\n",
       "For aa not at the interface :\n",
       "_  if aa is missing\n",
       "- if aa is present\n",
       "\n",
       "For aa at the interface :\n",
       "*  if aa is missing\n",
       "X if consensus aa (= most frequent among the bound structures, and in case of tie the first by alphabetical order)\n",
       "x  otherwise\n",
       "(For bound structure files only) : \n",
       "x or X  if the aa is part of the consensus interface but not part of the interface of this file )\n",
       "\n",
       "
MISA SSE for MISA-ID SARS-CoV-1-RBD_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490\n", " | | | | | | | | | | | \n", "bound-2ajf-E, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "bound-2ajf-F, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "unbound-closed-5x58-A, res :3.2 Å, 0 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "unbound-closed-6crz-C, res :3.3 Å, 0 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "\n", "\n", "\n", "
MISA SSE for MISA-ID SARS-CoV-2-RBD_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ\n", "\n", "\n", "\n", "\n", "
MISA BSA for MISA-ID SARS-CoV-1-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Buried Surface Area (BSA) (in Å2) \n", " \n", "\n", "| bound-2ajf-E : total bsa = 925.41 Å2 | bound-2ajf-F : total bsa = 864.87 Å2 \n", "Residue Index 390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490\n", " | | | | | | | | | | | \n", "bound-2ajf-E, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "bound-2ajf-F, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "\n", "\n", "\n", "
MISA BSA for MISA-ID SARS-CoV-2-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Buried Surface Area (BSA) (in Å2) \n", " \n", "\n", "| bound-6m0j-E : total bsa = 887.29 Å2 | bound-6lzg-B : total bsa = 1120.20 Å2 \n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "\n", "\n", "\n", "
MISA Delta_ASA for MISA-ID SARS-CoV-1-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Bound structures : In light grey residues in bound structure for which miss corresponding ASA values in the unbound structures\n", " Per residue i, delta_ASA = ASA[i] - mean(ASA[i]) (mean is computed using the unbound structures) (in Å2)\n", " \n", " \n", "Unbound structures : Accessible Surface Area (ASA) (in Å2) \n", " \n", "\n", "\n", "Residue Index 390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490\n", " | | | | | | | | | | | \n", "bound-2ajf-E, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "bound-2ajf-F, res :2.9 Å, 29 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "unbound-closed-5x58-A, res :3.2 Å, 0 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "unbound-closed-6crz-C, res :3.3 Å, 0 interf res K--D--Q-------VI--Y-----------------R-----S---Y---Y-YL----------------F-PD------P-LNCY---NDYG-YTTTGI-YQ\n", "\n", "\n", "\n", "\n", "
MISA Delta_ASA for MISA-ID SARS-CoV-2-RBD_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Bound structures : In light grey residues in bound structure for which miss corresponding ASA values in the unbound structures\n", " Per residue i, delta_ASA = ASA[i] - mean(ASA[i]) (mean is computed using the unbound structures) (in Å2)\n", " \n", " \n", "Unbound structures : Accessible Surface Area (ASA) (in Å2) \n", " \n", "\n", "\n", "Residue Index 403----410-------420-------430-------440-------450-------460-------470-------480-------490-------500----\n", " | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ\n", "\n", "\n", "\n", "\n", "
\n", " \n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#IFrame(src='./misa-RBD-ACE2-cmp/demo-mix-1_SSE_BSA_Delta_ASA_SARS-CoV-1-RBD_0_SARS-CoV-2-RBD_0_mixed_figure.html', width=\"100%\", height=600)\n", "display(HTML('./misa-RBD-ACE2-cmp/demo-mix-1_SSE_BSA_Delta_ASA_SARS-CoV-1-RBD_0_SARS-CoV-2-RBD_0_mixed_figure.html'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: investigating buried surface areas of selected residues with sbl-misa-bsa.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To study an interface further, we retrieve the BSA values of user-defined residues. This is the purpose of ```sbl-misa-bsa.py```.\n", "\n", "The program takes as input an ```.xml``` file generated by ```sbl-intervor-ABW-atomic.exe``` (run with the ```--output-prefix``` option), as well as a specification file listing the residues of interest, as shown below:\n", "\n", "Content of ```./misa-RBD-ACE2-cmp/ifile-misa-bsa.txt``` :\n", "\n", "```[(A, E, 303), (A, E, 403), (A, E, 449), (A, E, 455), (A, E, 486), (A, E, 502), (B, A, 79), (B, A, 35)] ```" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-bsa.py\n", "XML: 1 / 1 files were loaded\n", "\n", "####################################################\n", "BSA for the intervor_partner A\n", "\n", "First according to the provided list of residues :\n", "\n", "Chain E Residue 303 : bsa = NA Å^2\n", "Chain E Residue 486 : bsa = 98.531 Å^2\n", "Chain E Residue 502 : bsa = 41.882 Å^2\n", "Chain E Residue 455 : bsa = 41.547 Å^2\n", "Chain E Residue 449 : bsa = 37.186 Å^2\n", "Chain E Residue 403 : bsa = 0.096 Å^2\n", "\n", "Cumulated bsa for intervor_partner A - chain E is 219.242 Å^2\n", "Cumulated bsa for intervor_partner A is 219.242 Å^2 \n", "\n", "Then for the other residues (we only display the residues with a BSA greater 
than 0.001 Å^2) :\n", "\n", "Chain E Residue 500 : bsa = 91.397 Å^2\n", "Chain E Residue 505 : bsa = 86.628 Å^2\n", "Chain E Residue 489 : bsa = 73.470 Å^2\n", "Chain E Residue 493 : bsa = 60.387 Å^2\n", "Chain E Residue 498 : bsa = 55.658 Å^2\n", "Chain E Residue 456 : bsa = 45.129 Å^2\n", "Chain E Residue 475 : bsa = 38.708 Å^2\n", "Chain E Residue 487 : bsa = 38.399 Å^2\n", "Chain E Residue 501 : bsa = 30.063 Å^2\n", "Chain E Residue 417 : bsa = 27.641 Å^2\n", "Chain E Residue 496 : bsa = 22.986 Å^2\n", "Chain E Residue 453 : bsa = 22.979 Å^2\n", "Chain E Residue 503 : bsa = 21.688 Å^2\n", "Chain E Residue 484 : bsa = 13.355 Å^2\n", "Chain E Residue 446 : bsa = 10.431 Å^2\n", "Chain E Residue 476 : bsa = 10.198 Å^2\n", "Chain E Residue 445 : bsa = 9.552 Å^2\n", "Chain E Residue 473 : bsa = 6.310 Å^2\n", "Chain E Residue 490 : bsa = 1.404 Å^2\n", "Chain E Residue 477 : bsa = 1.387 Å^2\n", "Chain E Residue 485 : bsa = 0.278 Å^2\n", "\n", "Cumulated bsa for intervor_partner A - chain E is 668.049 Å^2\n", "Cumulated bsa for intervor_partner A is 668.049 Å^2 \n", "\n", "\n", "The bsa for intervor_partner A is 668.049 Å^2 (with respect to the provided residue ids)\n", "\n", "####################################################\n", "BSA for the intervor_partner B\n", "\n", "First according to the provided list of residues :\n", "\n", "Chain A Residue 79 : bsa = 24.518 Å^2\n", "Chain A Residue 35 : bsa = 17.735 Å^2\n", "\n", "Cumulated bsa for intervor_partner B - chain A is 42.254 Å^2\n", "Cumulated bsa for intervor_partner B is 42.254 Å^2 \n", "\n", "Then for the other residues (we only display the residues with a BSA greater than 0.001 Å^2) :\n", "\n", "Chain A Residue 353 : bsa = 97.689 Å^2\n", "Chain A Residue 31 : bsa = 93.757 Å^2\n", "Chain A Residue 34 : bsa = 68.567 Å^2\n", "Chain A Residue 27 : bsa = 66.460 Å^2\n", "Chain A Residue 24 : bsa = 53.133 Å^2\n", "Chain A Residue 42 : bsa = 44.506 Å^2\n", "Chain A Residue 41 : bsa = 43.731 Å^2\n", "Chain A Residue 
30 : bsa = 40.852 Å^2\n", "Chain A Residue 83 : bsa = 38.720 Å^2\n", "Chain A Residue 38 : bsa = 34.191 Å^2\n", "Chain A Residue 354 : bsa = 31.103 Å^2\n", "Chain A Residue 330 : bsa = 28.685 Å^2\n", "Chain A Residue 82 : bsa = 28.595 Å^2\n", "Chain A Residue 45 : bsa = 25.525 Å^2\n", "Chain A Residue 324 : bsa = 17.235 Å^2\n", "Chain A Residue 28 : bsa = 16.649 Å^2\n", "Chain A Residue 37 : bsa = 16.266 Å^2\n", "Chain A Residue 355 : bsa = 11.668 Å^2\n", "Chain A Residue 357 : bsa = 11.206 Å^2\n", "Chain A Residue 19 : bsa = 10.478 Å^2\n", "Chain A Residue 393 : bsa = 9.348 Å^2\n", "Chain A Residue 325 : bsa = 8.452 Å^2\n", "Chain A Residue 326 : bsa = 4.140 Å^2\n", "Chain A Residue 386 : bsa = 0.593 Å^2\n", "\n", "Cumulated bsa for intervor_partner B - chain A is 801.550 Å^2\n", "Cumulated bsa for intervor_partner B is 801.550 Å^2 \n", "\n", "\n", "The bsa for intervor_partner B is 801.550 Å^2 (with respect to the provided residue ids)\n", "\n", "\n", "\n", "Done\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-bsa.py')\n", "if not exe: # shutil.which returns None when the executable is not found\n", "    raise FileNotFoundError('sbl-misa-bsa.py not in your PATH')\n", "specfile = './misa-RBD-ACE2-cmp/ifile-misa-bsa.txt' # Path to the specification file listing the residues to be studied\n", "xmlfile = './misa-RBD-ACE2-cmp/input-data/intervor/sbl-intervor-ABW-atomic__radius_water_1dot4__f_6m0j__p_4__P_E__P_A___alpha_0__buried_surface_area.xml' # Path to the .xml input file \n", "cmd = [exe, \"-specfile\", specfile, '-xmlfile', xmlfile]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: comparing interfaces and MISA with sbl-misa-diff.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```sbl-misa-diff.py``` compares the interfaces of two peer chains: it identifies the residues specific to each chain and the shared residues, and reports the BSA of these residues.\n", "\n", "It can compare Voronoi 
interfaces and/or manually defined interfaces.\n", "Manually defined interfaces must be specified in the same format as ```SARS-CoV-1-RBD-Harisson-2005.txt``` (see below): the first line is the name given to the chain, and each subsequent line describes one residue (one-letter amino acid code followed by the residue index).\n", "\n", "Content of ```./misa-RBD-ACE2-cmp/SARS-CoV-1-RBD-Harisson-2005.txt``` :\n", "\n", "```\n", "harisson-interface\n", "T402 \n", "R426 \n", "Y436 \n", "Y440 \n", "Y442 \n", "L472 \n", "N473 \n", "Y475 \n", "N479\n", "Y484\n", "T486 \n", "T487 \n", "G488 \n", "Y491\n", "```\n", "\n", "The corresponding specification file is the following:\n", "\n", "Content of ```./misa-RBD-ACE2-cmp/ifile-misa-diff.txt``` :\n", "\n", "```\n", "(./misa-RBD-ACE2-cmp/MISA/raw-data, 2ajf, E)\n", "(./misa-RBD-ACE2-cmp/SARS-CoV-1-RBD-Harisson-2005.txt)\n", "```\n", "\n", "The output ```.txt``` file is displayed below." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-diff.py\n", "Reading spec file\n", "Reading interface file\n", "Comparing interfaces\n", "created ./misa-RBD-ACE2-cmp/comparison-interface-2ajf-E-with-harisson-interface.txt\n", "Done\n", "\n", "\n", "++Showing file misa-RBD-ACE2-cmp/comparison-interface-2ajf-E-with-harisson-interface.txt\n", "Comparison of the Buried Surface Area (BSA) and of the nature of the residues for the interface residues.\n", "(Missing data are denoted by \"NA\")\n", "\n", "16 exclusive residues at the interface of chain 2ajf-E :\n", " BSA-2ajf-E Names-2ajf-E\n", "( , 390, ) 9.34 K\n", "( , 393, ) 4.92 D\n", "( , 404, ) 14.90 V\n", "( , 405, ) 1.66 I\n", "( , 408, ) 3.25 Y\n", "( , 432, ) 6.46 S\n", "( , 443, ) 34.93 L\n", "( , 460, ) 3.21 F\n", "( , 462, ) 49.79 P\n", "( , 463, ) 11.89 D\n", "( , 470, ) 2.18 P\n", "( , 480, ) 0.45 D\n", "( , 481, ) 11.17 Y\n", "( , 482, ) 14.59 G\n", "( , 489, ) 32.87 I\n", "( , 492, ) 3.55 Q\n", "\n", "1 exclusive 
residues at the interface of chain harisson-interface :\n", " BSA-harisson-interface Names-harisson-interface\n", "( , 402, ) NA NA\n", "\n", "13 shared residues :\n", " BSA-2ajf-E BSA-harisson-interface Names-2ajf-E Names-harisson-interface\n", "( , 426, ) 42.57 NA R NA\n", "( , 436, ) 36.67 NA Y NA\n", "( , 440, ) 32.86 NA Y NA\n", "( , 442, ) 60.05 NA Y NA\n", "( , 472, ) 75.32 NA L NA\n", "( , 473, ) 51.06 NA N NA\n", "( , 475, ) 83.14 NA Y NA\n", "( , 479, ) 23.05 NA N NA\n", "( , 484, ) 62.65 NA Y NA\n", "( , 486, ) 86.47 NA T NA\n", "( , 487, ) 43.74 NA T NA\n", "( , 488, ) 41.77 NA G NA\n", "( , 491, ) 80.91 NA Y NA\n", "--Done\n", "\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-diff.py')\n", "if not exe: # shutil.which returns None when the executable is not found\n", "    raise FileNotFoundError('sbl-misa-diff.py not in your PATH')\n", "specfile = './misa-RBD-ACE2-cmp/ifile-misa-diff.txt'\n", "odir = './misa-RBD-ACE2-cmp' # Output directory\n", "cmd = [exe, \"-specfile\", specfile, \"-odir\", odir]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)\n", "#sblpyt.show_this_text_file('misa-RBD-ACE2-cmp/comparison-interface-2ajf-E-with-Harisson-interface-RBD-CoV1-ACE2-Science-2005.txt')\n", "sblpyt.show_this_text_file('misa-RBD-ACE2-cmp/comparison-interface-2ajf-E-with-harisson-interface.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 2: structure of the RBD of SARS-CoV-1 and SARS-CoV-2 bound to immunoglobulins\n", "\n", "This example compares the RBD interface of SARS-CoV-1 and SARS-CoV-2 with ACE2, or with different immunoglobulins (VHH72, CR3022, 2F6).\n", "\n", "The RBD of SARS-CoV-2 is involved in several complexes, but the same structure cannot appear more than once in a single specification file. It is thus necessary to write one ```ifile-misa.txt``` per complex. One subdirectory per complex, containing only the relevant ```ifile-misa.txt```, is provided." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1 : generating all colored MISA for each complex with sbl-misa.py\n", "\n", "Each ```ifile-misa.txt``` contains one or two examples of bound RBD, as well as two examples of unbound RBD, so that the $\\Delta\\_ASA$ induced by the conformational change can be computed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Content of ```./misa-RBD-IG/RBD-VHH72/ifile-misa.txt``` :\n", "```\n", "./pdb/6waq.pdb (A, B, SARS-CoV-1-RBD-bound-to-VHH72, bound) (B, A, VHH72, bound)\n", "./pdb/5x58.pdb (A, A, SARS-CoV-1-RBD-bound-to-VHH72, unbound-closed) \n", "./pdb/6crz.pdb (A, C, SARS-CoV-1-RBD-bound-to-VHH72, unbound-closed)\n", "```\n", "Content of ```./misa-RBD-IG/RBD-ACE2/ifile-misa.txt``` :\n", "```\n", "[SARS-CoV-2-RBD-bound-to-ACE2_0 (346, 528)]\n", "[SARS-CoV-2-RBD-bound-to-ACE2_0 (355, 494)]\n", "\n", "# Specification SARS-CoV-1\n", "./pdb/2ajf.pdb (A, E, SARS-CoV-1-RBD-bound-to-ACE2, bound) (B, A, ACE2-bound-to-CoV-1, bound)\n", "./pdb/2ajf.pdb (A, F, SARS-CoV-1-RBD-bound-to-ACE2, bound) (B, B, ACE2-bound-to-CoV-1, bound)\n", "./pdb/5x58.pdb (A, A, SARS-CoV-1-RBD-bound-to-ACE2, unbound-closed) \n", "./pdb/6crz.pdb (A, C, SARS-CoV-1-RBD-bound-to-ACE2, unbound-closed)\n", "\n", "# Specification SARS-CoV-2\n", "./pdb/6m0j.pdb (A, E, SARS-CoV-2-RBD-bound-to-ACE2, bound) (B, A, ACE2-bound-to-CoV-2, bound)\n", "./pdb/6lzg.pdb (A, B, SARS-CoV-2-RBD-bound-to-ACE2, bound) (B, A, ACE2-bound-to-CoV-2, bound)\n", "./pdb/6vxx.pdb (A, A, SARS-CoV-2-RBD-bound-to-ACE2, unbound-closed)\n", "./pdb/6vyb.pdb (A, A, SARS-CoV-2-RBD-bound-to-ACE2, unbound-closed)\n", "\n", "```\n", "Content of ```./misa-RBD-IG/RBD-CR3022/ifile-misa.txt``` :\n", "```\n", "[SARS-CoV-2-RBD-bound-to-CR3022_0 (346, 528)]\n", "./pdb/6yla.pdb (A, E, SARS-CoV-2-RBD-bound-to-CR3022, bound) (B, H, CR3022-antibody, bound) (B, L, CR3022-antibody, bound)\n", "./pdb/6yla.pdb (A, A, SARS-CoV-2-RBD-bound-to-CR3022, bound) (B, B, 
CR3022-antibody, bound) (B, C, CR3022-antibody, bound)\n", "./pdb/6vxx.pdb (A, A, SARS-CoV-2-RBD-bound-to-CR3022, unbound-closed)\n", "./pdb/6vyb.pdb (A, A, SARS-CoV-2-RBD-bound-to-CR3022, unbound-closed)\n", "```\n", "Content of ```./misa-RBD-IG/RBD-P2B-2F6/ifile-misa.txt``` :\n", "```\n", "[SARS-CoV-2-RBD-bound-to-2F6_0 (346, 528)]\n", "./pdb/7bwj.pdb (A, E, SARS-CoV-2-RBD-bound-to-2F6, bound) (B, H, 2F6-antibody, bound) (B, L, 2F6-antibody, bound)\n", "./pdb/6vxx.pdb (A, A, SARS-CoV-2-RBD-bound-to-2F6, unbound-closed)\n", "./pdb/6vyb.pdb (A, A, SARS-CoV-2-RBD-bound-to-2F6, unbound-closed)\n", "```" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done for complex RBD-VHH72\n", "Done for complex RBD-ACE2\n", "Done for complex RBD-CR3022\n", "Done for complex RBD-P2B-2F6\n" ] } ], "source": [ "exe = shutil.which('sbl-misa.py')\n", "if not exe: # if exe == None\n", "    print('sbl-misa.py not in your PATH')\n", "for dir_complex in ['RBD-VHH72','RBD-ACE2', 'RBD-CR3022','RBD-P2B-2F6']:\n", "    prefix_dir = './misa-RBD-IG/%s' % dir_complex # Prepended to every input and output directory\n", "    ifile = '%s/ifile-misa.txt' % prefix_dir # Specification file\n", "    prefix = 'demo-misa-2' # Prepended to the name of every output file\n", "    verbose = '0'\n", "    normalize_b_factor = '2' # Normalize with respect to the displayed residues only\n", "    cmd = [exe, \"-ifile\", ifile, \"-prefix_dir\", prefix_dir, '-prefix', prefix, '--verbose', verbose, '-normalize_b_factor', normalize_b_factor]\n", "    s = subprocess.check_output(cmd, encoding='UTF-8')\n", "    print('Done for complex %s' % dir_complex)\n", "    #print(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As in the first example, a figure containing the four colorings is produced for each chain. 
For example, here is the RBD of SARS-CoV-2 in complex with the CR3022 antibody:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n",
       "\n",
       "Legend for the amino acids (aa) encoding :\n",
       " \n",
       " For aa not at the interface :\n",
       "   _  if aa is missing\n",
       "   - if aa is present\n",
       " \n",
       " For aa at the interface :\n",
       "   *  if aa is missing\n",
       "   X if consensus aa (= most frequent among the bound structures, and in case of tie the first by alphabetical order)\n",
       "   x  otherwise \n",
       "   (For bound structure files only) : \n",
       "   x or X  if the aa is part of the consensus interface but not part of the interface of this file \n",
       " \n",
       "\n",
       "
MISA SSE for MISA-ID SARS-CoV-2-RBD-bound-to-CR3022_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6yla-E, res :2.42 Å, 33 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------------------------------------------------------------------------------FELLH--------K\n", "\n", "bound-6yla-A, res :2.42 Å, 31 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT---------------__-------------------------------------------------------------------FELLH--------K\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT--------------__--------_______-------____________________-------------_------------FELLH--------K\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------_____-----------------------___________________------------_------------FELLH--------K\n", "\n", "\n", "\n", "
MISA BSA for MISA-ID SARS-CoV-2-RBD-bound-to-CR3022_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Buried Surface Area (BSA) (in Å2) \n", " \n", "\n", "| bound-6yla-E : total bsa = 983.29 Å2 | bound-6yla-A : total bsa = 1079.67 Å2 \n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6yla-E, res :2.42 Å, 33 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------------------------------------------------------------------------------FELLH--------K\n", "\n", "bound-6yla-A, res :2.42 Å, 31 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT---------------__-------------------------------------------------------------------FELLH--------K\n", "\n", "\n", "\n", "
MISA Delta_ASA for MISA-ID SARS-CoV-2-RBD-bound-to-CR3022_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " Bound structures : In light grey residues in bound structure for which miss corresponding ASA values in the unbound structures\n", " Per residue i, delta_ASA = ASA[i] - mean(ASA[i]) (mean is computed using the unbound structures) (in Å2)\n", " \n", " \n", "Unbound structures : Accessible Surface Area (ASA) (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6yla-E, res :2.42 Å, 33 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------------------------------------------------------------------------------FELLH--------K\n", "\n", "bound-6yla-A, res :2.42 Å, 31 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT---------------__-------------------------------------------------------------------FELLH--------K\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT--------------__--------_______-------____________________-------------_------------FELLH--------K\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------_____-----------------------___________________------------_------------FELLH--------K\n", "\n", "\n", "\n", "
MISA B_factor for MISA-ID SARS-CoV-2-RBD-bound-to-CR3022_0
\n", "In dark grey, residues with missing data for coloring\n", "\n", " B-Factor (in Å2) \n", " \n", "\n", "\n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6yla-E, res :2.42 Å, 33 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------------------------------------------------------------------------------FELLH--------K\n", "\n", "bound-6yla-A, res :2.42 Å, 31 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT---------------__-------------------------------------------------------------------FELLH--------K\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT--------------__--------_______-------____________________-------------_------------FELLH--------K\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------_____-----------------------___________________------------_------------FELLH--------K\n", "\n", "\n", "\n", "\n", "
\n", " \n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# IFrame(src='./misa-RBD-IG/RBD-CR3022/MISA/SARS-CoV-2-RBD-bound-to-CR3022_0-demo-misa-2.html', width=\"100%\", height=600)\n", "display(HTML('./misa-RBD-IG/RBD-CR3022/MISA/SARS-CoV-2-RBD-bound-to-CR3022_0-demo-misa-2.html'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2 : mixing selected MISA" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```sbl-misa-mix.py``` can also gather the outputs of different runs of ```sbl-misa.py``` (unlike in the first example, where all the MISA_id came from the same run of ```sbl-misa.py```).\n", "Here is an example, which corresponds to the second figure of the paper, based on the following ```ifile-misa-mix.txt``` :\n", "\n", "Content of ```./misa-RBD-IG/ifile-misa-mix.txt```:\n", "\n", "```\n", "localisation (./misa-RBD-IG/RBD-VHH72/MISA, ./misa-RBD-IG/RBD-ACE2/MISA, ./misa-RBD-IG/RBD-CR3022/MISA, ./misa-RBD-IG/RBD-P2B-2F6/MISA)\n", "\n", "# List of MISA_chain_ids\n", "misa_chain_id (SARS-CoV-2-RBD-bound-to-P2B-2F6_0, SARS-CoV-2-RBD-bound-to-CR3022_0, SARS-CoV-1-RBD-bound-to-VHH72_0, SARS-CoV-2-RBD-bound-to-ACE2_0)\n", "\n", "# List of coloring of interest\n", "coloring (SSE)\n", "```" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-mix.py\n", "Done\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-mix.py')\n", "if not exe: # if exe == None\n", "    print('sbl-misa-mix.py not in your PATH')\n", "prefix = 'demo-mix-2' # Prepended to the name of every output file\n", "mix_ifile = './misa-RBD-IG/ifile-misa-mix.txt' # Specification file \n", "odir = './misa-RBD-IG' # Output directory\n", "verbose = '0'\n", "cmd = [exe, \"-mix_ifile\", mix_ifile, '-prefix', prefix, '-odir', odir, '--verbose', verbose]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "It gives the following output :" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n",
       "Legend for the amino acids (aa) encoding :\n",
       "\n",
       "For aa not at the interface :\n",
       "_  if aa is missing\n",
       "- if aa is present\n",
       "\n",
       "For aa at the interface :\n",
       "*  if aa is missing\n",
       "X if consensus aa (= most frequent among the bound structures, and in case of tie the first by alphabetical order)\n",
       "x  otherwise\n",
       "(For bound structure files only) : \n",
       "x or X  if the aa is part of the consensus interface but not part of the interface of this file )\n",
       "\n",
       "
MISA SSE for MISA-ID SARS-CoV-1-RBD-bound-to-VHH72_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 355--360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490--\n", " | | | | | | | | | | | | | | | \n", "bound-6waq-B, res :2.2 Å, 25 interf res LYNSTFFSTFKC--V-AT------------------GD-VR---------------------------WN-R--------------------------------------------------------------IG---Y\n", "\n", "unbound-closed-5x58-A, res :3.2 Å, 0 interf res LYNSTFFSTFKC--V-AT------------------GD-VR---------------------------WN-R--------------------------------------------------------------IG---Y\n", "\n", "unbound-closed-6crz-C, res :3.3 Å, 0 interf res LYNSTFFSTFKC__*_AT------------------GD-VR---------------------------WN-R--------------------------------------------------------------IG---Y\n", "\n", "\n", "\n", "\n", "
MISA SSE for MISA-ID SARS-CoV-2-RBD-bound-to-ACE2_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6m0j-E, res :2.45 Å, 27 interf res ---------------------------------------------------------R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ--------------------__\n", "\n", "bound-6lzg-B, res :2.5 Å, 39 interf res ---------------------------------------------------------R-DE----------KI--Y-----------------N-----VGG-Y---Y-LF----------------Y-AGS------EGFN-YF-LQSYGFQPTNGVGYQ---------------------_\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res ---------------------------------------------------------R-DE----------KI--Y-----------------N-----**G-Y---Y-**_____-------____*_***______****_YF-LQSYGFQPTN*VGYQ----------------------\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res ---------------------------------------------------------R-DE----------KI--Y-----------------N---__***-Y---Y-LF--------------__*_***______****_*F-LQSYGFQPTN*VGYQ----------------------\n", "\n", "\n", "\n", "\n", "
MISA SSE for MISA-ID SARS-CoV-2-RBD-bound-to-CR3022_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-6yla-E, res :2.42 Å, 33 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------------------------------------------------------------------------------FELLH--------K\n", "\n", "bound-6yla-A, res :2.42 Å, 31 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT---------------__-------------------------------------------------------------------FELLH--------K\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT--------------__--------_______-------____________________-------------_------------FELLH--------K\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res ----------------------LYNS--FSTFKCYGVSPTK-N-L-F---------------R--AP-Q------------DDFT------------_____-----------------------___________________------------_------------FELLH--------K\n", "\n", "\n", "\n", "\n", "
MISA SSE for MISA-ID SARS-CoV-2-RBD-bound-to-P2B-2F6_0
\n", "3-turn helix - 4-turn helix - 5-turn helix - Isolated beta-bridge residue - Extended strand - \n", "Bend - Hydrogen bonded turn - Other - Missing Residue - \n", "\n", "Residue Index 346-350-------360-------370-------380-------390-------400-------410-------420-------430-------440-------450-------460-------470-------480-------490-------500-------510-------520------\n", " | | | | | | | | | | | | | | | | | | | \n", "bound-7bwj-E, res :2.85 Å, 20 interf res R----Y--------------------------------------------------------------------------------------------KVGGNYN-L-----------------T-I---------GVEG----F-LQS--------------------------------__\n", "\n", "unbound-closed-6vxx-A, res :2.8 Å, 0 interf res R----Y--------------------------------------------------------------------------------------------K**GNYN-L--_______-------_*_*_________****___-F-LQS-------_--------------------------\n", "\n", "unbound-closed-6vyb-A, res :3.2 Å, 0 interf res R----Y-------------------------------------------------------------------------------------------_****NYN-L-----------------T_*_________****____F-LQS-------_--------------------------\n", "\n", "\n", "\n", "\n", "
\n", " \n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#IFrame(src='./misa-RBD-IG/demo-mix-2_SSE_SARS-CoV-2-RBD-bound-to-P2B-2F6_0_SARS-CoV-2-RBD-bound-to-CR3022_0_SARS-CoV-1-RBD-bound-to-VHH72_0_SARS-CoV-2-RBD-bound-to-ACE2_0_mixed_figure.html', width=\"100%\", height=600)\n", "display(HTML('./misa-RBD-IG/demo-mix-2_SSE_SARS-CoV-2-RBD-bound-to-P2B-2F6_0_SARS-CoV-2-RBD-bound-to-CR3022_0_SARS-CoV-1-RBD-bound-to-VHH72_0_SARS-CoV-2-RBD-bound-to-ACE2_0_mixed_figure.html'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3 : comparing interfaces and MISA with sbl-misa-diff.py\n", "\n", "Below, we provide three analyses:\n", "\n", "* comparison of the interfaces (RBD bound to P2B-2F6) vs (RBD bound to ACE2)\n", "* comparison of the interfaces (RBD bound to CR3022) vs (RBD bound to ACE2)\n", "* comparison of our Voronoi interface against that of Ju et al, Nature, 2020\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparing two Voronoi interfaces\n", "\n", "```sbl-misa-diff.py``` is used to study how ACE2, CR3022 and P2B-2F6 compete for the SARS-CoV-2 RBD, by comparing the RBD residues involved in each interface.\n", "\n", "We created one ```ifile-misa-diff.txt``` per comparison.\n", "\n", "The output ```.txt``` file is displayed right below the corresponding call to ```sbl-misa-diff.py```.\n", "\n", "#### RBD bound to P2B-2F6 vs RBD bound to ACE2\n", "\n", "This comparison uses the specification file ```./misa-RBD-IG/ifile-misa-diff1.txt``` :\n", "```\n", "(./misa-RBD-IG/RBD-P2B-2F6/MISA/raw-data, 7bwj, E)\n", "(./misa-RBD-IG/RBD-ACE2/MISA/raw-data, 6lzg, B) \n", "```" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-diff.py\n", "Reading spec file\n", "Comparing interfaces\n", "created 
./misa-RBD-IG/comparison-interface-7bwj-E-with-6lzg-B.txt\n", "Done\n", "\n", "\n", "++Showing file misa-RBD-IG/comparison-interface-7bwj-E-with-6lzg-B.txt\n", "Comparison of the Buried Surface Area (BSA) and of the nature of the residues for the interface residues.\n", "(Missing data are denoted by \"NA\")\n", "\n", "10 exclusive residues at the interface of chain 7bwj-E :\n", " BSA-7bwj-E Names-7bwj-E\n", "( , 346, ) 18.18 R\n", "( , 351, ) 8.03 Y\n", "( , 444, ) 33.62 K\n", "( , 448, ) 4.65 N\n", "( , 450, ) 62.33 N\n", "( , 452, ) 37.43 L\n", "( , 470, ) 9.64 T\n", "( , 472, ) 11.69 I\n", "( , 482, ) 0.06 G\n", "( , 483, ) 68.56 V\n", "\n", "29 exclusive residues at the interface of chain 6lzg-B :\n", " BSA-6lzg-B Names-6lzg-B\n", "( , 403, ) 4.97 R\n", "( , 405, ) 11.85 D\n", "( , 406, ) 6.52 E\n", "( , 417, ) 27.07 K\n", "( , 418, ) NA I\n", "( , 421, ) 0.11 Y\n", "( , 439, ) 0.45 N\n", "( , 453, ) 23.53 Y\n", "( , 455, ) 45.13 L\n", "( , 456, ) 48.92 F\n", "( , 473, ) 8.87 Y\n", "( , 475, ) 40.28 A\n", "( , 476, ) 21.62 G\n", "( , 477, ) 2.76 S\n", "( , 486, ) 103.77 F\n", "( , 487, ) 37.95 N\n", "( , 489, ) 85.14 Y\n", "( , 495, ) 3.52 Y\n", "( , 496, ) 32.16 G\n", "( , 497, ) 2.17 F\n", "( , 498, ) 52.86 Q\n", "( , 499, ) 8.32 P\n", "( , 500, ) 111.97 T\n", "( , 501, ) 41.45 N\n", "( , 502, ) 46.29 G\n", "( , 503, ) 29.12 V\n", "( , 504, ) 10.52 G\n", "( , 505, ) 111.15 Y\n", "( , 506, ) 16.32 Q\n", "\n", "10 shared residues :\n", " BSA-7bwj-E BSA-6lzg-B Names-7bwj-E Names-6lzg-B\n", "( , 445, ) 21.02 9.79 V V\n", "( , 446, ) 29.66 12.23 G G\n", "( , 447, ) 11.56 NA G G\n", "( , 449, ) 93.42 33.71 Y Y\n", "( , 484, ) 76.25 15.96 E E\n", "( , 485, ) 16.03 2.37 G G\n", "( , 490, ) 70.56 17.97 F F\n", "( , 492, ) 6.92 2.81 L L\n", "( , 493, ) 9.99 81.33 Q Q\n", "( , 494, ) 9.41 9.23 S S\n", "--Done\n", "\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-diff.py')\n", "if not exe: # if exe == None\n", " print('sbl-misa-diff.py not in your PATH')\n", 
"odir = './misa-RBD-IG' # Output directory\n", "specfile = './misa-RBD-IG/ifile-misa-diff1.txt' # Specification file\n", "cmd = [exe, \"-specfile\", specfile, \"-odir\", odir]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)\n", "\n", "output_file = 'misa-RBD-IG/comparison-interface-7bwj-E-with-6lzg-B.txt'\n", "sblpyt.show_this_text_file(output_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### RBD bound to CR3022 vs RBD bound to ACE2\n", "\n", "It corresponds to the specification file ```./misa-RBD-IG/ifile-misa-diff2.txt``` :\n", "```\n", "(./misa-RBD-IG/RBD-CR3022/MISA/raw-data, 6yla, E)\n", "(./misa-RBD-IG/RBD-ACE2/MISA/raw-data, 6lzg, B) \n", "```" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-diff.py\n", "Reading spec file\n", "Comparing interfaces\n", "created ./misa-RBD-IG/comparison-interface-6yla-E-with-6lzg-B.txt\n", "Done\n", "\n", "\n", "++Showing file misa-RBD-IG/comparison-interface-6yla-E-with-6lzg-B.txt\n", "Comparison of the Buried Surface Area (BSA) and of the nature of the residues for the interface residues.\n", "(Missing data are denoted by \"NA\")\n", "\n", "33 exclusive residues at the interface of chain 6yla-E :\n", " BSA-6yla-E Names-6yla-E\n", "( , 368, ) 0.00 L\n", "( , 369, ) 58.69 Y\n", "( , 370, ) 15.39 N\n", "( , 371, ) 6.78 S\n", "( , 374, ) 15.88 F\n", "( , 375, ) 26.09 S\n", "( , 376, ) 24.00 T\n", "( , 377, ) 56.34 F\n", "( , 378, ) 89.76 K\n", "( , 379, ) 37.19 C\n", "( , 380, ) 41.05 Y\n", "( , 381, ) 74.01 G\n", "( , 382, ) 28.32 V\n", "( , 383, ) 38.63 S\n", "( , 384, ) 23.68 P\n", "( , 385, ) 65.33 T\n", "( , 386, ) 89.28 K\n", "( , 390, ) 27.19 L\n", "( , 392, ) 15.55 F\n", "( , 408, ) 8.49 R\n", "( , 411, ) 5.71 A\n", "( , 412, ) 0.54 P\n", "( , 414, ) 7.69 Q\n", "( , 427, ) 3.36 D\n", "( , 428, ) 75.35 D\n", "( , 429, ) 3.74 F\n", "( , 430, ) 53.27 T\n", "( , 515, ) 10.11 
F\n", "( , 516, ) 8.77 E\n", "( , 517, ) 57.03 L\n", "( , 518, ) 2.44 L\n", "( , 519, ) 11.46 H\n", "( , 528, ) 2.16 K\n", "\n", "39 exclusive residues at the interface of chain 6lzg-B :\n", " BSA-6lzg-B Names-6lzg-B\n", "( , 403, ) 4.97 R\n", "( , 405, ) 11.85 D\n", "( , 406, ) 6.52 E\n", "( , 417, ) 27.07 K\n", "( , 418, ) NA I\n", "( , 421, ) 0.11 Y\n", "( , 439, ) 0.45 N\n", "( , 445, ) 9.79 V\n", "( , 446, ) 12.23 G\n", "( , 447, ) NA G\n", "( , 449, ) 33.71 Y\n", "( , 453, ) 23.53 Y\n", "( , 455, ) 45.13 L\n", "( , 456, ) 48.92 F\n", "( , 473, ) 8.87 Y\n", "( , 475, ) 40.28 A\n", "( , 476, ) 21.62 G\n", "( , 477, ) 2.76 S\n", "( , 484, ) 15.96 E\n", "( , 485, ) 2.37 G\n", "( , 486, ) 103.77 F\n", "( , 487, ) 37.95 N\n", "( , 489, ) 85.14 Y\n", "( , 490, ) 17.97 F\n", "( , 492, ) 2.81 L\n", "( , 493, ) 81.33 Q\n", "( , 494, ) 9.23 S\n", "( , 495, ) 3.52 Y\n", "( , 496, ) 32.16 G\n", "( , 497, ) 2.17 F\n", "( , 498, ) 52.86 Q\n", "( , 499, ) 8.32 P\n", "( , 500, ) 111.97 T\n", "( , 501, ) 41.45 N\n", "( , 502, ) 46.29 G\n", "( , 503, ) 29.12 V\n", "( , 504, ) 10.52 G\n", "( , 505, ) 111.15 Y\n", "( , 506, ) 16.32 Q\n", "\n", "0 shared residues :\n", "Empty DataFrame\n", "Columns: [BSA-6yla-E, BSA-6lzg-B, Names-6yla-E, Names-6lzg-B]\n", "Index: []\n", "--Done\n", "\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-diff.py')\n", "if not exe: # if exe == None\n", " print('sbl-misa-diff.py not in your PATH')\n", "odir = './misa-RBD-IG' # Output directory\n", "specfile = './misa-RBD-IG/ifile-misa-diff2.txt' # Specification file\n", "cmd = [exe, \"-specfile\", specfile, \"-odir\", odir]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)\n", "\n", "output_file = 'misa-RBD-IG/comparison-interface-6yla-E-with-6lzg-B.txt'\n", "sblpyt.show_this_text_file(output_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparing a Voronoi interface with a reference interface" ] }, { "cell_type": "markdown", "metadata": {}, "source": 
[ "Finally, the interface between the P2B-2F6 antibody and the RBD as described in the paper presenting it can be compared with that predicted by the Voronoi model, by providing a description of the paper interface in the following format :\n", "\n", "Content of ```./misa-RBD-IG/SARS-CoV-2-RBD--Ju2020.txt``` :\n", "```\n", "Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020\n", "K444\n", "G446\n", "G447\n", "N448\n", "Y449\n", "N450\n", "L452\n", "V483\n", "E484\n", "G485\n", "F490\n", "S494\n", "```\n", "\n", "and by using the following ```ifile-misa-diff3.txt``` :\n", "\n", "Content of ```./misa-RBD-IG/ifile-misa-diff3.txt```:\n", "```\n", "(./misa-RBD-IG/RBD-P2B-2F6/MISA/raw-data, 7bwj, E)\n", "(./misa-RBD-IG/SARS-CoV-2-RBD--Ju2020.txt) \n", "```\n", "\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running sbl-misa-diff.py\n", "Reading spec file\n", "Reading interface file\n", "Comparing interfaces\n", "created ./misa-RBD-IG/comparison-interface-7bwj-E-with-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020.txt\n", "Done\n", "\n", "\n", "++Showing file ./misa-RBD-IG/comparison-interface-7bwj-E-with-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020.txt\n", "Comparison of the Buried Surface Area (BSA) and of the nature of the residues for the interface residues.\n", "(Missing data are denoted by \"NA\")\n", "\n", "8 exclusive residues at the interface of chain 7bwj-E :\n", " BSA-7bwj-E Names-7bwj-E\n", "( , 346, ) 18.18 R\n", "( , 351, ) 8.03 Y\n", "( , 445, ) 21.02 V\n", "( , 470, ) 9.64 T\n", "( , 472, ) 11.69 I\n", "( , 482, ) 0.06 G\n", "( , 492, ) 6.92 L\n", "( , 493, ) 9.99 Q\n", "\n", "0 exclusive residues at the interface of chain Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020 :\n", "Empty DataFrame\n", "Columns: [BSA-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020, Names-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020]\n", "Index: []\n", "\n", "12 shared residues :\n", " BSA-7bwj-E 
BSA-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020 Names-7bwj-E Names-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020\n", "( , 444, ) 33.62 NA K NA\n", "( , 446, ) 29.66 NA G NA\n", "( , 447, ) 11.56 NA G NA\n", "( , 448, ) 4.65 NA N NA\n", "( , 449, ) 93.42 NA Y NA\n", "( , 450, ) 62.33 NA N NA\n", "( , 452, ) 37.43 NA L NA\n", "( , 483, ) 68.56 NA V NA\n", "( , 484, ) 76.25 NA E NA\n", "( , 485, ) 16.03 NA G NA\n", "( , 490, ) 70.56 NA F NA\n", "( , 494, ) 9.41 NA S NA\n", "--Done\n", "\n", "\n" ] } ], "source": [ "exe = shutil.which('sbl-misa-diff.py')\n", "if not exe: # if exe == None\n", "    print('sbl-misa-diff.py not in your PATH')\n", "odir = './misa-RBD-IG' # Output directory\n", "specfile = './misa-RBD-IG/ifile-misa-diff3.txt' # Specification file\n", "cmd = [exe, \"-specfile\", specfile, \"-odir\", odir]\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "print(s)\n", "\n", "output_file = './misa-RBD-IG/comparison-interface-7bwj-E-with-Ju-interface-RBD-CoV2-IGP2B-2F6-Nature-2020.txt'\n", "sblpyt.show_this_text_file(output_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 3 : Comparing aligned proteins\n", "\n", "```sbl-misa.py``` can also compare interfaces across structures whose sequences have been realigned, with the alignment either supplied by the user or computed by ```ClustalOmega``` (which must then be installed on your computer).\n", "\n", "A user-supplied alignment must be in the ```.aln``` format, which is the output format of ```ClustalOmega```. There must be one file per MISA id to be realigned, provided in the ```adir``` directory, and this file must contain exactly one line per chain of the MISA id. The name associated with each chain must be: [```tag```]-[```PDBID```]-[```one-letter chain_identifier```], where the tag is the same as the one provided in the ```ifile.txt```, so that the program can match the alignment with the chain specifications.\n", "\n", "Here is an example with the RBD of SARS-CoV-1 and SARS-CoV-2. 
No alignment is provided, so the program computes it itself.\n", "\n", "\n", "The associated specification file is as follows, with the same specification rules as for ```ifile-misa.txt```: \n", "\n", "Content of ```./misa-RBD-ACE2-cmp/ifile-misa-align.txt```:\n", "\n", "```\n", "# Windows for SARS-CoV-1-ACE2\n", "[SARS-CoV-1-ACE2_0 (19, 83) (321,393)]\n", "\n", "./pdb/2ajf.pdb (A, E, SARS-CoV-RBD-aligned, bound-CoV-1) (B, A, SARS-CoV-1-ACE2, bound)\n", "./pdb/2ajf.pdb (A, F, SARS-CoV-RBD-aligned, bound-CoV-1) (B, B, SARS-CoV-1-ACE2, bound)\n", "./pdb/5x58.pdb (A, A, SARS-CoV-RBD-aligned, unbound-CoV-1) \n", "./pdb/6crz.pdb (A, C, SARS-CoV-RBD-aligned, unbound-CoV-1)\n", "./pdb/6vxx.pdb (A, A, SARS-CoV-RBD-aligned, unbound-CoV-2)\n", "./pdb/6vyb.pdb (A, A, SARS-CoV-RBD-aligned, unbound-CoV-2) \n", "./pdb/6m0j.pdb (A, E, SARS-CoV-RBD-aligned, bound-CoV-2) (D, A, SARS-CoV-2-ACE2, bound)\n", "./pdb/6lzg.pdb (A, B, SARS-CoV-RBD-aligned, bound-CoV-2) (D, A, SARS-CoV-2-ACE2, bound)\n", "```" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running /home/stephane/StageM1/dev2/sbl/Applications/Multiple_interface_string_alignment/python/sbl-misa.py -ifile ./misa-RBD-ACE2-cmp/ifile-misa-align.txt -prefix_dir ./misa-RBD-ACE2-cmp -prefix demo-misa-1-align --verbose 0 -normalize_b_factor 2 -to_align SARS-CoV-RBD-aligned_0 -adir /input-data/MSA\n", "\n", "Done\n" ] } ], "source": [ "exe = shutil.which('sbl-misa.py')\n", "if not exe: # if exe == None\n", "    print('sbl-misa.py not in your PATH')\n", "ifile = './misa-RBD-ACE2-cmp/ifile-misa-align.txt' # Specification file\n", "prefix_dir = './misa-RBD-ACE2-cmp' # Prepended to every input and output directory\n", "prefix = 'demo-misa-1-align' # Prepended to the name of every output file\n", "to_align = 'SARS-CoV-RBD-aligned_0' # MISA_id to realign\n", "adir = '/input-data/MSA' # Where to store/read the alignment files\n", 
"verbose = '0'\n", "normalize_b_factor = '2' # Normalize with respect to the displayed residues only\n", "# Every option name needs its leading dash, including -adir, to match the command printed below\n", "cmd = [exe, \"-ifile\", ifile, \"-prefix_dir\", prefix_dir, '-prefix', prefix, '--verbose', verbose, '-normalize_b_factor', normalize_b_factor, '-to_align', to_align, '-adir', adir]\n", "print('Running %s -ifile %s -prefix_dir %s -prefix %s --verbose %s -normalize_b_factor %s -to_align %s -adir %s' % (exe, ifile, prefix_dir, prefix, verbose, normalize_b_factor, to_align, adir))\n", "s = subprocess.check_output(cmd, encoding='UTF-8')\n", "#print(s)\n", "print('\\nDone')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }