diff --git a/notebooks/CapeTown_Genomics_Tutorial_partI.ipynb b/notebooks/CapeTown_Genomics_Tutorial_partI.ipynb
index f35cf2fb4dcd8fe83e3be74e1b60572fef3715fe..016f3ee156a87322702c3ad9ffaab270ca7ca026 100644
--- a/notebooks/CapeTown_Genomics_Tutorial_partI.ipynb
+++ b/notebooks/CapeTown_Genomics_Tutorial_partI.ipynb
@@ -17,11 +17,13 @@
    "source": [
     "## 0. Getting started\n",
     "### How to start the jupyter notebook\n",
-    "1. Access the cloud: ssh student01@86.119.40.206\n",
+    "1. Access the cloud: ssh studentXX@86.119.40.206\n",
     "2. Your password is: stphcourse2018\n",
-    "3. cp -r ../Workshop_SA.git\n",
-    "4. singularity exec ../container.img jupyter notebook --no-browser --ip='*' --port=YourPortNumber eg.30000\n",
-    "5. Type in the browser: http://86.119.40.206:YourPortNumber/?token=c0669c145a630ea14b6ec3b29b870811844fefe12c375feb\n"
+    "3. Copy the workshop folder to your home directory: cp -r /home/Workshop_SA/ .\n",
+    "4. In your home directory, type: singularity exec /home/container.img jupyter notebook --no-browser --ip='*' --port=YourPortNumber (e.g. 30000)\n",
+    "\n",
+    "\n",
+    "If you wish to browse the git repository in your web browser, the URL is: https://git.scicore.unibas.ch/TBRU/Workshop_SA"
    ]
   },
   {
@@ -39,7 +41,7 @@
     "- It is your bioinformatics 'lab book'.\n",
     "\n",
     "### Useful tips to use in the jupyter notebook\n",
-    "- Run the command in the 'code cell': Shift + Return\n",
+    "- Run the command in the 'code cell': Shift + Enter\n",
     "- You can change the cell type from Code to Markdown to include explanatory text in your notebook\n",
     "- Use the \"tab\" key to autocomplement commands\n",
     "- https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/\n",
@@ -53,19 +55,47 @@
     "### Magics\n",
     "Taken from: https://blog.dominodatalab.com/lesser-known-ways-of-using-notebooks/\n",
     "\n",
-    "You can start notebooks with different kernels (e.g., R, Julia) — not just Python. What you might not know is that even within a notebook, you can run different types of code in different cells. With \"magics\", it is possible to use different languages \n",
+    "You can start notebooks with different kernels (e.g., R, Shell) — not just Python. What you might not know is that even within a notebook, you can run different types of code in different cells. With \"magics\", it is possible to use different languages \n",
     "By running % lsmagic in a cell you get a list of all the available magics. You can use % to start a single-line expression to run with the magics command. Or you can use a double %% to run a multi-line expression.\n",
     "\n",
     "Some of my favorites are:\n",
     "\n",
-    "!: to run a shell command.\n",
-    "% bash to run cell with bash in a subprocess.\n",
+    "    ! to run a shell command.\n",
+    "\n",
+    "    %%bash to run the cell with bash in a subprocess.\n",
     "\n",
     "### Using shell commands\n",
     "\n",
     "Any command that works at the command-line can be used in IPython by prefixing it with the ! character. For example, the ls, pwd, and echo commands can be run as follows:\n"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "/scicore/home/gagneux/loiseau/Workshop_SA/notebooks\n",
+      "The files in my working directory are:\n",
+      "adapters\t\t\t\t  Drug_resistance_mutations_MTBC.txt\n",
+      "annotation\t\t\t\t  images\n",
+      "CapeTown_Genomics_Tutorial_partIII.ipynb  Locus_to_exclude_Mtb.txt\n",
+      "CapeTown_Genomics_Tutorial_partII.ipynb   reference_genome\n",
+      "CapeTown_Genomics_Tutorial_partI.ipynb\t  slurm_scripts\n"
+     ]
+    }
+   ],
+   "source": [
+    "! pwd\n",
+    "! echo 'The files in my working directory are:'\n",
+    "! ls"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -75,19 +105,19 @@
     " - Perform essential steps of a Illumina whole-genome sequencing analysis pipeline of MTBC genomes.\n",
     "\n",
     "## Content of this tutorial:\n",
-    "- Finding genetic variants from raw sequencing data:\n",
-    "    - Looking into a fastq file: reads, Phred Quality scores\n",
-    "    - Raw read processing and quality assessment\n",
+    "- **Finding genetic variants from raw sequencing data**:\n",
+    "    - Looking into a fastq file: quality assessment of the reads\n",
+    "    - Raw read processing: trimming of Illumina adapters and low-quality bases\n",
     "    - Mapping processed reads to a reference genome (creation of a BAM file)\n",
     "    - BAM post-processing \n",
     "    - BAM quality assesment\n",
     "    - Variant identification (creation of a VCF file)\n",
     "    - Variant Annotation\n",
-    "<img src=\"images/Pipeline1.png\" width=\"500\">\n",
+    "<img src=\"images/Pipeline1.png\" width=\"600\">\n",
     "- You want to find genetic variants (SNPs, insertion, deletions) in these sequences.\n",
     "- To do so, you need to perform the following bioinformatics steps:\n",
     "\n",
-    "<img src=\"images/Pipeline2.png\" width=\"500\">\n",
+    "<img src=\"images/Pipeline2.png\" width=\"600\">\n",
     "\n"
    ]
   },
@@ -114,7 +144,7 @@
     "    Forward read: ~/Workshop_SA/data_Eldholm/ERR760779_1.fastq.gz\n",
     "    Reverse read: ~/Workshop_SA/data_Eldholm/ERR760779_2.fastq.gz\n",
     "    \n",
-    "The fastq files are compressed (.gz) to save space. Let's have a look at the first read of the file.\n",
+    "The fastq files are compressed (.gz) to save space. Let's have a look at the first read of the file using zcat.\n",
     "For this, read the first 4 lines of the file:"
    ]
   },
@@ -222,7 +252,11 @@
    "metadata": {},
    "source": [
     "Go back to the terminal and run from the command line type:\n",
-    "    - sbatch ~/Workshop_SA/notebooks/slurm_scripts/launch_fastqc.slurm"
+    "    - sbatch ~/Workshop_SA/notebooks/slurm_scripts/launch_fastqc.slurm\n",
+    "    \n",
+    "For this you will have to open a new terminal window and reconnect:\n",
+    "    - ssh studentXX@86.119.40.206\n",
+    "    - password: stphcourse2018"
    ]
   },
   {
@@ -244,11 +278,11 @@
     "    - an html which you can visualise using firefox for example\n",
     "    - a compressed folder (.zip). You can see the content of this folder by using the command 'unzip'. \n",
     "    \n",
-    "You can visualise the html file using firefox. \n",
+    "To visualise the html file, open a new terminal in MobaXterm and type:\n",
+    "    - scp studentXX@86.119.40.206:/home/studentXX/ERR760779_1_fastqc.html Desktop\n",
     "\n",
-    "From the terminal, type:\n",
-    "    \n",
-    "    firefox ERR760779_1_fastqc.html"
+    "\n",
+    "The html file is now on the desktop of your local computer. Double-click it to open it."
    ]
   },
   {
@@ -342,10 +376,10 @@
    "metadata": {},
    "source": [
     "### Exercise: \n",
-    "- How many reads were dropped by Trimmomatic ? \n",
-    "- Why are complete reads dropped ? \n",
-    "- What is the percentage of reads we will find in the files ERR760779_**1P**.trimmed.fastq.gz and ERR760779_**2P**.trimmed.fastq.gz ?\n",
-    "- What is the percentage of reads we will find in the files ERR760779_**1U**.trimmed.fastq.gz and ERR760779_**2U**.trimmed.fastq.gz ?"
+    "- How many reads were dropped by Trimmomatic ? ................\n",
+    "- Why are complete reads dropped ? ................\n",
+    "- What is the percentage of reads we will find in the files ERR760779_**1P**.trimmed.fastq.gz and ERR760779_**2P**.trimmed.fastq.gz ? ................\n",
+    "- What is the percentage of reads we will find in the files ERR760779_**1U**.trimmed.fastq.gz and ERR760779_**2U**.trimmed.fastq.gz ? ................"
    ]
   },
   {
@@ -389,9 +423,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "As before, to visualise the html file produce, open it with firefox:\n",
-    "    - firefox ERR760779_1P.html"
+    "As before, to visualise the html file produced:\n",
+    "    - scp studentXX@86.119.40.206:/home/studentXX/ERR760779_1P.trimmed.html Desktop"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
diff --git a/notebooks/Locus_to_exclude_Mtb.txt b/notebooks/Locus_to_exclude_Mtb.txt
new file mode 100755
index 0000000000000000000000000000000000000000..a66d26a19a68837274535d5796695adc15a6869e
--- /dev/null
+++ b/notebooks/Locus_to_exclude_Mtb.txt
@@ -0,0 +1,506 @@
+Chrom	ChromStart	ChromEnd	locus tag	Comment	
+NC_000962	23182	23269	IG18_Rv0018c-Rv0019c	
+NC_000962	33582	33794	Rv0031	remnant of A transposase
+NC_000962	80194	80623	IG71_Rv0071-Rv0072	
+NC_000962	103710	104663	Rv0094c	50bp_duplicated	
+NC_000962	104663	104805	IG_Rv0094c-Rv0095c	
+NC_000962	104805	105215	Rv0095c	50bp_duplicated	
+NC_000962	105215	105324	IG_Rv0095c-Rv0096	
+NC_000962	105324	106715	Rv0096	PPE family protein
+NC_000962	131382	132872	Rv0109	PE-PGRS family protein
+NC_000962	149533	150996	Rv0124	PE-PGRS family protein
+NC_000962	154130	154231	IG127_Rv0126-Rv0127	
+NC_000962	177543	179309	Rv0151c	PE family protein	
+NC_000962	179309	179319	IG_Rv0151c-Rv0152c	
+NC_000962	179319	180896	Rv0152c	PE family protein	
+NC_000962	187433	188839	Rv0159c	PE family protein	
+NC_000962	188839	188931	IG_Rv0159c-Rv0160c	
+NC_000962	188931	190439	Rv0160c	PE family protein	
+NC_000962	307877	309547	Rv0256c	PPE family protein	
+NC_000962	309547	309699	IG_Rv0256c-Rv0257	
+NC_000962	309699	310073	Rv0257	50bp_duplicated
+NC_000962	332708	333136	Rv0277c	50bp_duplicated	
+NC_000962	333136	333437	IG_Rv0277c-Rv0278c	
+NC_000962	333437	336310	Rv0278c	PE-PGRS family protein	
+NC_000962	336310	336560	IG_Rv0278c-Rv0279c	
+NC_000962	336560	339073	Rv0279c	PE-PGRS family protein	
+NC_000962	339073	339364	IG_Rv0279c-Rv0280	
+NC_000962	339364	340974	Rv0280	PPE family protein
+NC_000962	349624	349932	Rv0285	PE family protein
+NC_000962	349935	351476	Rv0286	PPE family protein
+NC_000962	361334	363109	Rv0297	PE-PGRS family protein
+NC_000962	366150	372764	Rv0304c	PPE family protein	
+NC_000962	372764	372820	IG_Rv0304c-Rv0305c	
+NC_000962	372820	375711	Rv0305c	PPE family protein	
+NC_000962	399535	400050	Rv0335c	PE family protein	
+NC_000962	400050	400192	IG_Rv0335c-Rv0336	
+NC_000962	400192	401703	Rv0336	50bp_duplicated
+NC_000962	423639	424019	Rv0353	50bp_duplicated
+NC_000962	424019	424269	IG_Rv0353-Rv0354c	
+NC_000962	424269	424694	Rv0354c	PPE family protein	
+NC_000962	424694	424777	IG_Rv0354c-Rv0355c	
+NC_000962	424777	434679	Rv0355c	PPE family protein	
+NC_000962	466672	467406	Rv0387c	PPE family protein	
+NC_000962	467406	467459	IG_Rv0387c-Rv0388c	
+NC_000962	467459	468001	Rv0388c	PPE family protein	
+NC_000962	472781	474106	Rv0393	50bp_duplicated
+NC_000962	475816	476184	Rv0397	50bp_duplicated
+NC_000962	530751	532214	Rv0442c	PPE family protein	
+NC_000962	543174	544730	Rv0453	PPE family protein
+NC_000962	576787	577338	Rv0487	50bp_duplicated
+NC_000962	579349	580581	Rv0490	50bp_duplicated
+NC_000962	606551	608062	Rv0515	50bp_duplicated
+NC_000962	616832	616845	IG533_Rv0525-Rv0526	
+NC_000962	622793	624577	Rv0532	PE-PGRS family protein
+NC_000962	630040	631686	Rv0538	50bp_duplicated
+NC_000962	642812	642888	IG559_Rv0551c-Rv0552	
+NC_000962	671996	675916	Rv0578c	PE-PGRS family protein	
+NC_000962	701406	702014	Rv0605	repeat_region
+NC_000962	706930	706947	IG622_Rv0612-Rv0613c	
+NC_000962	831776	832303	Rv0740	50bp_duplicated
+NC_000962	832303	832534	IG_Rv0740-Rv0741	
+NC_000962	832534	832848	Rv0741	transposase
+NC_000962	832848	832981	IG_Rv0741-Rv0742	
+NC_000962	832981	833508	Rv0742	PE-PGRS family protein
+NC_000962	835701	838052	Rv0746	PE-PGRS family protein
+NC_000962	838052	838451	IG_Rv0746-Rv0747	
+NC_000962	838451	840856	Rv0747	PE-PGRS family protein
+NC_000962	842033	842278	Rv0750	50bp_duplicated
+NC_000962	846159	847913	Rv0754	PE-PGRS family protein
+NC_000962	847913	850527	IG_Rv0754-Rv0755A	
+NC_000962	848103	850040	Rv0755c	PPE family protein	
+NC_000962	850342	850527	Rv0755A	transposase	
+NC_000962	863159	863255	IG784_Rv0769-Rv0770	
+NC_000962	889072	889398	Rv0795	transposase IS6110
+NC_000962	889347	890333	Rv0796	transposase IS6110
+NC_000962	889347	889398	IG_Rv0795-Rv0796	
+NC_000962	890333	890388	IG_Rv0796-Rv0797	
+NC_000962	890388	891482	Rv0797	50bp_duplicated
+NC_000962	908181	908483	Rv0814c	50bp_duplicated	
+NC_000962	916477	917646	Rv0823c	50bp_duplicated	
+NC_000962	921575	921865	Rv0829	50bp_duplicated
+NC_000962	924951	925364	Rv0832	PE-PGRS family protein
+NC_000962	925361	927610	Rv0833	PE-PGRS family protein
+NC_000962	927610	927837	IG_Rv0833-Rv0834c	
+NC_000962	927837	930485	Rv0834c	PE-PGRS family protein	
+NC_000962	947312	947644	Rv0850	transposase
+NC_000962	960152	960341	IG877_Rv0861c-Rv0862c	
+NC_000962	964312	965535	Rv0867c	50bp_duplicated	
+NC_000962	968424	970244	Rv0872c	PE-PGRS family protein	
+NC_000962	976872	978203	Rv0878c	PPE family protein	
+NC_000962	1020058	1021329	Rv0915c	PPE family protein	
+NC_000962	1021329	1021344	IG_Rv0915c-Rv0916c	
+NC_000962	1021344	1021643	Rv0916c	PE family protein	
+NC_000962	1025497	1026816	Rv0920c	transposase	
+NC_000962	1026816	1027104	IG_Rv0920c-Rv0921	
+NC_000962	1027104	1027685	Rv0921	resolvase
+NC_000962	1027685	1029337	Rv0922	transposase
+NC_000962	1090373	1093144	Rv0977	PE-PGRS family protein
+NC_000962	1093144	1093361	IG_Rv0977-Rv0978c	
+NC_000962	1093361	1094356	Rv0978c	PE-PGRS family protein	
+NC_000962	1095078	1096451	Rv0980c	PE-PGRS family protein	
+NC_000962	1158918	1159307	Rv1034c	transposase	
+NC_000962	1159307	1159375	IG_Rv1034c-Rv1035c	
+NC_000962	1159375	1160061	Rv1035c	transposase	
+NC_000962	1160061	1160095	IG_Rv1035c-Rv1036c	
+NC_000962	1160095	1160433	Rv1036c	truncated IS1560 transposase	
+NC_000962	1160433	1160544	IG_Rv1036c-Rv1037c	
+NC_000962	1160544	1160828	Rv1037c	50bp_duplicated	
+NC_000962	1160828	1160855	IG_Rv1037c-Rv1038c	
+NC_000962	1160855	1161151	Rv1038c	50bp_duplicated	
+NC_000962	1161151	1161297	IG_Rv1038c-Rv1039c	
+NC_000962	1161297	1162472	Rv1039c	PPE family protein	
+NC_000962	1162472	1162549	IG_Rv1039c-Rv1040c	
+NC_000962	1162549	1163376	Rv1040c	PE family protein	
+NC_000962	1163376	1164572	IG_Rv1040c-Rv1041c	
+NC_000962	1164572	1165435	Rv1041c	IS like-2 transposase	
+NC_000962	1165092	1165499	Rv1042c	IS like-2 transposase	
+NC_000962	1169423	1170670	Rv1047	transposase
+NC_000962	1188421	1190424	Rv1067c	PE-PGRS family protein	
+NC_000962	1190424	1190757	IG_Rv1067c-Rv1068c	
+NC_000962	1190757	1192148	Rv1068c	PE-PGRS family protein	
+NC_000962	1211560	1213863	Rv1087	PE-PGRS family protein
+NC_000962	1213863	1214513	IG_Rv1087-Rv1088	
+NC_000962	1214513	1214947	Rv1088	PE family protein
+NC_000962	1214769	1215131	Rv1089	PE family protein
+NC_000962	1216469	1219030	Rv1091	PE-PGRS family protein
+NC_000962	1251617	1252972	Rv1128c	repeat_region	
+NC_000962	1262272	1264128	Rv1135c	PPE family protein	
+NC_000962	1276300	1277748	Rv1148c	50bp_duplicated	
+NC_000962	1277748	1277893	IG_Rv1148c-Rv1149	
+NC_000962	1277893	1278300	Rv1149	transposase
+NC_000962	1278269	1278820	Rv1150	Possible fragment of transposase 
+NC_000962	1298764	1299804	Rv1168c	PPE family protein	
+NC_000962	1299804	1299822	IG_Rv1168c-Rv1169c	
+NC_000962	1299822	1300124	Rv1169c	PE family protein	
+NC_000962	1301755	1302681	Rv1172c	PE family protein	
+NC_000962	1306002	1306201	IG1195_Rv1174c-Rv1175c	
+NC_000962	1339003	1339302	Rv1195	PE family protein
+NC_000962	1339302	1339349	IG_Rv1195-Rv1196	
+NC_000962	1339349	1340524	Rv1196	PPE family protein
+NC_000962	1340524	1340659	IG_Rv1196-Rv1197	
+NC_000962	1340659	1340955	Rv1197	50bp_duplicated
+NC_000962	1340955	1341006	IG_Rv1197-Rv1198	
+NC_000962	1341006	1341290	Rv1198	50bp_duplicated
+NC_000962	1341290	1341358	IG_Rv1198-Rv1199c	
+NC_000962	1341358	1342605	Rv1199c	transposase	
+NC_000962	1357293	1357625	Rv1214c	PE family protein	
+NC_000962	1384989	1386677	Rv1243c	PE-PGRS family protein	
+NC_000962	1441348	1442718	Rv1288	50bp_duplicated
+NC_000962	1450697	1451779	Rv1295	50bp_duplicated
+NC_000962	1468171	1469505	Rv1313c	transposase	
+NC_000962	1479199	1480824	Rv1318c	50bp_duplicated	
+NC_000962	1480824	1480894	IG_Rv1318c-Rv1319c	
+NC_000962	1480894	1482501	Rv1319c	50bp_duplicated	
+NC_000962	1488154	1489965	Rv1325c	PE-PGRS family protein	
+NC_000962	1532443	1533633	Rv1361c	PPE family protein	
+NC_000962	1541994	1542980	Rv1369c	transposase	
+NC_000962	1542929	1543255	Rv1370c	transposase	
+NC_000962	1561464	1561772	Rv1386	PE family protein
+NC_000962	1561769	1563388	Rv1387	PPE family protein
+NC_000962	1572127	1573857	Rv1396c	PE-PGRS family protein	
+NC_000962	1606386	1607972	Rv1430	PE family protein
+NC_000962	1618209	1619684	Rv1441c	PE-PGRS family protein	
+NC_000962	1630638	1634627	Rv1450c	PE-PGRS family protein	
+NC_000962	1636004	1638229	Rv1452c	PE-PGRS family protein	
+NC_000962	1643319	1644260	Rv1458c	50bp_duplicated	
+NC_000962	1655609	1656721	Rv1468c	PE-PGRS family protein	
+NC_000962	1678942	1679172	Rv1489A	50bp_duplicated	
+NC_000962	1684005	1686257	Rv1493	50bp_duplicated
+NC_000962	1751297	1753333	Rv1548c	PPE family protein	
+NC_000962	1761744	1762937	Rv1557	repeat_region
+NC_000962	1762937	1762947	IG_Rv1557-Rv1558	
+NC_000962	1762947	1763393	Rv1558	repeat_region
+NC_000962	1779194	1779298	Rv1572c	repeat_region	
+NC_000962	1779298	1779314	IG_Rv1572c-Rv1573	
+NC_000962	1779314	1779724	Rv1573	phiRV1 phage protein
+NC_000962	1779724	1779930	IG_Rv1573-Rv1574	
+NC_000962	1779930	1780241	Rv1574	repeat_region
+NC_000962	1779930	1780241	Rv1574	phiRV1 phage related protein
+NC_000962	1780199	1780699	Rv1575	repeat_region
+NC_000962	1780199	1780699	Rv1575	phiRV1 phage protein
+NC_000962	1780643	1782064	Rv1576c	phiRV1 phage protein	
+NC_000962	1782064	1782072	IG_Rv1576c-Rv1577c	
+NC_000962	1782072	1782584	Rv1577c	phiRv1 phage protein	
+NC_000962	1782584	1782758	IG_Rv1577c-Rv1578c	
+NC_000962	1782758	1783228	Rv1578c	phiRv1 phage protein	
+NC_000962	1783228	1783309	IG_Rv1578c-Rv1579c	
+NC_000962	1783309	1783623	Rv1579c	phiRv1 phage protein	
+NC_000962	1783620	1783892	Rv1580c	phiRv1 phage protein	
+NC_000962	1783892	1783906	IG_Rv1580c-Rv1581c	
+NC_000962	1783906	1784301	Rv1581c	phiRv1 phage protein	
+NC_000962	1784301	1784497	IG_Rv1581c-Rv1582c	
+NC_000962	1784497	1785912	Rv1582c	phiRv1 phage protein	
+NC_000962	1785912	1786310	Rv1583c	phiRv1 phage protein	
+NC_000962	1786307	1786528	Rv1584c	phiRv1 phage protein	
+NC_000962	1786528	1786584	IG_Rv1584c-Rv1585c	
+NC_000962	1786584	1787099	Rv1585c	phiRv1 phage protein	
+NC_000962	1787096	1788505	Rv1586c	phiRv1 integrase	
+NC_000962	1788162	1789163	Rv1587c	REP13E12 repeat-containing protein	
+NC_000962	1789163	1789168	IG_Rv1587c-Rv1588c	
+NC_000962	1789168	1789836	Rv1588c	REP13E12 repeat-containing protein	
+NC_000962	1855764	1856696	Rv1646	PE family protein
+NC_000962	1862347	1865382	Rv1651c	PE-PGRS family protein	
+NC_000962	1907321	1907593	IG1711_Rv1682-Rv1683	
+NC_000962	1927211	1928575	Rv1702c	repeat_region	
+NC_000962	1931497	1932654	Rv1705c	PPE family protein	
+NC_000962	1932654	1932694	IG_Rv1705c-Rv1706c	
+NC_000962	1932694	1933878	Rv1706c	PPE family protein	
+NC_000962	1981614	1984775	Rv1753c	PPE family protein	
+NC_000962	1987745	1988731	Rv1756c	putative transposase	
+NC_000962	1988680	1989006	Rv1757c	putative transposase	
+NC_000962	1989006	1989042	IG_Rv1757c-Rv1758	
+NC_000962	1989042	1989566	Rv1758	putative transposase
+NC_000962	1989566	1989833	IG_Rv1758-Rv1759c	
+NC_000962	1989833	1992577	Rv1759c	PE-PGRS family protein	
+NC_000962	1996152	1996478	Rv1763	putative transposase
+NC_000962	1996427	1997413	Rv1764	putative transposase
+NC_000962	1997413	1997418	IG_Rv1764-Rv1765c	
+NC_000962	1997418	1998515	Rv1765c	50bp_duplicated	
+NC_000962	1998515	1999142	IG_Rv1765c-Rv1765A	
+NC_000962	1999142	1999357	Rv1765A	transposase	
+NC_000962	2000614	2002470	Rv1768	PE-PGRS family protein
+NC_000962	2025301	2026398	Rv1787	PPE family protein
+NC_000962	2026398	2026477	IG_Rv1787-Rv1788	
+NC_000962	2026477	2026776	Rv1788	PE family protein
+NC_000962	2026776	2026790	IG_Rv1788-Rv1789	
+NC_000962	2026790	2027971	Rv1789	PPE family protein
+NC_000962	2027971	2028425	IG_Rv1789-Rv1790	
+NC_000962	2028425	2029477	Rv1790	PPE family protein
+NC_000962	2029477	2029904	IG_Rv1790-Rv1791	
+NC_000962	2029904	2030203	Rv1791	PE family protein
+NC_000962	2030694	2030978	Rv1793	50bp_duplicated
+NC_000962	2039453	2041420	Rv1800	PPE family protein
+NC_000962	2041420	2042001	IG_Rv1800-Rv1801	
+NC_000962	2042001	2043272	Rv1801	PPE family protein
+NC_000962	2043272	2043384	IG_Rv1801-Rv1802	
+NC_000962	2043384	2044775	Rv1802	PPE family protein
+NC_000962	2044775	2044923	IG_Rv1802-Rv1803c	
+NC_000962	2044923	2046842	Rv1803c	PE-PGRS family protein	
+NC_000962	2048072	2048371	Rv1806	PE family protein
+NC_000962	2048371	2048398	IG_Rv1806-Rv1807	
+NC_000962	2048398	2049597	Rv1807	PPE family protein
+NC_000962	2049597	2049921	IG_Rv1807-Rv1808	
+NC_000962	2049921	2051150	Rv1808	PPE family protein
+NC_000962	2051150	2051282	IG_Rv1808-Rv1809	
+NC_000962	2051282	2052688	Rv1809	PPE family protein
+NC_000962	2061178	2062674	Rv1818c	PE-PGRS family protein	
+NC_000962	2073943	2074437	Rv1829	50bp_duplicated
+NC_000962	2087971	2089518	Rv1840c	PE-PGRS family protein	
+NC_000962	2156706	2157299	Rv1910c	50bp_duplicated	
+NC_000962	2157299	2157382	IG_Rv1910c-Rv1911c	
+NC_000962	2157382	2157987	Rv1911c	50bp_duplicated	
+NC_000962	2162932	2167311	Rv1917c	PPE family protein	
+NC_000962	2167311	2167649	IG_Rv1917c-Rv1918c	
+NC_000962	2167649	2170612	Rv1918c	PPE family protein	
+NC_000962	2195989	2197353	Rv1945	repeat_region
+NC_000962	2226244	2227920	Rv1983	PE-PGRS family protein
+NC_000962	2260665	2261144	Rv2013	transposase
+NC_000962	2261098	2261688	Rv2014	transposase
+NC_000962	2261688	2261816	IG_Rv2014-Rv2015c	
+NC_000962	2261816	2263072	Rv2015c	50bp_duplicated	
+NC_000962	2294531	2306986	Rv2048c	50bp_duplicated	
+NC_000962	2338709	2340874	Rv2082	50bp_duplicated
+NC_000962	2343027	2343332	Rv2085	repeat_region
+NC_000962	2347373	2348554	Rv2090	50bp_duplicated
+NC_000962	2356729	2358206	Rv2098c	PE-PGRS family protein
+NC_000962	2365465	2365791	Rv2105	transposase
+NC_000962	2365740	2366726	Rv2106	transposase
+NC_000962	2366726	2367359	IG_Rv2106-Rv2107	
+NC_000962	2367359	2367655	Rv2107	PE family protein
+NC_000962	2367655	2367711	IG_Rv2107-Rv2108	
+NC_000962	2367711	2368442	Rv2108	PPE family protein
+NC_000962	2370905	2372569	Rv2112c	50bp_duplicated	
+NC_000962	2381071	2382492	Rv2123	PPE family protein
+NC_000962	2387202	2387972	Rv2126c	PE-PGRS family protein	
+NC_000962	2423240	2424838	Rv2162c	PE-PGRS family protein	
+NC_000962	2430159	2431145	Rv2167c	transposase	
+NC_000962	2431094	2431420	Rv2168c	transposase	
+NC_000962	2439282	2439947	Rv2177c	transposase	
+NC_000962	2459678	2461327	Rv2196	50bp_duplicated
+NC_000962	2530836	2531897	Rv2258c	50bp_duplicated	
+NC_000962	2549124	2550029	Rv2277c	50bp_duplicated	
+NC_000962	2550029	2550065	IG_Rv2277c-Rv2278	
+NC_000962	2550065	2550391	Rv2278	transposase
+NC_000962	2550340	2551326	Rv2279	transposase
+NC_000962	2600731	2601879	Rv2328	PE family protein
+NC_000962	2617667	2618908	Rv2340c	PE-PGRS family protein	
+NC_000962	2625888	2626172	Rv2346c	50bp_duplicated	
+NC_000962	2626172	2626223	IG_Rv2346c-Rv2347c	
+NC_000962	2626223	2626519	Rv2347c	50bp_duplicated	
+NC_000962	2632923	2634098	Rv2352c	PPE family protein	
+NC_000962	2634098	2634528	IG_Rv2352c-Rv2353c	
+NC_000962	2634528	2635592	Rv2353c	PPE family protein	
+NC_000962	2635592	2635628	IG_Rv2353c-Rv2354	
+NC_000962	2635628	2635954	Rv2354	transposase
+NC_000962	2635903	2636889	Rv2355	transposase
+NC_000962	2636889	2637688	IG_Rv2355-Rv2356c	
+NC_000962	2637688	2639535	Rv2356c	PPE family protein	
+NC_000962	2651753	2651938	Rv2371	PE-PGRS family protein
+NC_000962	2692799	2693884	Rv2396	PE-PGRS family protein
+NC_000962	2706017	2706736	Rv2408	PE family protein
+NC_000962	2720776	2721777	Rv2424c	transposase	
+NC_000962	2727336	2727920	Rv2430c	PPE family protein	
+NC_000962	2727920	2727967	IG_Rv2430c-Rv2431c	
+NC_000962	2727967	2728266	Rv2431c	PE family protein	
+NC_000962	2762531	2763175	Rv2460c	repeat_region	
+NC_000962	2763172	2763774	Rv2461c	repeat_region	
+NC_000962	2784657	2785643	Rv2479c	transposase	
+NC_000962	2785592	2785918	Rv2480c	transposase	
+NC_000962	2795301	2797385	Rv2487c	PE-PGRS family protein	
+NC_000962	2800846	2801145	Rv2489c	repeat_region	
+NC_000962	2801145	2801254	IG_Rv2489c-Rv2490c	
+NC_000962	2801254	2806236	Rv2490c	PE-PGRS family protein	
+NC_000962	2828556	2829803	Rv2512c	IS1081 transposase	
+NC_000962	2835785	2837263	Rv2519	PE family protein
+NC_000962	2866468	2867127	Rv2543	50bp_duplicated
+NC_000962	2867124	2867786	Rv2544	50bp_duplicated
+NC_000962	2921551	2923182	Rv2591	PE-PGRS family protein
+NC_000962	2935046	2936788	Rv2608	PPE family protein
+NC_000962	2943600	2944985	Rv2615c	PE-PGRS family protein	
+NC_000962	2960105	2962441	Rv2634c	PE-PGRS family protein	
+NC_000962	2972160	2972486	Rv2648	transposase IS6110
+NC_000962	2972435	2973421	Rv2649	transposase IS6110
+NC_000962	2973421	2973795	IG_Rv2649-Rv2650c	
+NC_000962	2973795	2975234	Rv2650c	phiRv2 prophage protein	
+NC_000962	2975234	2975242	IG_Rv2650c-Rv2651c	
+NC_000962	2975242	2975775	Rv2651c	phiRv2 prophage protease	
+NC_000962	2975775	2975928	IG_Rv2651c-Rv2652c	
+NC_000962	2975928	2976554	Rv2652c	phiRv2 prophage protein	
+NC_000962	2976554	2976586	IG_Rv2652c-Rv2653c	
+NC_000962	2976586	2976909	Rv2653c	phiRv2 prophage protein	
+NC_000962	2976909	2976989	IG_Rv2653c-Rv2654c	
+NC_000962	2976989	2977234	Rv2654c	phiRv2 prophage protein	
+NC_000962	2977231	2978658	Rv2655c	phiRv2 prophage protein	
+NC_000962	2978658	2978660	IG_Rv2655c-Rv2656c	
+NC_000962	2978660	2979052	Rv2656c	phiRv2 prophage protein	
+NC_000962	2979049	2979309	Rv2657c	phiRv2 prophage protein	
+NC_000962	2979691	2980818	Rv2659c	phiRv2 prophage integrase	
+NC_000962	2982699	2982980	Rv2665	50bp_duplicated
+NC_000962	2982980	2983071	IG_Rv2665-Rv2666	
+NC_000962	2983071	2983874	Rv2666	truncated IS1081 transposase
+NC_000962	2989291	2990592	Rv2673	50bp_duplicated
+NC_000962	2996105	2996737	Rv2680	50bp_duplicated
+NC_000962	3005845	3007062	Rv2689c	50bp_duplicated	
+NC_000962	3007062	3007236	IG_Rv2689c-Rv2690c	
+NC_000962	3007236	3009209	Rv2690c	repeat_region	
+NC_000962	3053914	3055491	Rv2741	PE-PGRS family protein
+NC_000962	3076894	3078078	Rv2768c	PPE family protein	
+NC_000962	3078078	3078158	IG_Rv2768c-Rv2769c	
+NC_000962	3078158	3078985	Rv2769c	PE family protein	
+NC_000962	3078985	3079309	IG_Rv2769c-Rv2770c	
+NC_000962	3079309	3080457	Rv2770c	PPE family protein	
+NC_000962	3082352	3082756	Rv2774c	50bp_duplicated	
+NC_000962	3100202	3101581	Rv2791c	transposase	
+NC_000962	3101581	3102162	Rv2792c	resolvase	
+NC_000962	3112867	3113271	Rv2805	50bp_duplicated
+NC_000962	3113658	3114812	Rv2807	50bp_duplicated
+NC_000962	3115741	3116142	Rv2810c	transposase	
+NC_000962	3116818	3118227	Rv2812	transposase
+NC_000962	3120566	3121552	Rv2814c	transposase	
+NC_000962	3121501	3121827	Rv2815c	transposase	
+NC_000962	3132892	3133539	Rv2825c	50bp_duplicated	
+NC_000962	3135788	3136333	Rv2828c	50bp_duplicated	
+NC_000962	3162268	3164115	Rv2853	PE-PGRS family protein
+NC_000962	3170720	3171646	Rv2859c	50bp_duplicated	
+NC_000962	3191644	3192201	Rv2882c	50bp_duplicated	
+NC_000962	3194166	3195548	Rv2885c	transposase	
+NC_000962	3195545	3196432	Rv2886c	resolvase	
+NC_000962	3200794	3202020	Rv2892c	PPE family protein	
+NC_000962	3245445	3251075	Rv2931	50bp_duplicated
+NC_000962	3251072	3255688	Rv2932	50bp_duplicated
+NC_000962	3288464	3289705	Rv2943	IS1533 transposase
+NC_000962	3289705	3290235	Rv2943A	transposase	
+NC_000962	3289790	3290506	Rv2944	IS1533 transposase
+NC_000962	3313283	3313672	Rv2961	transposase
+NC_000962	3318816	3318900	IG3012_Rv2965c-Rv2966c	
+NC_000962	3319468	3319662	IG3013_Rv2966c-Rv2967c	
+NC_000962	3332787	3333788	Rv2977c	50bp_duplicated	
+NC_000962	3333785	3335164	Rv2978c	transposase	
+NC_000962	3335164	3335748	Rv2979c	resolvase	
+NC_000962	3335748	3335960	IG_Rv2979c-Rv2980	
+NC_000962	3335960	3336505	Rv2980	50bp_duplicated
+NC_000962	3376939	3378243	Rv3018c	PPE family protein	
+NC_000962	3378243	3378329	IG_Rv3018c-Rv3018A	
+NC_000962	3378329	3378415	Rv3018A	PE family protein	
+NC_000962	3379376	3380452	Rv3021c	PPE family protein	
+NC_000962	3380440	3380682	Rv3022c	PPE family protein	
+NC_000962	3380679	3380993	Rv3022A	PE family protein	
+NC_000962	3380993	3381375	IG_Rv3022A-Rv3023c	
+NC_000962	3381375	3382622	Rv3023c	transposase	
+NC_000962	3465778	3467091	Rv3097c	PE-PGRS family protein	
+NC_000962	3481451	3482698	Rv3115	transposase
+NC_000962	3490476	3491651	Rv3125c	PPE family protein	
+NC_000962	3501334	3501732	Rv3135	PPE family protein
+NC_000962	3501732	3501794	IG_Rv3135-Rv3136	
+NC_000962	3501794	3502936	Rv3136	PPE family protein
+NC_000962	3510088	3511317	Rv3144c	PPE family protein	
+NC_000962	3527391	3529163	Rv3159c	PPE family protein	
+NC_000962	3551281	3551607	Rv3184	transposase
+NC_000962	3551556	3552542	Rv3185	transposase
+NC_000962	3552542	3552764	IG_Rv3185-Rv3186	
+NC_000962	3552764	3553090	Rv3186	transposase
+NC_000962	3553039	3554025	Rv3187	transposase
+NC_000962	3557311	3558345	Rv3191c	transposase	
+NC_000962	3663689	3664222	Rv3281	50bp_duplicated
+NC_000962	3710433	3710759	Rv3325	transposase
+NC_000962	3710708	3711694	Rv3326	transposase
+NC_000962	3711694	3711749	IG_Rv3326-Rv3327	
+NC_000962	3711749	3713461	Rv3327	transposase
+NC_000962	3729364	3736935	Rv3343c	PPE family protein	
+NC_000962	3736935	3736984	IG_Rv3343c-Rv3344c	
+NC_000962	3736984	3738438	Rv3344c	PE-PGRS family protein	
+NC_000962	3738158	3742774	Rv3345c	PE-PGRS family protein	
+NC_000962	3742774	3743198	IG_Rv3345c-Rv3346c	
+NC_000962	3743198	3743455	Rv3346c	50bp_duplicated	
+NC_000962	3743455	3743711	IG_Rv3346c-Rv3347c	
+NC_000962	3743711	3753184	Rv3347c	PPE family protein	
+NC_000962	3753184	3753765	IG_Rv3347c-Rv3348	
+NC_000962	3753765	3754256	Rv3348	transposase
+NC_000962	3754256	3754293	IG_Rv3348-Rv3349c	
+NC_000962	3754293	3755033	Rv3349c	transposase	
+NC_000962	3755033	3755952	IG_Rv3349c-Rv3350c	
+NC_000962	3755952	3767102	Rv3350c	PPE family protein	
+NC_000962	3769514	3769807	Rv3355c	50bp_duplicated	
+NC_000962	3778568	3780334	Rv3367	PE-PGRS family protein
+NC_000962	3795100	3796086	Rv3380c	transposase	
+NC_000962	3796035	3796361	Rv3381c	transposase	
+NC_000962	3800092	3800796	Rv3386	transposase
+NC_000962	3800786	3801463	Rv3387	transposase
+NC_000962	3801463	3801653	IG_Rv3387-Rv3388	
+NC_000962	3801653	3803848	Rv3388	PE-PGRS family protein
+NC_000962	3841714	3842076	Rv3424c	50bp_duplicated	
+NC_000962	3842076	3842239	IG_Rv3424c-Rv3425	
+NC_000962	3842239	3842769	Rv3425	PPE family protein
+NC_000962	3842769	3843036	IG_Rv3425-Rv3426	
+NC_000962	3843036	3843734	Rv3426	PPE family protein
+NC_000962	3843734	3843885	IG_Rv3426-Rv3427c	
+NC_000962	3843885	3844640	Rv3427c	transposase	
+NC_000962	3844640	3844738	IG_Rv3427c-Rv3428c	
+NC_000962	3844738	3845970	Rv3428c	transposase	
+NC_000962	3845970	3847165	IG_Rv3428c-Rv3429	
+NC_000962	3847165	3847701	Rv3429	PPE family protein
+NC_000962	3847642	3848805	Rv3430c	transposase	
+NC_000962	3848805	3849294	IG_Rv3430c-Rv3431c	
+NC_000962	3849294	3850139	Rv3431c	repeat_region	
+NC_000962	3883525	3884193	Rv3466	repeat_region
+NC_000962	3883964	3884917	Rv3467	repeat_region
+NC_000962	3890830	3891156	Rv3474	transposase IS6110
+NC_000962	3891105	3892091	Rv3475	transposase IS6110
+NC_000962	3894093	3894389	Rv3477	PE family protein
+NC_000962	3894389	3894426	IG_Rv3477-Rv3478	
+NC_000962	3894426	3895607	Rv3478	PE family protein
+NC_000962	3926569	3930714	Rv3507	PE-PGRS family protein
+NC_000962	3930714	3931005	IG_Rv3507-Rv3508	
+NC_000962	3931005	3936710	Rv3508	PE-PGRS family protein
+NC_000962	3939617	3941761	Rv3511	PE-PGRS family protein
+NC_000962	3941724	3944963	Rv3512	PE-PGRS family protein
+NC_000962	3944963	3945092	IG_Rv3512-Rv3513c	
+NC_000962	3945092	3945748	Rv3513c	50bp_duplicated	
+NC_000962	3945748	3945794	IG_Rv3513c-Rv3514	
+NC_000962	3945794	3950263	Rv3514	PE-PGRS family protein
+NC_000962	3950263	3950824	IG_Rv3514-Rv3515c	
+NC_000962	3950824	3952470	Rv3515c	50bp_duplicated	
+NC_000962	3969343	3970563	Rv3532	PPE family protein
+NC_000962	3970563	3970705	IG_Rv3532-Rv3533c	
+NC_000962	3970705	3972453	Rv3533c	PPE family protein	
+NC_000962	3978059	3979498	Rv3539	PPE family protein
+NC_000962	3997980	3999638	Rv3558	PPE family protein
+NC_000962	4031404	4033158	Rv3590c	PE-PGRS family protein	
+NC_000962	4036731	4038050	Rv3595c	PE-PGRS family protein	
+NC_000962	4052950	4053603	Rv3611	50bp_duplicated
+NC_000962	4059984	4060268	Rv3619c	50bp_duplicated	
+NC_000962	4060268	4060295	IG_Rv3619c-Rv3620c	
+NC_000962	4060295	4060591	Rv3620c	50bp_duplicated	
+NC_000962	4060591	4060648	IG_Rv3620c-Rv3621c	
+NC_000962	4060648	4061889	Rv3621c	PPE family protein	
+NC_000962	4061889	4061899	IG_Rv3621c-Rv3622c	
+NC_000962	4061899	4062198	Rv3622c	PE family protein	
+NC_000962	4075752	4076099	Rv3636	transposase
+NC_000962	4076099	4076484	IG_Rv3636-Rv3637	
+NC_000962	4076484	4076984	Rv3637	transposase
+NC_000962	4076984	4077730	Rv3638	transposase
+NC_000962	4077730	4077884	IG_Rv3638-Rv3639c	
+NC_000962	4077884	4078450	Rv3639c	50bp_duplicated	
+NC_000962	4078450	4078520	IG_Rv3639c-Rv3640c	
+NC_000962	4078520	4079749	Rv3640c	transposase	
+NC_000962	4091233	4091517	Rv3650	PE family protein
+NC_000962	4093632	4093946	Rv3652	PE-PGRS family protein	
+NC_000962	4093940	4094527	Rv3653	PE-PGRS family protein	
+NC_000962	4119795	4120955	Rv3680	50bp_duplicated
+NC_000962	4153740	4155674	Rv3710	50bp_duplicated
+NC_000962	4189285	4190232	Rv3738c	PPE family protein	
+NC_000962	4190232	4190284	IG_Rv3738c-Rv3739c	
+NC_000962	4190284	4190517	Rv3739c	PPE family protein	
+NC_000962	4196171	4196506	Rv3746c	PE family protein	
+NC_000962	4252993	4254327	Rv3798	transposase
+NC_000962	4276571	4278085	Rv3812	PE-PGRS family protein
+NC_000962	4299812	4301566	Rv3826	50bp_duplicated
+NC_000962	4301563	4302789	Rv3827c	transposase	
+NC_000962	4302786	4303397	Rv3828c	resolvase	
+NC_000962	4318775	4319266	Rv3844	transposase
+NC_000962	4351075	4352181	Rv3873	PPE family protein
+NC_000962	4353010	4355010	Rv3876	50bp_duplicated
+NC_000962	4374484	4375683	Rv3892c	PPE family protein	
+NC_000962	4375683	4375762	IG_Rv3892c-Rv3893c	
+NC_000962	4375762	4375995	Rv3893c	PE family protein	
diff --git a/notebooks/slurm_scripts/launch_GATK.slurm b/notebooks/slurm_scripts/launch_GATK.slurm
new file mode 100644
index 0000000000000000000000000000000000000000..b7a694649b52bb25bcd310e2479c4d0a40b41ed0
--- /dev/null
+++ b/notebooks/slurm_scripts/launch_GATK.slurm
@@ -0,0 +1,13 @@
+#!/bin/bash 
+
+#SBATCH --job-name=GATK
+#SBATCH --cpus-per-task=1
+#SBATCH --mem-per-cpu=4G
+#SBATCH --time=6:00:00
+#SBATCH --output=GATK.o
+#SBATCH --error=GATK.e
+
+singularity exec container.img gatk-launch -T RealignerTargetCreator -nt 1 -R ~/Workshop_SA/notebooks/reference_genome/MTB_ancestor_reference.fasta -o ERR760779.intervals -I ERR760779.dedup.bam
+
+
+singularity exec container.img gatk-launch --disable_bam_indexing -T IndelRealigner -R ~/Workshop_SA/notebooks/reference_genome/MTB_ancestor_reference.fasta -targetIntervals ERR760779.intervals -I ERR760779.dedup.bam -o ERR760779.dedup.realigned.bam
diff --git a/notebooks/slurm_scripts/launch_index2.slurm b/notebooks/slurm_scripts/launch_index2.slurm
new file mode 100644
index 0000000000000000000000000000000000000000..ed6f3c865d0f321007d8787b8b3f657f4f761624
--- /dev/null
+++ b/notebooks/slurm_scripts/launch_index2.slurm
@@ -0,0 +1,7 @@
+#!/bin/bash 
+
+#SBATCH --job-name=index
+#SBATCH --cpus-per-task=1
+#SBATCH --mem-per-cpu=2G
+
+singularity exec /home/container.img samtools index ERR760779.dedup.realigned.bam