Skip to content
Snippets Groups Projects
Commit 116f3f4c authored by Christoph Stritt's avatar Christoph Stritt
Browse files

minor changes

parent fd6c5364
Branches master
Tags v1.4
No related merge requests found
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Create a pangenome graph with PGGB # Create a pangenome graph with PGGB
PGGB requires as input a single fasta file that contains the genome assemblies. PGGB requires as input a single fasta file that contains the genome assemblies.
The file **/data/MTBC.L1.assemblies.fasta** contains all the contigs resulting from the assembly of the 19 L1 strains, while **/data/assembly_summaries.tsv** contains assembly metadata. The file **/data/MTBC.L1.assemblies.fasta** contains all the contigs resulting from the assembly of the 19 L1 strains, while **/data/assembly_summaries.tsv** contains assembly metadata.
Starting from these two files, we do the following: Starting from these two files, we do the following:
- load the metadata, select complete genomes - load the metadata, select complete genomes
- write all complete genomes to a fasta - write all complete genomes to a fasta
❓ What do we mean with 'complete'? ❓ What do we mean with 'complete'?
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Inspect the Flye assembly summary # Inspect the Flye assembly summary
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas import pandas
md = pandas.read_csv('data/assembly_summaries.tsv', sep='\t') md = pandas.read_csv('data/assembly_summaries.tsv', sep='\t')
md.head() md.head()
``` ```
%% Output %% Output
sample #seq_name length cov. circ. repeat mult. alt_group graph_path sample #seq_name length cov. circ. repeat mult. alt_group graph_path
0 PB000195 contig_1 4444823 114 Y N 1 * 1 0 PB000195 contig_1 4444823 114 Y N 1 * 1
1 PB000196 contig_10 4421359 133 N N 1 * 10,41 1 PB000196 contig_10 4421359 133 N N 1 * 10,41
2 PB000196 contig_47 13589 32 N Y 1 48 *,47,* 2 PB000196 contig_47 13589 32 N Y 1 48 *,47,*
3 PB000196 contig_1 10270 45 N Y 1 6 *,1,* 3 PB000196 contig_1 10270 45 N Y 1 6 *,1,*
4 PB000196 contig_42 1662 52 N N 1 * -41,42 4 PB000196 contig_42 1662 52 N N 1 * -41,42
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
How many sequences per strain? How many sequences per strain?
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
md['sample'].value_counts() md['sample'].value_counts()
``` ```
%% Output %% Output
sample sample
PB000205 49 PB000205 49
PB000202 7 PB000202 7
PB000196 4 PB000196 4
PB000220 2 PB000220 2
PB000198 2 PB000198 2
PB000207 1 PB000207 1
PB000219 1 PB000219 1
PB000211 1 PB000211 1
PB000210 1 PB000210 1
PB000209 1 PB000209 1
PB000208 1 PB000208 1
PB000195 1 PB000195 1
PB000206 1 PB000206 1
PB000203 1 PB000203 1
PB000201 1 PB000201 1
PB000200 1 PB000200 1
PB000199 1 PB000199 1
PB000197 1 PB000197 1
PB000204 1 PB000204 1
Name: count, dtype: int64 Name: count, dtype: int64
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Which sequences are cirularized? Which sequences are cirularized?
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
md[md['circ.'] == 'Y'] md[md['circ.'] == 'Y']
``` ```
%% Output %% Output
sample #seq_name length cov. circ. repeat mult. alt_group \ sample #seq_name length cov. circ. repeat mult. alt_group \
0 PB000195 contig_1 4444823 114 Y N 1 * 0 PB000195 contig_1 4444823 114 Y N 1 *
5 PB000197 contig_1 4410932 129 Y N 1 * 5 PB000197 contig_1 4410932 129 Y N 1 *
8 PB000199 contig_1 4426248 112 Y N 1 * 8 PB000199 contig_1 4426248 112 Y N 1 *
9 PB000200 contig_1 4428581 84 Y N 1 * 9 PB000200 contig_1 4428581 84 Y N 1 *
10 PB000201 contig_1 4408479 96 Y N 1 * 10 PB000201 contig_1 4408479 96 Y N 1 *
18 PB000203 contig_1 4434008 47 Y N 1 * 18 PB000203 contig_1 4434008 47 Y N 1 *
19 PB000204 contig_1 4419776 74 Y N 1 * 19 PB000204 contig_1 4419776 74 Y N 1 *
20 PB000205 contig_50 4420883 37 Y Y 2 * 20 PB000205 contig_50 4420883 37 Y Y 2 *
69 PB000206 contig_1 4440794 119 Y N 1 * 69 PB000206 contig_1 4440794 119 Y N 1 *
70 PB000207 contig_1 4424460 85 Y N 1 * 70 PB000207 contig_1 4424460 85 Y N 1 *
71 PB000208 contig_1 4419922 119 Y N 1 * 71 PB000208 contig_1 4419922 119 Y N 1 *
72 PB000209 contig_1 4443791 80 Y N 1 * 72 PB000209 contig_1 4443791 80 Y N 1 *
73 PB000210 contig_1 4444571 115 Y N 1 * 73 PB000210 contig_1 4444571 115 Y N 1 *
74 PB000211 contig_1 4441994 120 Y N 1 * 74 PB000211 contig_1 4441994 120 Y N 1 *
75 PB000219 contig_1 4410932 101 Y N 1 * 75 PB000219 contig_1 4410932 101 Y N 1 *
graph_path graph_path
0 1 0 1
5 1 5 1
8 1 8 1
9 1 9 1
10 1 10 1
18 1 18 1
19 1 19 1
20 50 20 50
69 1 69 1
70 1 70 1
71 1 71 1
72 1 72 1
73 1 73 1
74 1 74 1
75 1 75 1
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Sequences larger than 4 Mb Sequences larger than 4 Mb
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
md[md['length']>4e6] md[md['length']>4e6]
``` ```
%% Output %% Output
sample #seq_name length cov. circ. repeat mult. alt_group \ sample #seq_name length cov. circ. repeat mult. alt_group \
0 PB000195 contig_1 4444823 114 Y N 1 * 0 PB000195 contig_1 4444823 114 Y N 1 *
1 PB000196 contig_10 4421359 133 N N 1 * 1 PB000196 contig_10 4421359 133 N N 1 *
5 PB000197 contig_1 4410932 129 Y N 1 * 5 PB000197 contig_1 4410932 129 Y N 1 *
6 PB000198 contig_1 4332805 86 N N 1 * 6 PB000198 contig_1 4332805 86 N N 1 *
8 PB000199 contig_1 4426248 112 Y N 1 * 8 PB000199 contig_1 4426248 112 Y N 1 *
9 PB000200 contig_1 4428581 84 Y N 1 * 9 PB000200 contig_1 4428581 84 Y N 1 *
10 PB000201 contig_1 4408479 96 Y N 1 * 10 PB000201 contig_1 4408479 96 Y N 1 *
18 PB000203 contig_1 4434008 47 Y N 1 * 18 PB000203 contig_1 4434008 47 Y N 1 *
19 PB000204 contig_1 4419776 74 Y N 1 * 19 PB000204 contig_1 4419776 74 Y N 1 *
20 PB000205 contig_50 4420883 37 Y Y 2 * 20 PB000205 contig_50 4420883 37 Y Y 2 *
69 PB000206 contig_1 4440794 119 Y N 1 * 69 PB000206 contig_1 4440794 119 Y N 1 *
70 PB000207 contig_1 4424460 85 Y N 1 * 70 PB000207 contig_1 4424460 85 Y N 1 *
71 PB000208 contig_1 4419922 119 Y N 1 * 71 PB000208 contig_1 4419922 119 Y N 1 *
72 PB000209 contig_1 4443791 80 Y N 1 * 72 PB000209 contig_1 4443791 80 Y N 1 *
73 PB000210 contig_1 4444571 115 Y N 1 * 73 PB000210 contig_1 4444571 115 Y N 1 *
74 PB000211 contig_1 4441994 120 Y N 1 * 74 PB000211 contig_1 4441994 120 Y N 1 *
75 PB000219 contig_1 4410932 101 Y N 1 * 75 PB000219 contig_1 4410932 101 Y N 1 *
76 PB000220 contig_1 4319282 88 N N 1 * 76 PB000220 contig_1 4319282 88 N N 1 *
graph_path graph_path
0 1 0 1
1 10,41 1 10,41
5 1 5 1
6 2,1,2 6 2,1,2
8 1 8 1
9 1 9 1
10 1 10 1
18 1 18 1
19 1 19 1
20 50 20 50
69 1 69 1
70 1 70 1
71 1 71 1
72 1 72 1
73 1 73 1
74 1 74 1
75 1 75 1
76 2,1,2 76 2,1,2
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
For which samples do we lack a circular genome? For which samples do we lack a circular genome?
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
samples_circ = md['sample'][md['circ.'] == 'Y'] samples_circ = md['sample'][md['circ.'] == 'Y']
samples_noncirc = md['sample'][md['sample'].isin(samples_circ) == False] samples_noncirc = md['sample'][md['sample'].isin(samples_circ) == False]
print(set(samples_noncirc)) print(set(samples_noncirc))
``` ```
%% Output %% Output
{'PB000220', 'PB000202', 'PB000198', 'PB000196'} {'PB000220', 'PB000202', 'PB000198', 'PB000196'}
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
❓ What is the matter with these genomes? ❓ What is the matter with these genomes?
❓ Should we include them in the graph? ❓ Should we include them in the graph?
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
md[md['sample'] == 'PB000205'] md[md['sample'] == 'PB000205']
``` ```
%% Output %% Output
sample #seq_name length cov. circ. repeat mult. alt_group \ sample #seq_name length cov. circ. repeat mult. alt_group \
20 PB000205 contig_50 4420883 37 Y Y 2 * 20 PB000205 contig_50 4420883 37 Y Y 2 *
21 PB000205 contig_49 216232 6 N Y 1 * 21 PB000205 contig_49 216232 6 N Y 1 *
22 PB000205 contig_53 163344 6 N Y 1 * 22 PB000205 contig_53 163344 6 N Y 1 *
23 PB000205 contig_51 142109 6 N Y 1 * 23 PB000205 contig_51 142109 6 N Y 1 *
24 PB000205 contig_9 131264 6 N Y 1 * 24 PB000205 contig_9 131264 6 N Y 1 *
25 PB000205 contig_8 120360 6 N Y 1 * 25 PB000205 contig_8 120360 6 N Y 1 *
26 PB000205 contig_37 117735 6 N Y 1 * 26 PB000205 contig_37 117735 6 N Y 1 *
27 PB000205 contig_2 117435 7 N Y 1 * 27 PB000205 contig_2 117435 7 N Y 1 *
28 PB000205 contig_33 104292 6 N Y 1 * 28 PB000205 contig_33 104292 6 N Y 1 *
29 PB000205 contig_44 102971 6 N Y 1 * 29 PB000205 contig_44 102971 6 N Y 1 *
30 PB000205 contig_31 93943 5 N Y 1 * 30 PB000205 contig_31 93943 5 N Y 1 *
31 PB000205 contig_6 89619 5 N Y 1 * 31 PB000205 contig_6 89619 5 N Y 1 *
32 PB000205 contig_5 87578 6 N Y 1 * 32 PB000205 contig_5 87578 6 N Y 1 *
33 PB000205 contig_23 85152 5 N Y 1 * 33 PB000205 contig_23 85152 5 N Y 1 *
34 PB000205 contig_13 84548 5 N Y 1 * 34 PB000205 contig_13 84548 5 N Y 1 *
35 PB000205 contig_19 76959 5 N Y 1 * 35 PB000205 contig_19 76959 5 N Y 1 *
36 PB000205 contig_54 74739 5 N Y 1 * 36 PB000205 contig_54 74739 5 N Y 1 *
37 PB000205 contig_52 72581 5 N Y 1 * 37 PB000205 contig_52 72581 5 N Y 1 *
38 PB000205 contig_21 69776 6 N Y 1 * 38 PB000205 contig_21 69776 6 N Y 1 *
39 PB000205 contig_20 68663 5 N Y 1 * 39 PB000205 contig_20 68663 5 N Y 1 *
40 PB000205 contig_22 68060 6 N Y 1 * 40 PB000205 contig_22 68060 6 N Y 1 *
41 PB000205 contig_15 67068 6 N Y 1 * 41 PB000205 contig_15 67068 6 N Y 1 *
42 PB000205 contig_25 66397 7 N Y 1 * 42 PB000205 contig_25 66397 7 N Y 1 *
43 PB000205 contig_12 55613 7 N Y 1 * 43 PB000205 contig_12 55613 7 N Y 1 *
44 PB000205 contig_18 52108 5 N Y 1 * 44 PB000205 contig_18 52108 5 N Y 1 *
45 PB000205 contig_17 51926 6 N Y 1 * 45 PB000205 contig_17 51926 6 N Y 1 *
46 PB000205 contig_1 47711 6 N Y 1 * 46 PB000205 contig_1 47711 6 N Y 1 *
47 PB000205 contig_48 45758 6 N Y 1 * 47 PB000205 contig_48 45758 6 N Y 1 *
48 PB000205 contig_42 45451 5 N Y 1 * 48 PB000205 contig_42 45451 5 N Y 1 *
49 PB000205 contig_38 45013 6 N Y 1 * 49 PB000205 contig_38 45013 6 N Y 1 *
50 PB000205 contig_43 41645 5 N Y 1 * 50 PB000205 contig_43 41645 5 N Y 1 *
51 PB000205 contig_34 39550 6 N Y 1 * 51 PB000205 contig_34 39550 6 N Y 1 *
52 PB000205 contig_26 38539 7 N Y 1 * 52 PB000205 contig_26 38539 7 N Y 1 *
53 PB000205 contig_30 37006 7 N Y 1 * 53 PB000205 contig_30 37006 7 N Y 1 *
54 PB000205 contig_47 35474 6 N Y 1 * 54 PB000205 contig_47 35474 6 N Y 1 *
55 PB000205 contig_28 33249 7 N Y 1 * 55 PB000205 contig_28 33249 7 N Y 1 *
56 PB000205 contig_32 33146 5 N Y 1 * 56 PB000205 contig_32 33146 5 N Y 1 *
57 PB000205 contig_41 32796 5 N Y 1 * 57 PB000205 contig_41 32796 5 N Y 1 *
58 PB000205 contig_3 31924 7 N Y 1 * 58 PB000205 contig_3 31924 7 N Y 1 *
59 PB000205 contig_40 31087 6 N Y 1 * 59 PB000205 contig_40 31087 6 N Y 1 *
60 PB000205 contig_16 28303 6 N Y 1 * 60 PB000205 contig_16 28303 6 N Y 1 *
61 PB000205 contig_36 26152 6 N Y 1 * 61 PB000205 contig_36 26152 6 N Y 1 *
62 PB000205 contig_46 24509 6 N Y 1 * 62 PB000205 contig_46 24509 6 N Y 1 *
63 PB000205 contig_45 19766 6 N Y 1 * 63 PB000205 contig_45 19766 6 N Y 1 *
64 PB000205 contig_14 15999 5 N Y 1 * 64 PB000205 contig_14 15999 5 N Y 1 *
65 PB000205 contig_24 15160 8 N Y 1 * 65 PB000205 contig_24 15160 8 N Y 1 *
66 PB000205 contig_29 13880 6 N Y 1 * 66 PB000205 contig_29 13880 6 N Y 1 *
67 PB000205 contig_39 12605 8 N Y 1 * 67 PB000205 contig_39 12605 8 N Y 1 *
68 PB000205 contig_4 11914 6 N Y 1 * 68 PB000205 contig_4 11914 6 N Y 1 *
graph_path graph_path
20 50 20 50
21 *,49,* 21 *,49,*
22 *,53,* 22 *,53,*
23 *,51,* 23 *,51,*
24 *,9,* 24 *,9,*
25 *,8,* 25 *,8,*
26 *,37,* 26 *,37,*
27 *,2,* 27 *,2,*
28 *,33,* 28 *,33,*
29 *,44,* 29 *,44,*
30 *,31,* 30 *,31,*
31 *,6,* 31 *,6,*
32 *,5,* 32 *,5,*
33 *,23,* 33 *,23,*
34 *,13,* 34 *,13,*
35 *,19,* 35 *,19,*
36 *,54,* 36 *,54,*
37 *,52,* 37 *,52,*
38 *,21,* 38 *,21,*
39 *,20,* 39 *,20,*
40 *,22,* 40 *,22,*
41 *,15,* 41 *,15,*
42 *,25,* 42 *,25,*
43 *,12,* 43 *,12,*
44 *,18,* 44 *,18,*
45 *,17,* 45 *,17,*
46 *,1,* 46 *,1,*
47 *,48,* 47 *,48,*
48 *,42,* 48 *,42,*
49 *,38,* 49 *,38,*
50 *,43,* 50 *,43,*
51 *,34,* 51 *,34,*
52 *,26,* 52 *,26,*
53 *,30,* 53 *,30,*
54 *,47,* 54 *,47,*
55 *,28,* 55 *,28,*
56 *,32,* 56 *,32,*
57 *,41,* 57 *,41,*
58 *,3,* 58 *,3,*
59 *,40,* 59 *,40,*
60 *,16,* 60 *,16,*
61 *,36,* 61 *,36,*
62 *,46,* 62 *,46,*
63 *,45,* 63 *,45,*
64 *,14,* 64 *,14,*
65 *,24,* 65 *,24,*
66 *,29,* 66 *,29,*
67 *,39,* 67 *,39,*
68 *,4,* 68 *,4,*
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Combine complete genomes # Combine complete genomes
Let us be liberal and just use all sequences > 4 Mb. We can add a tag to the name of the 'unclean' ones. Let us be liberal and just use all sequences > 4 Mb. We can add a tag to the name of the 'unclean' ones.
First a quick look at the fasta headers. First a quick look at the fasta headers.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
%%bash %%bash
grep ^\> data/MTBC.L1.assemblies.fasta grep ^\> data/MTBC.L1.assemblies.fasta
``` ```
%% Output %% Output
>PB000195_1 >PB000195_1
>PB000196_1 >PB000196_1
>PB000196_2 >PB000196_2
>PB000196_3 >PB000196_3
>PB000197_1 >PB000197_1
>PB000198_1 >PB000198_1
>PB000198_2 >PB000198_2
>PB000199_1 >PB000199_1
>PB000200_1 >PB000200_1
>PB000201_1 >PB000201_1
>PB000202_1 >PB000202_1
>PB000202_2 >PB000202_2
>PB000202_3 >PB000202_3
>PB000202_4 >PB000202_4
>PB000202_5 >PB000202_5
>PB000202_6 >PB000202_6
>PB000202_7 >PB000202_7
>PB000203_1 >PB000203_1
>PB000204_1 >PB000204_1
>PB000205_1 >PB000205_1
>PB000205_2 >PB000205_2
>PB000205_3 >PB000205_3
>PB000205_4 >PB000205_4
>PB000205_5 >PB000205_5
>PB000205_6 >PB000205_6
>PB000205_7 >PB000205_7
>PB000205_8 >PB000205_8
>PB000205_9 >PB000205_9
>PB000205_10 >PB000205_10
>PB000205_11 >PB000205_11
>PB000205_12 >PB000205_12
>PB000205_13 >PB000205_13
>PB000205_14 >PB000205_14
>PB000205_15 >PB000205_15
>PB000205_16 >PB000205_16
>PB000205_17 >PB000205_17
>PB000205_18 >PB000205_18
>PB000205_19 >PB000205_19
>PB000205_20 >PB000205_20
>PB000205_21 >PB000205_21
>PB000205_22 >PB000205_22
>PB000205_23 >PB000205_23
>PB000205_24 >PB000205_24
>PB000205_25 >PB000205_25
>PB000205_26 >PB000205_26
>PB000205_27 >PB000205_27
>PB000205_28 >PB000205_28
>PB000205_29 >PB000205_29
>PB000205_30 >PB000205_30
>PB000205_31 >PB000205_31
>PB000205_32 >PB000205_32
>PB000205_33 >PB000205_33
>PB000205_34 >PB000205_34
>PB000205_35 >PB000205_35
>PB000205_36 >PB000205_36
>PB000205_37 >PB000205_37
>PB000205_38 >PB000205_38
>PB000205_39 >PB000205_39
>PB000205_40 >PB000205_40
>PB000205_41 >PB000205_41
>PB000205_42 >PB000205_42
>PB000205_43 >PB000205_43
>PB000205_44 >PB000205_44
>PB000205_45 >PB000205_45
>PB000205_46 >PB000205_46
>PB000205_47 >PB000205_47
>PB000205_48 >PB000205_48
>PB000205_49 >PB000205_49
>PB000206_1 >PB000206_1
>PB000207_1 >PB000207_1
>PB000208_1 >PB000208_1
>PB000209_1 >PB000209_1
>PB000210_1 >PB000210_1
>PB000211_1 >PB000211_1
>PB000219_1 >PB000219_1
>PB000220_1 >PB000220_1
>PB000220_2 >PB000220_2
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from Bio import SeqIO from Bio import SeqIO
assemblies_all = 'data/MTBC.L1.assemblies.fasta' assemblies_all = 'data/MTBC.L1.assemblies.fasta'
assemblies_pggb = open('results/assemblies_4mb.fasta', 'w') assemblies_pggb = open('results/assemblies_4mb.fasta', 'w')
# All sequences > 4Mb # All sequences > 4Mb
md_4mb = md[md['length']>4e6] md_4mb = md[md['length']>4e6]
for rec in SeqIO.parse(assemblies_all, "fasta"): for rec in SeqIO.parse(assemblies_all, "fasta"):
if len(rec.seq) < 4e6: if len(rec.seq) < 4e6:
continue continue
strain = rec.id.split('_')[0] strain = rec.id.split('_')[0]
strain_circ = md_4mb['circ.'][md_4mb['sample'] == strain].to_string(index=False) strain_circ = md_4mb['circ.'][md_4mb['sample'] == strain].to_string(index=False)
strain_repeat = md_4mb['repeat'][md_4mb['sample'] == strain].to_string(index=False) strain_repeat = md_4mb['repeat'][md_4mb['sample'] == strain].to_string(index=False)
new_id = strain new_id = strain
if strain_circ == 'N': if strain_circ == 'N':
new_id += '#circNo' new_id += '#circNo'
if strain_repeat == 'Y': if strain_repeat == 'Y':
new_id += '#repeatY' new_id += '#repeatY'
rec.id = new_id rec.id = new_id
rec.description = '' rec.description = ''
SeqIO.write(rec, assemblies_pggb, 'fasta') SeqIO.write(rec, assemblies_pggb, 'fasta')
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Run PGGB # Run PGGB
Important paramters: Important paramters:
-i : path to combined assemblies -i : path to combined assemblies
-o : output directory name -o : output directory name
-n : Number of haplotypes -n : Number of haplotypes
-p : Similarity of contigs -p : Similarity of contigs
-s : Maximum length of repeats in the genome -s : Maximum length of repeats in the genome
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Create slurm script Create slurm script
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
%%bash %%bash
echo '\ echo '\
#!/bin/bash #!/bin/bash
#SBATCH --cpus-per-task=20 #SBATCH --cpus-per-task=20
#SBATCH --mem-per-cpu=1G #SBATCH --mem-per-cpu=1G
#SBATCH --time=06:00:00 #SBATCH --time=06:00:00
#SBATCH --qos=6hours #SBATCH --qos=6hours
#SBATCH --output=stdout.%j #SBATCH --output=stdout.%j
#SBATCH --error=stderr.%j #SBATCH --error=stderr.%j
singularity exec /scicore/home/gagneux/GROUP/PacbioSnake_resources/containers/pggb_latest.sif pggb \ singularity exec /scicore/home/gagneux/GROUP/PacbioSnake_resources/containers/pggb_latest.sif pggb \
-i assemblies_combined.fasta.gz \ -i assemblies_combined.fasta.gz \
-o ./pggb \ -o ./pggb_L1 \
-m \ -m \
-t 20 \ -t 20 \
-n 52 \ -n 18 \
-p 99 \ -p 99 \
-s 5k -s 5k' > run_pggb.slrm
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Variant calling Submit it
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
%%bash %%bash
REF= sbatch run_pggb.slrm
/scicore/home/gagneux/GROUP/PacbioSnake_resources/containers/pggb_latest.sif vg deconstruct
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Visualization # Visualization
![](pics/is6110_ppe.small.png)
https://www.evomicslab.org/app/vrpg/
......
workshop/pics/is6110_ppe.small.png

37.7 KiB

0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment