Commit dca40749 authored by Christoph Stritt

Duplicate read removal step added in circularize rule

parent 3e441ff6
@@ -16,7 +16,7 @@ default-resources:
 restart-times: 3
 max-jobs-per-second: 10
 max-status-checks-per-second: 1
-local-cores: 1
+local-cores: 20
 latency-wait: 60
 jobs: 500
 keep-going: True
@@ -2,8 +2,10 @@
 #
 ##############################
-samples: config/samples.tsv
-outdir: ./results
+samples: config/samples.tsv # overwritten by run_assembly_pipeline.py
+outdir: ./results # overwritten by run_assembly_pipeline.py
+annotate: "No"
 ref:
 genome_size: 4.4m
@@ -12,7 +14,7 @@ ref:
 bakta_db: /scicore/home/gagneux/GROUP/PacbioSnake_resources/databases/bakta_db
 container: /scicore/home/gagneux/GROUP/PacbioSnake_resources/containers/assemblySC.sif
-threads_per_job: 4
+threads_per_job: 10 # Max. 20
 assembly_iterations: 3
@@ -24,8 +24,27 @@ rule circlator_bam2reads:
         """
+rule circlator_removeduplicates:
+    input: config["outdir"] +"/{sample}/circlator/02.bam2reads.fasta"
+    output: config["outdir"] +"/{sample}/circlator/02.bam2reads.nodup.fasta"
+    run:
+        import sys
+        from Bio import SeqIO
+        record_dict = {}
+        for record in SeqIO.parse(input[0], "fasta"):
+            record_dict[record.id] = record
+        # record_dict = SeqIO.to_dict(SeqIO.parse(input[0], "fasta")) # Does not allow duplicate entries...
+        with open(output[0], "w") as output_handle:
+            SeqIO.write(record_dict.values(), output_handle, "fasta")
 rule circlator_localassembly:
-    input: config["outdir"] + "/{sample}/circlator/02.bam2reads.fasta"
+    input: config["outdir"] + "/{sample}/circlator/02.bam2reads.nodup.fasta"
     output: config["outdir"] + "/{sample}/circlator/03.assemble/assembly.fasta"
     params:
         outdir = config["outdir"] + "/{sample}/circlator/03.assemble",
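The deduplication done in the circlator_removeduplicates rule can be tried outside Snakemake with the rough sketch below. It is not the pipeline's code: it swaps Bio.SeqIO for a small hand-written FASTA parser so that it runs with the standard library alone, and the names parse_fasta and remove_duplicates are hypothetical. As in the rule, records are stored in a dict keyed by read ID, so a later duplicate overwrites an earlier one while keeping the position of the first occurrence. Unlike SeqIO.write, this sketch writes each sequence on a single line.

```python
import io


def parse_fasta(handle):
    """Yield (record_id, header, sequence) for each record in a FASTA handle.

    Simplified stand-in for Bio.SeqIO.parse (an assumption of this sketch):
    the record ID is the first whitespace-delimited token of the header line.
    """
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield (header.split() or [""])[0], header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield (header.split() or [""])[0], header, "".join(chunks)


def remove_duplicates(in_handle, out_handle):
    """Write one record per ID to out_handle and return the number written."""
    records = {}
    for record_id, header, sequence in parse_fasta(in_handle):
        # Same semantics as record_dict[record.id] = record in the rule:
        # the last record seen for an ID replaces any earlier one.
        records[record_id] = (header, sequence)
    for header, sequence in records.values():
        out_handle.write(f">{header}\n{sequence}\n")
    return len(records)


# Small in-memory example: read1 occurs twice.
demo_in = io.StringIO(">read1 a\nACGT\n>read2\nGG\nCC\n>read1 b\nTTTT\n")
demo_out = io.StringIO()
n_written = remove_duplicates(demo_in, demo_out)
print(n_written)
print(demo_out.getvalue(), end="")
```

If the first occurrence of each read should be kept instead, replacing the dict assignment with records.setdefault(record_id, (header, sequence)) would do that.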