zavolan_group / tools / Read sequencer
Commit 1e148db1, authored 2 years ago by Christoph Harmel
fix: relocated import statements, remove types in docs as type hints are in place
Parent: ea45d048
Merge request: !26 feat: ReadSequencer class rewritten, updated CLI
Changes: 1 changed file
read_sequencer_package/read_sequencer.py (15 additions, 14 deletions)
 import logging
+from random import choice, gauss
+from textwrap import wrap
 LOG = logging.getLogger(__name__)
 def read_in_fasta(file_path: str) -> dict[str,str]:
     """
     This function reads in FASTA files.
     Args:
-        file_path (str): A file path directing to the fasta file.
+        file_path: A file path directing to the fasta file.
     Returns:
         Dict: It returns a dictionary with sequences.
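The body of read_in_fasta is not visible in this hunk. As a rough illustration of what its docstring describes, here is a minimal sketch of reading a FASTA file into a dictionary. The function name read_fasta_sketch is hypothetical, and keeping the leading ">" in each key is an assumption (write_fasta further down writes the key verbatim as the header line), not something this diff confirms.

```python
def read_fasta_sketch(file_path: str) -> dict[str, str]:
    """Parse a FASTA file into a {header: sequence} dictionary.

    Args:
        file_path: Path to the FASTA file.

    Returns:
        Mapping from each header line (assumed to keep its ">") to the
        concatenated sequence lines that follow it.
    """
    sequences: dict[str, str] = {}
    header = None
    with open(file_path, encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            if line.startswith(">"):
                header = line  # assumption: the key keeps the ">" marker
                sequences[header] = ""
            elif header is not None:
                sequences[header] += line  # join multi-line sequences
    return sequences
```

The docstring here follows the convention this commit moves to: types live in the signature, so the Args entries carry only the description.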
@@ -31,14 +35,13 @@ def read_sequence(seq:str, read_length:int) -> str:
     smaller than the requested length or cuts the sequence if its longer.
     Args:
-        seq (str): the sequence to read
-        read_length (int): length of reads
+        seq: the sequence to read
+        read_length: length of reads
     Returns:
         str: returns sequenced element
     """
-    from random import choice
     bases: list[str] = ["A", "T", "C", "G"]
     sequenced: str = ''
     if read_length > len(seq):
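The hunk cuts off at the length check, so the rest of read_sequence is not shown. The following is a minimal sketch of the behaviour the docstring suggests, assuming that a sequence shorter than read_length is padded with random bases (the bases list and the choice import point that way, but the opening of the docstring is elided). The name read_sequence_sketch is hypothetical.

```python
from random import choice  # module-level import, as this commit arranges it

BASES: list[str] = ["A", "T", "C", "G"]


def read_sequence_sketch(seq: str, read_length: int) -> str:
    """Return a read of exactly read_length characters.

    Args:
        seq: the sequence to read
        read_length: length of reads

    Returns:
        The sequence cut to read_length, or padded with random bases
        (assumed behaviour) when it is shorter than read_length.
    """
    if read_length > len(seq):
        missing = read_length - len(seq)
        # Assumption: the missing positions are filled with random bases.
        return seq + "".join(choice(BASES) for _ in range(missing))
    return seq[:read_length]
```

With the import at module level, choice is resolved once when the module loads rather than on every call, which is the relocation this commit makes.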
@@ -57,8 +60,8 @@ def simulate_sequencing(sequences: dict[str,str], read_length: int) -> dict[str,
     Simulates sequencing.
     Args:
-        sequences (dict): Dictionary of sequences to sequence.
-        read_length (int): length of reads
+        sequences: Dictionary of sequences to sequence.
+        read_length: length of reads
     Returns:
         dict: of n sequences as values
@@ -75,14 +78,13 @@ def generate_sequences(n: int, mean: int, sd: int) -> dict[str,str]:
     Generates random sequences.
     Args:
-        n (int): Amount of sequences to generate.
-        mean (int): mean length of sequence (gaussian distribution).
-        sd (float): standard deviation of length of sequence (gaussian distribution).
+        n: Amount of sequences to generate.
+        mean: mean length of sequence (gaussian distribution).
+        sd: standard deviation of length of sequence (gaussian distribution).
     Returns:
         dict: of n sequences
     """
-    from random import choice, gauss
     LOG.info("Generating random sequences.")
     sequences: dict[str,str] = {}
     for i in range(n):
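The loop body of generate_sequences falls outside the visible hunks. As an illustration of drawing sequence lengths from a Gaussian distribution, a sketch might look like the one below. The function name, the key format, and the clamping of lengths to at least 1 are all assumptions made for this example. The removed docstring described sd as a float while the signature in the hunk header annotates it as int; random.gauss accepts either, and the sketch uses float.

```python
from random import choice, gauss


def generate_sequences_sketch(n: int, mean: int, sd: float) -> dict[str, str]:
    """Generate n random nucleotide sequences with Gaussian-distributed lengths.

    Args:
        n: Amount of sequences to generate.
        mean: mean length of sequence (gaussian distribution).
        sd: standard deviation of length of sequence (gaussian distribution).

    Returns:
        dict: of n sequences
    """
    sequences: dict[str, str] = {}
    for i in range(n):
        # Assumption: round the sampled length and never go below 1.
        length = max(1, round(gauss(mean, sd)))
        key = f">random_seq_{i + 1}"  # hypothetical key format
        sequences[key] = "".join(choice("ATCG") for _ in range(length))
    return sequences
```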
@@ -94,18 +96,17 @@ def generate_sequences(n: int, mean: int, sd: int) -> dict[str,str]:
         sequences[key] = seq
     return sequences
-def write_fasta(sequences: dict[str, str], file_path: str):
+def write_fasta(sequences: dict[str, str], file_path: str) -> None:
     """
     Takes a dictionary and writes it to a fasta file.
     Must specify the filename when calling the function.
     Args:
-        sequences (dict): Dictionary of sequence.
-        file_path (str): A file path directing to the output folder.
+        sequences: Dictionary of sequence.
+        file_path: A file path directing to the output folder.
     """
     LOG.info("Writing FASTA file.")
-    from textwrap import wrap
     with open(file_path, "w") as outfile:
         for key, value in sequences.items():
             outfile.write(key + "\n")
...
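The diff is cut off right after the header line is written, so how the sequence itself is emitted is not shown. Given the textwrap.wrap import, one plausible continuation is to wrap each sequence to a fixed line width. The sketch below assumes that, and also assumes that keys already carry the ">" marker, since the key is written verbatim. The width parameter and the name write_fasta_sketch are hypothetical.

```python
from textwrap import wrap


def write_fasta_sketch(
    sequences: dict[str, str], file_path: str, width: int = 60
) -> None:
    """Write sequences to a FASTA file, wrapping each sequence line.

    Args:
        sequences: Dictionary of header lines to sequences.
        file_path: Path of the output file.
        width: Maximum sequence characters per line (assumed, not from the diff).
    """
    with open(file_path, "w", encoding="utf-8") as outfile:
        for key, value in sequences.items():
            outfile.write(key + "\n")
            # wrap() splits an unbroken string into chunks of at most `width`.
            outfile.write("\n".join(wrap(value, width)) + "\n")
```

The explicit -> None annotation mirrors the signature change in this hunk: the function is called for its side effect and returns nothing.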