Commit 3d5456c0 authored by Christoph Harmel

feature: added run_read_sequencer, simulate_sequencing and refactoring to cli.py, modules.py

parent c46cf16d
1 merge request: !10 feature added: Simulate sequencing upper level function, restructuring of package modules and functions
This commit is part of merge request !10.
>1|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 481 bp tgagcactcggtgccaagggcggggatacacagatggttggctgatacaaccgggacttaaattccctagactagatctgtgttggaacgcctctctacgagaaggcgaacgaactggcgccgaggcgatcgctaacatcttcgtctcgcttgaaccacacaatggatgattcctccctaggggtttgacaatcaacctggatagcgtttaatatagatggctggttgatttgtaaggccttcacagactactcagagcaataagtgaccccccaacaatcagaggctgatcctctgctccgaaggcagcactcatcatcggtattctgttcgctagaacagatggaatgcatgcgccccgctaagtttgattgagttaaacttattgtcttatagccatagccaaggtatctatggtgtcgtacgtgacagttccgtgtaggcatgccatcccgcctcatgcgtcatgtcatactgaggc >2|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 495 bp ctgaatcaggtgtaggttctttttacgtcgtttaaggagctacacggtatcttgttttcagttaaggtgccacacccccgggtggatcatccgtcagctttcctacaattaggtaactggcgggatcatttagtcttgtattaagacgctcgcgcccggggcggccggcttgtttgtggagagaaacaacaagtctgagtatagattaaatacaactggtttactggcaagtcagcgcgtaacaaccggtgagccgctgcgcatgcttactgcaatgaacatcttggcacgatcctgcgatagcgtgccctgacccgtgcacctcgtcggtgaatttcgtcgaacaagcggatcgccacgccacgtgagatcaagccaaaacacaaaaaccacaggcaatagcgacgctgaagtgtcatttcctacctaaaacttcggttcttcttcttaagagggcattaagtccggataactactgcatggcacctgtgtatg >3|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 193 bp acttcagtactggaaggatctaggaaccattaatgcgagtgtggtgacgccagacgacccccggtgttctgccaccttctttggataggagaaccgtcactcgccccggaggccccacggataagaagggtatcttgtgatcacgcgaatgactcacttgcgtaagtaatctaactttgtttttcgctataaa >4|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 625 bp 
acgtctggagcgtgggttgacccctgtacatggttctttccggatccttaacgtgccgatacaactcaaaggtaactgtgcttaccacttccgaagctacatgcctctaacaaagtactttcgaggaggcactcaacccccggagatgctttgcgcggaagcagagatcgctgctcaaaatttggaatcactttcgtgcgagacccaaacaatttatggtggattcaagcgaacgagtcatgattacagatctatcaatcgaggagaggacggcttcgccgtttccttttaatgtgaaactagagccttcctcaatagtgaggcccttgcccggtgcccgggattggctcaaaagtaccggctcaggaagtctctcaactgccaagttggttaaagtagcttcggcgtaaggaacccgaccgaccatcagtgtcatacaaggaaatatttccaggaacctagtttatcttgaagttctgaatgagttaggtaatgtcggcccttcatgacagggggacgtgctcgcgagctcaaaccaagggggctaagggcgacggtgcttagtcatctactagatccacccacatgacgattgtgtgggtcctttcagccttttacttctcgc >5|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 845 bp agagcgtacggcgcgcatcgtataccctacgagggcggcgtgtggaggaacgctgggctgacactgtagaagattagatacacttgtccctaaaattaacccttaaccgctattagccgtgaacgcttcctaatatttcaagccgtatagctaagtggagaatgtggagccctggtcaaatcacgagccaattagccctagacggacagcacatctcgtcgcgttaagcggaacactcagcttttattacctagtgctcagcctggtttccatatgctctaaccgaactgatgcatacttgggtctgactaagggccatggttcgcgtcaagcaggccgggttcagaagccctggttgaggggcggatactccagtcgcccgcaggtcgcgatccctgggtttaaactacttcacgacacgtacaaacacttactacctacccctacaacactgagtgaaaagtgctagctaagtatgtccgtcgagataggactcgactttagaggtctgcacgaatgtcctggtaggatgcccaccaaagggaataatccttcgggtacgttttgggcaggtgtgacgtgagaaatccgcacccttatcgccgtaaaggttatttgcggggtgcgtggttacttgagttgtccctctcgagcggggtaacaaaaacctactgtatatctagtctggtccgaaattctttatgctgccgacatgctgcgacccgactacgtgtgtaagagtcattatttcttaatagtttaacaaggctacacccctatatgataaaggctattctccatagtgtaagagtctgttgcaccgacaacccggcgtctggctacat >6|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 703 bp 
tgcagtcgatgtgctattcgttttaggcagtctacgcgcttagtaactcccacggccatagacttatctcagacatggaccatgtcgatatcggacgccgtcttaccacatttttcatagcccttcataaggcagcgtgctcttactgcccaataaggtggacgattccgaccctaggcgaaccagcgctatagatggaccttctaattgatgcgcaacgtgattgtttccttggtctgggttagcatttcggtagcctaacagtcactccagttcgctaactggcctggatgagggccccatactatatggtgatagcaacgacaccccagtgtattgacctgttgtgtcctggtgatgttgaacgtcaccaagatagtctctatgtgactccatagctaaggagggtgacgtgatgcgccggccccgccccagacactgctacagaaagcttaaggcgagcgtatgaagagcctttgggcatacactctcgtatctagctaggtcaaggtgacggaatgaatctgctatatctagattggcacgcgataatctaggccaattgctggtaaaacacatggtcttattgtatacccaatcccgatttgaatttcctgcaacgaggcagctcgcagaggaacttaagtagagtgaaccctggagccgaatcccagagtcgtcggggacaaagtatatgcaacgg >7|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 243 bp actctttagaatgggtttcactaatagtacgtgcatacaatttcgtcagaaagggcgcttgctaagggacacggatcaatgatgaccagacttatggtgtcaggtctcactatattacatatccggaacccgtgcccgcaccacgcgctgggtctaggcgaccggtgcatcatctccgcgtctctagaggattctctcggtaaatgctgaattgcgtgagatcaaatccgtatgccagtcatg >8|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 863 bp attggcccggtccaggacagagccttatattgctactggtatgagaaccgttctgacgtaaacttgatggctttacgcctgcacgggcttcatacacacatgaccgtggacaaagtcgcccaggccctcgaatagggtgtaatggttaacggttagtgccaccccaatgggtgcgaggcagtaagagtgtcctatggcaaaactctcctcgtttcagaagggtcgctcctctagcctccttatcccccctataatagtactcgccgggtacgagccggagctccctcgagaagtcatcctgctcttaccacaggcagagaacgcgcaaggtttagatactaacttcattcatccaccagctggacaaggaactatagagagatgacattaggttatagctgaggggcgatcccattacccgaggctcgcaatagccttctactctgccaatgatcagtattgtaaacatggctggcgtccctaaatacaaagtcccgctgcaattgatggacttagacgaatcactagcaaagtcgataaatgtttacgctatccaatcccgcggttttaaaggtctgtactatacattcaatacggggggagatgtgtattgagccactagaggtatctaatgggggattgaaaaccgttttatagtcagtctaagcagcgccccttatgtcgctgtgatcggcagggttttttccgagatgtaaggcgaccgatatttcgttcttggcttggaagtagtacccgcgatgacctaccaagtagtccgacccattgatagactatggataccgctcccctccggtcacgaagcaccttagtgaggcatccgccgtgatgtgttgtttggagtg >9|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 
0.25|length: 494 bp aagcgaaactcctagaacttcccatcaggcaatcgtgtcccacgaagcacggatactacgggcactagttgaatggggggtttttttcgtaggtcgtaataggtactcggatagtcggcccagagttatgcttaagaatgcgctgcttaattcaatgtgactgccgttgtctccgatcagatccaggtgatgattgcgatcgcagcgacatatgtctcgaaagacgtgtcgtgaataagcctgtaagcccaatgcaacatggttccctcaccttgtagctgatgtaccgtgtttcaatctccgcggctatcgatcgccctttcatgcaagctgtaaccagacaggaatctgccctgccatcgttatgtatgcgtattacgactgattcgcgcaggcatcccggctaaggccactgggtataatccacagacattgcacgtcatatagaaaacgacctgtgttcacaatcagcccggggggagtcagagtagtat >10|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 86 bp atcctagcgccaaagatttactgttatggggtcgacgaacactagccgataatgccgtcctgggatctctagcctagtattatgcg >11|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 360 bp cgcctgagggtcctaaatctgacgtatgatcgaagagattggaaggtcccggcgggtcaccccacgttgcgatcatggccaaggccatggtttgctcaaaaatcccacattcgccgtcttacgcgttaggacctcactatcccacagacggtgcgttaccttgtagttgacgcgggatcgtggtgataacagctatttccgagacttcatattcttttacatagcggcttaccgtagtgactccatacattatttgcctattttgtagtgccccgaacagtaaggggaagccaactgccgcggtagctcatagacagacgtggtatacacgctacaaataagcagtggattgagacatga >12|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 140 bp gaattcctggggatttactcacccccgaggcggacaagatttccagctggatcaccgagggttacttaatcccttcgatgctttcaaaggccctaatcagtattgagcaacgaaagcggagtcgttagtgtccaagttgc >13|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 832 bp 
aatactctcgttgaagcgtcggacagtaaagtgagagatttcggcccacggtagtcggacattctcagtggggagcgaagagttgcgcttagagccgacgtacacgatataacctcaattgaaaatcgctatgtgcatcgttagggcctccggcgtgctgtttcggcagctgagtgtgagggtataacttaccttcgacccgaattgtctcgcggaaatcctaggcaagtaatccacttttggtacgggggagctagttcctctaagacgaacaagtgcactcttcacgtatagtgccctacagttgcgctgttcatggaatccgactaatagaccagtcccgaccccagtgcttcgactgttacaacagttatcgtcgcgcttcgggacgaaatctcggcattactatactcgacatacacagaaagctatggaggtcgccgtaatattcacctcgtcgagtctgtaggcgtagtaaacgttacataatagctaatgggactttcgaacggaacattatactcatcgtgaaacgtttggtcaccacactgtagaatccacctggatcggtgctagttctagtcattatgccctctaatactggtcgcgtagcagggcgcaaggtacagtgcgataccggaataggggtccacacctcacgtccacgtacctatagtcacgccgtatcgaattcattgcactccttttaagtattgagcgacgactggcacccggaatagttagtttgttcggaacgcccccacgcacgagtcgtgcgaattgaatgccgttagaacttggtgcatgcgccgtcgctactattaacggacag >14|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 296 bp atcggggtgcgaaatcccctgagctggttgactacatacgtaaccacgttccgtgcgtcatctaagcgtatcggctcatactggtggtaactagacttggtgaaccctaggtgccggcatatcgaggtccgcatccaaaataactatcgctatagctacatagacatttactcgcaatattacacgaaccgtacgtccctcggtattaacgtaatggttaaagtctctaattccgctgcagagcggcgggataaagacgccggtgtggcctgaatggtggatctgtccgtagtacc >15|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 515 bp accttcaatttgttcgcccgggacaagtagaaattactgtaaactaaacttaacctattccttgttaaagtccgcaccaagtgtactgtaagaatggtcgctcgtaataataacgagaagatcctcgagccgtggtctgctgcaactaccttgagcggtacatcgatgtcccactctgggcggggatcaggggcgagacttgtggtgaggccaaagaatggcgcatatgtaggcaccatacgtcgatacgttccaggagtagaggcctcgaacatacaccacgataagtctacagacgcatagatgacgtggaactcgcatctaacccctggaattctctgactgtcaccctgagccttggtgggccaccgttagggacgtgcaactaccctttctggtgcgaatctcgtttccttagtgccttcacacaaggaggccaattgtttgctaccccatccagacatttgggcattcttgtgagtcccgtaagtcggtcatgatgggtttaagttagt >16|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 820 bp 
ccggctcaatcctgtagaaccgcgtacaacacacccaagctataccgcacacggcgccttagcaaccactgcttatctgcgtattatacctttacaatcattacatttgatctatctgtgtaccggttttttttgattcaattcgctggattacgacctcccggccaaaaattctcaattcatcgttaacagacgtatttgaagataatcattcaacgtgaactagcacttggtcacttggtacgccaaccaagctgtgctttggggcaaccctttataactcacatgccgtcctaggactttacctagtccacctagcgtgttacagataccgattgcatcaagtcctcgacggaccgcactcgtcgcagttaaaggcaggtctatcagggagataccgtgtgttgcgccaattaatcttagaaaattatgcttgcaatagagcaggaaccctgaaagaatatggttctgtgaaaaagcgtgcactcccgtgtgtgctcgttggttactagcaccgacgttgggtgggcggaggggtcatatgcctgcgccgatcgatatagcgacgctcgtatcaagttgtactgcggcaacacgggtggctcggtgtagaaataaaacgggtgccctcgggtcgaagcgaagcgcaaatgctcgtgggccggagcggccgggggcgggccactatgggcattacaagccagacaggtaagagggagtgctttgatgtgaacagcaccccttccgctggacggggttatgacaagttcgccaagattttacaatatccttaacagtggaatgtccttaccgggtttac >17|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 791 bp attgttagggcctgtccggaaaagatcaacggaagatattcaccagcacctatgctgactcacgtagttcccgacgttcagtcccctccaacgtggaaggtaggacccatctccttaacgggatcgatcggtcttcctgtgaaagttgctcagagtcctcaaggacgtttttgggtgcgtgtacggtatggttatggtacgtgtctgtgacagagggtattcttactggttaagtgacccatatgaccacctgacgcccgagcatagacctgtaggggtcgacgcgagagatggcagcttttgtcatatcatcggttcatgtcaaggttggaggaattcaggcatacacaatctcggcttagtctgcgctgctcctgtccatacctggcacttggagtcaatggattcccaaaaccgacaaatgagtcaacgctctactttttgtttgctggaacgaggcaatatccattgattcccttctcaacaaatgttatcgcggcaggaggacacaccggggccgcccgggcatcgattcgtaaccgcctgaatgtgatacaaccgataatccacttgtaagaaaatgtattctagggacttggcaccgtacggtatagtagctaatctatctaccgctagctcccggcgaaacatctggcggctaacgaccgcacacgtccaagtaattatcggcactgcgatctgagggatcatctcgcggacggagattaactagttaagaatacctatctccatcgcaatgcgactgtagccaatagctatttaaggcgtgt >18|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 328 bp 
accgattacaggcagtcggccttgtccgctcgtatatccagggatgttccaccgaaagtgggagtgtggcacttattggtaaaaggcatttttacgaacgacactgataggattgatcactcaagaaatgttctcgaccctgaggtaggagtcttaacagacggacatcctccgtagatacgtgagaattaagggacgcatgtcgaaaacgcttggaatctactgtagtggcccaccttacgcttcttccaataactcccttcatagtccggcaacctcggtgggggtttcccttaggcctcggtgattgctagaacctccgcgcaaa >19|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 249 bp ggagtggaaaattctgtagtccgttggcggcgaccgcaaaccagaataatatggtcacgttaggccctcgggccccttcatatgtacggagtcattgaattagcattatactaccgttacgcaagaccctatcccatccgcgactgtcaccactgctgtaaggttgcaaggctgtttcaatgtaaagtaggcgaattctgacgtgggctgataacgaatcccccgggttatctagtgcaagtgctatcc >20|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 440 bp atcttaaacagcccaatcggctcgccgaccaatttcccgcttcacagtacgcggaagaatctgcagatagaagtcagccctctcacgtcaataggaatgctgcccgtcatgtttaactactcaagttttaaggtgtcccttatcggttccaggatcatgtctgaaggaagatggtcgcaacgaaatctggagtggcatacatcgttcggtcgaagcataatctcagacgttatctataaagttagggcgctgtatggattgggattcaagctcgaagcctgttcctgccatacagcgccttagttaggatcacgcctgaaacgtcacgacggtgctaagcatcatggggcgtggcccggacgattatccatccgctacttgcgtatggtggtgtcacccaaaatatatgtcgcgaagagtgtccgtgtcatgctaccgag >21|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 840 bp 
cataactcgtgagtggccctgtacaagtcattgcatcacaatccttgcaatttgctcctttggccaagcgtacaagaccccggacccatacgctcccggctgataaactgctacagcatggtatatccggatgatgcccctgaaaactgcggaagtcaatttgttgatgaatccccgactttccgctgttcctgtggatggtcgaatgccaaatgaagagctgctccccccttctttaatatcaagcactacaaagataaagcctgtttggctgacggcgagccctcccctatcgtacgcaggatagatctggccaagtccgctgacgatggggtccacactgccagaagcgtagatctttgttgagtcggaccggaagagctaacctagctaagggtgtagagttttcaggagcttagagtcatgtcggattatggttggcgttacggacgggctccaaacgatcaaactctagtggccactttcatggccagaacggaaagagcggcgatgtctgccaagtaagaccttcactaccttccgttgattacagacgtcggtttgacagcttggggtcttatccggcgtttcagagaacttttggagcactgagcgcagacaccgacaagcttagctagacagctgaaccgtatcacttttgaaaccagagaaaacgcatagggtggttgaggtaccagaaggtgtggtttctaagttggaaaccacgtacatcactccttagatctccgaaagcgtcttcgcgtgttcggactccacatctatgcgtttactagcaagcggtttctgaccaatatgcctatgatatatcttaggtcggga >22|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 234 bp caccgcgaaagtgactcagttttcccggtcttatcacggtcgttgtcgtccagattccggttgttaactgcgggagctataacacttattccttactgcgacggctgatccactaagaacagttcatagagctcggctatataatttgaagacatagattccacggtacttgtagcccataaccgctgaggaggaacgtccaacggttcgcgcggagcatgtgacgcttaaagg >23|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 917 bp 
atcaagtgattacctggtaacccgccgctcttgcagtgttcaccctttgtgtcgtcttagtgtttgtacacgttaaggaaaagcgttagcttaaccattacgccccccaaagcccggtgtgtagttatctacatgccgtgtcaaagcggtgactaaatgtttatcaagttctgatgacaacgtgagctcttaaagccattgactagtataagcacggaacaatgataccaggcaagcttgaatataggataaggcctctaagctcgaagcggatcttacggaggtgtgaatcaacagcactcgagtagtacaccgtggatggttagtgaagttggtggtaaaagagtaaagggttctaacaccttaacaatgcgctacacttcaccatagccgagagtcagtatgtggtaccgttagttctttcaatcccaagagcgcataactgcttgccgccgcttagtttagggacattaatgtatatgatgaggggatgctcccttcattcggaccgaccccgacacatcgtatcctaatggctaaccgctcgcagccccctgcctgcatgcggtccgccgagcagtcgaccaagcactgtgaaagaatttgggaaatcgaacccagactaccggaacttaactcacgaccttcgatcttctgatcccagtttccatacttatatatactgcgcgaactgcagcggactttcctccggcccagcctatcgagctttgcgattgtattcgaggggcgtcggtatcttttatccaacagctgtcatcagttcggaccggggggataaaagcgccagactggatctggggtggtgtcaaccatgtgagtcaatcgtccccgagtgccgagggcctagggttcctacatggtgggtgctcctgtcgaaagaatcgagtgacgactcg >24|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 676 bp cacacggcatcgcaaagcgagctatccagagatgatacatgtggttgaaggtgattgcgtcaacatgggggttgctcagtttggttggtcaatcaacggtggcagaccatgcgataacgatgatggtaagactgtaaggtaagttaaatactctcgtctgccagttgggtcgtcaacgctgcagagacgccattcttcccagaaggtccgagctttctacagtgccgcggcgtcatgaccaaaggggtccaacctcgcagtaaaatgtctatgcttctggtttggaatgagaccgggccatcccgtgataaagagcttcatttatcagggaaagcgtcgcgtagctctagaatttatttatcttgagtcaaatgccatcatctaatgaatccactgagctggtaaggcctaggcaggcacggaggactttatagtccacgaattcgggcatccgcattatcttgttcgtccgcacttaacgactccatacccgaccctgttactctatcgagtacactacggttaaccgggcgtcattgtccacaggttcacagcaacattgggcgaagaggagtgctaactaagcgccatgccccattctaagtttacaacaaagtatttccaagcggacggtccgtgtttcagcctcatcttgcccccctggacattccgatgg >25|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 870 bp 
ctgatcaccaatagcttgcgcttaacacacgcgccttacaattatatgacgcccttgccaatgacagatagagccattaatcgtggaaaccaggcatttatacttgtccgatgtatcgattctcctctatctacagagcccggacatgcgaaatatcaaaattccatgtatactgaataaatacattgggcaagccgggctcatgcagcaatcccagcgttgccttacgcaaagatatcttacggagttgcctttagattaacagcacgtgttcaaaaacctagccaactctgtcggtctagggcggaacgaagtagccagagtcgccccacgcagttcacgattacagtaatccccttatggttggggcatcgggaaattaaccctaagatgcgccccttgagcgccgaaaagggatcagttcagagtttccccccattcattgcaaggcactgttcaggcgctaacatgaggcccaaaaactagctggttacttcctgcgtcgcgcaactgttcatgtgttctttccgtacctgtgccaaagtccatgttgaggtacacccttgggtgtgttagaaagtggcttgccctcatagctgctatgggaaatttgagttgcgaccgtcgggcctcagggcgccaggtttggctagtaggcggcgtcttgtgctgcgtcaactgcgaaatgatgtccggtaggcttttatgggtcgttgcggccatgcaagagcatgcggccggttgtcggttacagagtcttagatactgtcaaactcgtacacaataaagagaggtactaatgaatcatgggcagcgcgttcatagtatgtgaaacttggcaatcagtgcacggttggccccacggtccta >26|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 751 bp gggtgcgttatggggactaaagactgttactaccggtactccgccttatagagccgtcacgtattaatcagctatcaacagatactatcgtcacagccctccttctggcgaaggatctgagcatttgcaaagctataagttggtacgcaacggtagagggcttcgtagtcggggaaagggcttgcagtagtataggccgtaacttatctgttgcaacctcaaccgcacgaatcgattactctataactgccctcaatacagtatggttaccagtcaccttcacactgaagattaattcgcctacagaaggagaacatctaggtctccgtagaatagcagtcgtgacaacacgccgaaacttgaggcaagctcaggcgtgtgtagcgagctttcagcttaggcgggcattacctaattgttacggaccccccaaaaattgtcgactggtttcttctatgcgactataaaacaggaataggaaagtgggtgcatgcaacttgttcgtgtaccgttatgatcgattcctatgtgggagtttgcgcccacctcgtctgtgctgtcccggcgtactgcaccacgctgatttatcttgtagtaaggatggggtcaatacgagggctgaggcgtagagcccgacgggaaacacattcgtcacgccgcaaatcgcgtcgtcctcagacctcagcaagaccttttggtcagaatatggcggggctctaattgcctttacttccactccgtgaacttccgc >27|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 574 bp 
gtactgcaccttgcactgctatctacaatgccgagggtcgccctagtgctttgcatgtttggcctctacctacgagtctacgcgggcgtttttaagcaagctacgatcatcttgatccaagggtacgaggccccgcagaccaatggaggtcgtgaccaccctcgtgtatgcctcgcactaagcgagcattctggtatactgtctctctcctgtgataataacagtcggctcgatattcagttcacatgaaacagtatgttatataggtgggatggttataacacggaaaggtgaaaaagagtgcggaagttacttaggagtgccgtccttgatcaagcatgcgtagcaacaagcgcccgtaacaaccggatggaggttctgggtgaacaagggcgcccctacaggatatacaggacttgccctatggtccatttatagtatggtggtataccccggctcacctgtgaattagattgcgaaccaaataaaggatcatcgggttcacatttaggttagagccctcatacgttaacctgccggtacaccacttctttcgccgccgcatagtacatgc >28|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 169 bp agctccctaaacaacacccgcgtaaaaccttcagttatggtgccgactaaccctgtggatgtcttagcgctctcgttccgatgggtgctgatactagtaaatgagactcgagaccgagaacacgcaacggctacaacctggtcggttgttggggtttttataatcagtg >29|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 408 bp tgcagtgatgcatcgataagaccgcatagttacctccttacaggtgacgctaggctaattgggagtgctggcacttgtgccctacagtcaagcgctcacgcggtgttctcctcccgcaatcttagatattaggctctgtaccgcacgaaggatgaattttcttgactattggtccctgtttacgagggcttacctagagtgaggatgaacataaacaaggcctacttgacttaaggcttccaaatcacttgagggcaaatgactcctcaaacgcgagtgccagtactatccgtgagggaagaaaatctgccaaaaggccttggccgagatagattccgccgcccagcttacggggcatggtctataagttccgttttagcatgtactgtcaacaagcctgcgggatcc >30|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 52 bp caaagcgattcgggttaacgcacttaagagttcgacgtaggttagtcccctc >31|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 581 bp 
tgctctgacgtgtaagcgccttcgataacgtctttgcagcgccccacaaagtaaggaccggtctaacagggcttccgaatcaatagactgatagtaatgggatcctgaggctgggacccgacacacggcatattttactagaaacgctgatttaaactccaattatccttgacgcactgagccacagtcttagacgcagaatgtccgcaggagccctgtctttcccctaaatcattcgcggcatttgtttacgggttaagtcctgcggatcctagagtctgggccccgtacaaccaggaagagactgatactccgcgtattacggcccataagaacggtgggcctcgtttgtatttgactactgtacactcctgcctactgctgaacttaatgatgcgagatgaaagtcacagggggtgtagatcaagttgcacttaggtttcttccgatagattactatgcaagctatccacatgtgagaagctatagccacctctcttaacttctggtaagggcgcattgcggagcggcgagtacatatggtcttaatggacgccggtcgccgtccagatggagatagc >32|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 249 bp gcggaactacctctctaagaccgcacaacaagtgtagtagatgaagatcacgcagagtgctcggcactgcatttttatacgtcgaatcagaaacgaggttcctcctctaggcttgttaaaaatccgggcgcgatgggctggtaatctgtggccatgggagcctcgccatttaaagattttggttaaggctcctctgttgtgtccatcacccttgaacgagcccgtacaaaccgtgtacgatgttgacac >33|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 297 bp gccctcgcgccagcttacttttagaaaacatcgaccggtaagagatacctgggtgagctgggcttcacgacatgttcttaaatcaatactctaaatctgctttgtagcatgcctcaagtaaaaaaatgtgctggttccgcacaggtgtgacgattaacgttgcgcccgtttgcgtcagtccagatcaccgatcttccacaccaccggtgggctgccggactgcaggtaatgactcctggctgcattctctgacataaaggttgaatagaacggcgtccttgagaaggttatggaacg >34|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 573 bp gcctaggggtcttgaccacagggagtacgagcattgatcattggagcaggtggctaatattgatagtggttagaccaccggcgcatcatcgtacgagcgcgggcgatacgtgtctttcaccggcgcactaatcttatcttacttctcaagccccgacagcatgtacgccaagtgttgttctgatgaaactttcgaaatagcaactgttagtcagttatagttggggagggcagtgaatacctcaaatacacccaagaaataacttcgaagcggcgcctatatcacacccctgtttcttatgactggtttgcgtgtgctaacagcaatcaagtacctgaccgtatgtccttgaagcttgaggatagtacccggatccagaggactgaaaaccgtgtctacgctgttctcacgccgatgtttgaaataatgagtgtagcgtctgccaaactggcttaagcactcagcgtgaggcgagattctatggccttgcgttttcgtttcgcgcgaacggtgacaatccagaaccccgaccttaaatatgcgacgtaaccctcctggccccgtccgagtgaa >35|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 559 bp 
gaaaaagtcgccccattcagttacaatcgtcttcagaagccagctcggttggggctatctgcggggtaatgcaacagggggctaccagacggtaaaccagggtcttgctattggtgttacgaaacaaaggagctatgcgacctcattagatcgagattactctcacaggcagctccggccatagcacaactaatttcgggtgtggagctcaccacaggaacatcttgtgcgtcctttgttatttaattgtgcattgtaatgcaccggaccccgggaacatacagccattatctgtgttgccgctcatccgttgtacttcttaatacaatcagaattgtactcaaccgattgccaagcacgtacgcgtcagatacacatccggagtcagtcctcgtcctgctttgactcatgccaaggagtgcgcttcgcgcgggtgaatctcgttatcgatttatagtatttatcttatcgcggaagaccacgctagtagactgggtaacgtcgcgattgtcccaaagccagagtgaagataggcgacatcctctgtgagaggggtacc >36|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 187 bp ctaagtccttatctatgatgcatctttcgttactgcgacaatatccgagacgagcagagttacacgccgaggtgtaaacgaatacgattgctatatgcaacgagttggttacacgcgtgaaggcgaatgtggatgctgcacttggagtcccattttaccggccgcacgtgctagctcactcaccttg >37|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 549 bp ttcatatggggatttggaatcgggtttgtgcggaatatgcccacgagactgcttatgtcaacgagacgacccattgtcacgttgtaaggccaccaataacacacaggtcttcgtttgctgtctcagggcaatcgcatcgacaacatcgtatggataccgttttttatcagcttacggcgcatcatactaataaggtgtttgagagggcgcagactcgaagcagtgtgatcttcccggttcgaagatgcaaaaacggtcctatttcgatccaaaactcagcgcactagtccaatgcttttttggagggttttgtagaacaatcgaggcgcggagcagcgaaatagaaaacgggccagtgaacgacggatccacacggaggtttcactcgaggacgtgtgccccaacaaccggttatctccttacttattcaagtgtgcctgcacctcgataagtctaaactccgcctatcccagcgtaggtcatagtcgtaactcccaacaggtgccctgcgacttgttcctcgccggtgctgcaagaaggtagtgttct >38|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 916 bp 
gtggcctaccataaatcaatttgggttaacgctctttgatctacgcactatgttgattcacttaccccttgtcaccgggcagaagagagccagtttaggtgtggttgtatttgccaaaccgcaaaccgcctaatgagctggatccggccatggaattaatcccgtcgtttgactcgaggtgttcaaagactgtgcaacacgacgtgcattcatcactagaacttaatctagaccaggccttgtggccaggagaggcgacgtgatattgccctatacacagataattatatacccctcgcgcgcaaaccatctcggtctctttccaaggtgccagcacgcgataactcgtatctgggctggatgtgcgtttcccttagcccactcccccttttaagtactagcgtactcgggttctacggagtgcatggagtttccacaaagggacgcaacataatttaaagaaccgagccttacgaggagcttttgcaggcttccgtcgctatccgtcgtcatggagtgaggctttgaggaacgagcacttgggactctatataccccggagtaagtatctacagccggggtctgacgccaaccatttgttactttgttgcgagggctactcccgctagtagtagaactgctgtcaggcaacgacactaattaagtggccttgacccgtacgacttgagaatcttcggttcaatattccccgtctcgaaaggctgcttcaagtgcgctacacattacatacaactaggcgggagggctccaaaccggcggagctacaaggagtctaatagtgcgaaaaaagggccgcgcgacaaatgagtctggtgtggcattcggaacgagtgggaacgatccgaatccgctttgacaattaaccgtcctaattaggatttcgtgaatagtagtg >39|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 848 bp accccggtcgctttggccggtcgtagccctaatcaattctgttcgtatcactaaagtaacggtttgaaatcctttgcaaacttgatctgggtatatgaaccggtatgcggggatagtggtaaataagtagtttacgagctgagcgtggattatcccagagaagttgccttaggtccagagcccgcacctacaatcactcgaggccggtcgagcgttgcgtggcaaggaaacccagccggtcaccctaccctcaaactcacgtcattgatccaatcatacatggcgtctctcacggtggtgttgtttgctgtttcttgcggcccgtttattcgtgaacacgacgcaagccctaccgacctcgctagccgatctagacgactgggtgggttacccttcccagaggagtgactatggatatgtagtccttataggcatccagggcaccggatgcactagtcacacccctgctcagatagcgccaaaaagtgaattcaagcgttcagctggacacccattaaacacgagtgctactgggcttacataatacgagagaagattggccgattgttgcccttagaacttatgtgaggtaagtctgagacgccgattgcggcctagacattagtaaaaataagataagaactactcccactgactcgttcgggcctcctagagctagggcccccctgagcatgttcagttatatcctacgggctagcgtaaggttttttcgtttatgcgaggcttgagacgcgacatacgagcattccgttgctggggcgatagaccacgatacgctcagaagagggaatactaaattgataaaatgctttcattgtctagcacct >40|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 289 bp 
agagcaaaagaaagtctgctccgcgtgacacacttgctcgttgtagtaactgcacgcgccgtctactcgacagggaccccccgtcggttcctctctatagcaatcgcggaagtggttccctgcctcccgcgcagaagttcaaactagtaatccttaatgacttgtggggggggagatcagtttcttccacaatggagtaaacttatgcgagaatcaagatcgcagaggccattttttgatgatactgtcagatatgtggttagccgtatcacgttaccgacgcagaatt >41|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 642 bp ctggggaatgaccgtaccgatctaattccccgtcgaaaaacttatgacgcgcagttgtccttatgcttgagacatgaatccttgccccatattggcgatcttggccaatgagatctgtcgaaagtactggaggccggtaaattgggggctctagaggtccgcccctgaaggactaacgtgtgtgtgtgtctacgtgtcgggttatcagcgtgttggacgatggccgtggattcaacgcatgctagagagctaatgatcctccgaagtcaaaagcctcagtgcttcgatttatgagcgcgtggcgagtacgtctagtgatactctaactactataaacaaggcctcgtcgcagaatccttcaatacattgggccccgggagataagtcggaccaggactaaattacacatgggggccctaaccctaggtcttaacggatggcttgataaagcacgcgctaatcttccctatcgtgcacatatctcttgctaccccctgctaaatcgctcgggccttggcctacacaatatgcaactggaagtgtggtttcgcaagggtataaataactgaactgtgaagttcccctgttacaaagccattagccgctaatgaacatagacctaacggactcgctccttgtgct >42|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 993 bp 
gcaagaagccaaaaaccttgcaggaggtcatttaagtttacccgcgcataagcagagacggacctctctgagatctcgcaccgcgcgcccggccggcactatcgatgctagactagggttggtgactagcccgtcaaaaccagcctaaacgcaaagattgtaggcgctagtccggaactgactgcttcgtgtcggtgggagcctagtatgtttccgggtctatgacccctaaaatcatagacgtgtcttaatagctatacctgacttactttgaagtacttgccacgacgagtttatgagatattagtattctcagccttgtgctcctcctacgaccgggatgagtcatctggtcaccttgatccgtacggaactatagatagtcacttcgaggcatgcgcgtttggacacctactgcttctaagtcatagcgctcgcgtaatgcagcctcgcatgttctttacacgacacgagggattttgattgttataggtgaacgtacagacaaaatcactgtttcagaatacttggcttgtacgatctccagtactccccgtgccggctcggcgaacgggataagacatccacggcattctgtagtggttgaccgggtttgacagactcccttatctggatggggcccgataacgatgagcatagaaccgttgtaagtctcgattgtcacccgaggacaaaattttctcatcctaggttcactcagcgtcgttagagcatcagagttccgtctttaggttactttaaagatcgaaagaaaccttcgtgctggtaggaagtctcatataagtagccagcgtggaccgaggaatagattgttatgttgcatgtacttctgggatcgtgtccagccaacttcaccgcggcacacgtcgatggacccaggaaatgcctcggggtcagaatgagccttgtgcggcccgactgctgaacgacgtataaaatagagcgtcgtggaccccatacaacgcacataac >43|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 473 bp ggttgtccaggcgcgagcaagtagctgactcgctaatcttaacgagtattgcttaggacttccaaatactccaagacgtcaatacgctttatctttgtgaagtcatcccggaccgagcgcttgggtcgtgatttaaaatcccctgtgatgtggctacaggtgcggcctatacagccgagaagaaggccgtctttaggcgtccaatgaaccgttacagggacacaccaaactgcgccaactgatcccacgggtcacggtacgctctaagaccagtcgggattctgacttaacatcgcagcatctgatcgagtggcttctccagcgagcctagggcattacaccgtgcgttcgcaaactctgtatgcttgtcggaaggtatagcttgagcccctttggttcatatcgttaatacttaagtaagtggggtcaatctttttgaactaatcaaatgactcactggcgaaaggcagacg >44|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 272 bp gtagaacttgttccccatggacaatgctagttccgttaatgccaggtattcatgtgccaagcgcctgcctggggaatacgagcctctctacaaacttacggccaccatgcttaaagattcggtgacttcactaatgacctatacaagtaatgcggaggacgctgtcgcttattgctctttgctaaggccagttatgtccgtcagtcaacgatacgctgcggcggtgggtgacggcactagaccggaagcctgatgacaagttcgaatcaata >45|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 860 bp 
aaaagcatcactctaacgacgctaccgtctgaatagatcaagattgctatcggttcgaccttgatcgcatgtgaacccgcccaaaaacccgtctcgacaaaagttacgtcgcatgggctgcgccaccggatagctcctagcttatcttataaatcaggtagagctacaacatggtgctatgacaactggagtgtcatcgctttggcgaaaccgtaaagggtgggaattgctgcattctcaactgggccgaactattccgcattcggctgctcacaaatcgtggaatgtgtccttgaacgtcctgcttaaccatggtcattgccacgaaggccctggtcggttcagaagtgtatcagacttacaagggtccgaggaggttccggcggggggagaacaagcatcgaacacgactcactgacctgtaggggtattacctatcactgtgacccacatctgaggtactggtccattccataaagatcgagtcgtcttcctaaactgggcactcatacgtacaggaccaaaaaagaggttggttggtgaccgtgccacgcctggcggtcagaatgagttaatgggtcgaataggcgatcttgataacaatagagattcacaacgtgggctcaagccgcttcctgcgaccactattagcgaactagggctatcgccacggaaaagtacttctaacatatacccctatggagtttcagtgtgaagccactgagcagtggcgaggttgtcgcgtcttttgttcaattgttccctgtactttaatgttcgtatcggatttgctttccgttgatagcgaatctctaccctgtcgctgtgtctagcatccgtccgcgaagccggtgacacat >46|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 884 bp aagcctctacaggctctgcggtttggctttacttaacggtgagtcaggaaaacattactgctacgttcaccgtgttcagagatagagagtacattagggaccaatcacaacgttcgccagggcaccgcctaatccgcgttgttagcaagagtacaggctctcgtatactttcagacccttcaatactagacgacaaattgcagcccggggtcatcggtcgactcagatacgtgctaacgagtaccaggtctaccgttgcaacgttggatgcgttatactcggcataaggcgatgccctttgacatactgcactacgctttggcgatagcgctacagttgatgaccgggctaactgcacgcgcgcgtagacgggagcccaattttttgaaattggcgacgaccgacttacgcacctatgcctcgcacgaccagtagctgtaagctctggtcgttggcattagattggcgaaccgctgaccacgatttaagccttccaaatcgttctactatacgagctcagcgtggcggtatgctagttgataaagtaaggacgttctcccgggtccagaagcccgatccacttttaaaagggctgatgcataaactaagactgtatgtctggttcaagtcttcagtaacttcagcctatgtttccaagtgacaacttggagaggggatgctgcggattctgatctaagctgttatatatctgttatcacctcgaaactatcactctggatgaccctcctatgtgcgctcagcgtagaccttccgctgtactttcataaacatagcatggagtgactcagaagagctttaaaggcgggagatcataggtcgcggtcgagcgaatgggaccgaggggaatgccaccaattcccgct >47|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 888 bp 
gcgccttgaagaggcgaggtctaaaggcaaaaatttagatccgccctatgagacggccgacgcggagaattccctaaccactattgtcctctgcatcgatatcaggaataggcttacctgcaatctcttatggtgatagactgtttgggagctgaacctgagacgcgcacgaaatttggaaggatcaaataggccccgcagtctctggtagacttctgccgagcggactagcttggctaaggtgtacaagcctaaatcgtttttcacatcaattttatagctgattatagaggaacgacgcgatcgtgcagagtgatggtcaaagggtcggtacgtcggatgcccagaaatatggtctgaggggtagcctgttcagaggcgcttaacttgctccttgctcacaggagcgatatgcgctagggttctggacgatcgaccatcattgtaacccatccaatccgtccttattgatggcccactcccgcatgctggtccgaaggcgatcccgatatcccgagcactagaattcgacacagtctgtaccgtgcctaattcttatcagaccctttatgcccgtctcggccttagagttaaatggactatctccacggaatgggcagtgcatcgctaccaagggtcgcccgatcccggggtttccacttcggatcatttttgtggatagtacatatcccacttcaacaagatagcaaaagtccaagacgcagtagagcacgtttctcaaacaacggaccatgcacggttcccggtaacccagtccagggaacaatttgtggttcttctttgaactagttgggagcaacagactcaggggctggccagtagtattgaaccaagcgttttttacaattactttgctggcgctaactg >48|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 588 bp catcaagatgggttacgtaggaccgagattcagtctctgggttagagccgacagcggggccgctacatagtacacggcgaggaatgcggggttgggctgaaccgtacacagtgggctagctgcggtacctgccaccggcatgcgtttaaatcctttcctttggcgaagccaactgccgacgtccgcaacagagactcgttttccgaccccgttactaaatcagctaactggcgcctgaatcctcttacgtcggatgttaattagtgtatagaatatcggagggttgagtgcgacgcgcttcctgttctccgctacttcttgtattatgatttggtcaaatacagtcgacaatagtgctcgacaggatataacgctatggcaccccatagtatcagctaaacgttcatatcctagtagcttagaaagaaatttaatatgcagtggagcctggaaccttacttattgtcggctcatcccgcaatactcgcaaataatcacgcatggtgccggacggtacgggccacgtcaattaggtaaacgaaacacgttacgtggactgctcacccaaccttgagggactgtctactt >49|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 626 bp 
taacctcagtctcgttcccccctcggtagttcggacccttattcgcttatctcacattcatcactgtagaccaaggaccgggcatacttgcggatatctaccaggactaggcacttagggatacgctgttgaatacgggtttcgtcccgtgtactcaagtgtagtttaagataggtacgagtgctagtacatcgtacaatttacaactgacttaaacgagagtttattatgtcttgttcacttgttgacacgcctgggaaaataataaaaggcaacgtctaatctcagacccgttgattaactaagcagcgtgacgtggagtcatacctgctatatttgggaggtgggaagtattggtgaaccgagcccctcctagccgtggcggtaatgacattaagaaggcgcagttagtcagcactcgaggcaggtgcgcctctgcacgtctgctgatatcgtcggaacgagttaacgctcccgcccacccatcagtcagggaccactttcgacacctagttcgagatttctcctttcggtaaaatcgggctcaattaaggcttgggtaccccggccagggaatatgcacatgagggaactatgagactgcacctaccccgcacggatggagg >50|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 214 bp taactgtcggtcactgctcatcccgactagttcggctcactagacttactcgcggaagcgagaagtaggacgtcgtgtaatactccaacgtcgttacgcaatgttgtaaaacttcatcgcattccgtgcatggcctaaacgtgcagcattatataacgctctttggtcttaatatccatcgcgggagtaacgcgaaggggagacgtgtgcctga
\ No newline at end of file
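'''
Read sequences from a FASTA file.
Argument: file_path, the path to the FASTA file.
Returns a dictionary mapping each definition line (without the leading '>') to its sequence.
'''
def read_in_fasta(file_path):
    sequences = {}
    defline = None
    # Use a context manager so the file is closed after reading
    with open(file_path) as f:
        for line in f:
            if line.startswith('>'):
                # Header line: strip the newline and the leading '>'
                defline = line.strip().replace('>', '')
            elif defline is not None:
                # Sequence line: append it to the current record
                if defline not in sequences:
                    sequences[defline] = ''
                sequences[defline] += line.strip()
    return sequences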
'''
Simulate sequencing one read.
Arguments: seq is a list of sequences,
padding_probabilities is not used yet (padding bases are drawn uniformly),
read_length is the number of bases in the returned read.
Returns the sequenced read.
'''
import random
def read_sequence(seq, padding_probabilities, read_length):
    reading_element = random.choice(seq)
    bases = ["A", "T", "C", "G"]
    sequenced = ''
    if read_length > len(reading_element):
        # Read the whole element, then pad with random bases up to read_length
        for nt in range(len(reading_element)):
            sequenced += reading_element[nt]
        for nt2 in range(len(reading_element), read_length):
            sequenced += random.choice(bases)
    else:
        # The element is long enough: read only the first read_length bases
        for nt in range(read_length):
            sequenced += reading_element[nt]
    return sequenced
\ No newline at end of file
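The two functions in this commit are meant to be chained: parse a FASTA file, then draw fixed-length reads from the parsed sequences. The following is a minimal, self-contained sketch of that flow, not the package's actual implementation. The shortened deflines, the temporary file, and the reduced `read_sequence` signature (without `padding_probabilities`, which the committed code does not use) are assumptions made for illustration.

```python
import os
import random
import tempfile


def read_in_fasta(file_path):
    """Parse a FASTA file into a {defline: sequence} dictionary."""
    sequences = {}
    defline = None
    with open(file_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                defline = line[1:]  # drop the leading '>'
                sequences[defline] = ""
            elif defline is not None:
                sequences[defline] += line
    return sequences


def read_sequence(seqs, read_length):
    """Return one read of exactly read_length bases (hypothetical signature).

    A sequence is picked at random. If it is shorter than read_length,
    the read is padded with uniformly drawn bases, as in the commit.
    """
    fragment = random.choice(seqs)
    read = fragment[:read_length]
    n_padding = read_length - len(read)  # zero or negative means no padding
    padding = "".join(random.choice("ATCG") for _ in range(n_padding))
    return read + padding


# Hypothetical two-record input, much shorter than the test file above.
fasta_text = (
    ">1|random sequence|length: 8 bp\n"
    "acgtacgt\n"
    ">2|random sequence|length: 4 bp\n"
    "ttgg\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
    tmp.write(fasta_text)
    fasta_path = tmp.name

records = read_in_fasta(fasta_path)
reads = [read_sequence(list(records.values()), read_length=6) for _ in range(5)]
os.remove(fasta_path)

for r in reads:
    print(r)
```

Slicing with `fragment[:read_length]` covers both branches of the original `if`/`else`: a long fragment is truncated, a short one is returned whole and then padded. Padding bases are uppercase here, as in the committed code, so padded reads are easy to spot next to the lowercase input.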