Commit d8e95de2 authored by Alex Kanitz

DO NOT MERGE: remove all content for review

parent 3fc67e1f
1 merge request: !4 Draft: remove all content for review (DO NOT MERGE!)
File deleted
# cDNA Generator
Description of the project
**Input files**
Copy_number_file:
- csv-formatted file ("NewTranscriptID,ID,Count")
- id of generated transcript
- id of original transcript (without intron inclusions)
- count
_Example_
`[id of generated transcript],[ID],[Count]`
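The copy-number file described above could be read along these lines. This is a minimal sketch, not the project's implementation: the function name `parse_copy_numbers` is hypothetical, and it assumes the header row is exactly `NewTranscriptID,ID,Count` (optionally with spaces after the commas).

```python
import csv
import io


def parse_copy_numbers(handle):
    """Map each generated transcript ID to (original transcript ID, count)."""
    # skipinitialspace tolerates headers written as "NewTranscriptID, ID, Count".
    reader = csv.DictReader(handle, skipinitialspace=True)
    copy_numbers = {}
    for row in reader:
        copy_numbers[row["NewTranscriptID"]] = (row["ID"], int(row["Count"]))
    return copy_numbers


# Hypothetical in-memory file; a real call would pass an open file handle.
example = io.StringIO("NewTranscriptID, ID, Count\n1,1,12\n2,1,11\n3,2,33\n")
print(parse_copy_numbers(example))
```

Keying on the generated transcript ID keeps rows that share an original transcript ID from overwriting each other.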
transcript_sequences_file:
- fasta-formatted file
- id of generated transcript? (in the header)
_Example_
`> [id of generated transcript]
AGUGACGUUAGACCAGAUAGAC....`
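A FASTA file in this layout may hold several transcripts, so a reader has to split on the `>` header lines. The sketch below assumes plain multi-record FASTA with the transcript ID as the whole header; `parse_fasta` is a hypothetical helper name.

```python
def parse_fasta(lines):
    """Return {transcript ID: sequence} from an iterable of FASTA lines."""
    records = {}
    current_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current_id = line[1:].strip()
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    # Sequences may be wrapped over several lines, so join the pieces.
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


print(parse_fasta([">1", "GAUAGCUAGAGG", "AUUCUCAG", ">2", "AGCUAGAGGA"]))
```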
priming_site_file:
- gtf-formatted file
- id of generated transcript?
- position of priming site and binding likelihood
_Example_
`[id of generated transcript] ... [position of priming site] ... [binding likelihood]`
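One way to pull the priming-site position and binding likelihood out of a GTF line is sketched below. It assumes the standard nine tab-separated GTF columns and that the likelihood is stored in an attribute named `Binding_Probability`; the function name `parse_priming_site` is hypothetical.

```python
def parse_priming_site(line):
    """Parse one tab-separated GTF line into a small dict."""
    fields = line.rstrip("\n").split("\t")
    attributes = {}
    # Column 9 holds 'key "value"' pairs separated by semicolons.
    for item in fields[8].split(";"):
        key, _, value = item.strip().partition(" ")
        if key:
            attributes[key] = value.strip().strip('"')
    return {
        "transcript_id": fields[0],
        "start": int(fields[3]),  # 1-based, inclusive
        "end": int(fields[4]),    # 1-based, inclusive
        "binding_probability": float(attributes["Binding_Probability"]),
    }


line = "\t".join([
    "Transcript_1", "RIBlast", "Priming_site", "10", "25", ".", "+", ".",
    'Accessibility_Energy "1.49"; Binding_Probability "0.12"',
])
print(parse_priming_site(line))
```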
**Output files**
cDNA_file:
- fasta-formatted file
- Includes every unique "cDNA sequence" with its "cDNA sequence ID"
cDNA_count_file:
- csv-formatted file
- Includes "cDNA sequence ID" and "cDNA count"
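The two output files can be derived from a single list of generated cDNA sequences by collapsing duplicates. The following is only a sketch: the `cDNA_<n>` ID scheme, the CSV header text, and the function name `collapse_cdna` are assumptions, not something the project specifies.

```python
from collections import Counter


def collapse_cdna(sequences):
    """Return (fasta_text, csv_text) for the unique sequences and their counts."""
    counts = Counter(sequences)  # keeps first-seen order on Python 3.7+
    fasta_lines = []
    csv_lines = ["cDNA sequence ID,cDNA count"]
    for index, (seq, count) in enumerate(counts.items(), start=1):
        seq_id = f"cDNA_{index}"  # hypothetical ID scheme
        fasta_lines.append(f">{seq_id}\n{seq}")
        csv_lines.append(f"{seq_id},{count}")
    fasta_text = "".join(entry + "\n" for entry in fasta_lines)
    csv_text = "".join(entry + "\n" for entry in csv_lines)
    return fasta_text, csv_text


fasta_text, csv_text = collapse_cdna(["ACGT", "TTGA", "ACGT"])
print(fasta_text)
print(csv_text)
```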
NewTranscriptID, ID, Count
1,1,12
2,1,11
3,2,33
4,3,11
5,4,55
Transcript_1 RIBlast Priming_site 10 25 . + . Accessibility_Energy "1.49"; Hybridization_Energy "-9.76"; Interaction_Energy "-8.74"; Number_of_binding_sites "2"; Binding_Probability "0.12"
Transcript_1 RIBlast Priming_site 640 655 . + . Accessibility_Energy "1.71"; Hybridization_Energy "-9.12"; Interaction_Energy "-8.34"; Number_of_binding_sites "2"; Binding_Probability "0.05"
Transcript_2 RIBlast Priming_site 3 18 . + . Accessibility_Energy "1.21"; Hybridization_Energy "-5.12"; Interaction_Energy "-2.34"; Number_of_binding_sites "1"; Binding_Probability "0.15"
>1
GAUAGCUAGAGGAUUCUCAGAGGAGAAGCUAGAGGAGCUAGAGGAGCUAGAGGAGCUAGAGGAGCUAGAGG
>2
AGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAGCUAGAGGAGCUAGAGGAGCUAGAGG
>3
AGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAGCUAGAGG
>4
AGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAUAGCUAGAGGAGCUAGAGGAGCUAGAGG
import sys


def translate(res):
    """Return the DNA base complementary to a single RNA residue."""
    translate_dict = {"A": "T", "U": "A", "G": "C", "C": "G"}
    if res not in translate_dict:
        print("RNA residue not A, U, G or C")
        sys.exit(1)
    return translate_dict[res]


class cDNA_Gen:
    def __init__(
        self, fasta, gtf, cpn, output_fasta="cDNA.fasta", output_csv="cDNA.csv"
    ):
        # inputs
        self.fasta = fasta
        self.gtf = gtf
        self.cpn = cpn
        self.output_fasta = output_fasta
        self.output_csv = output_csv
        # variables
        self.prime_sites = []
        self.fasta_seq = ""
        self.fasta_id = ""
        self.copy_numbers = {}
        self.run()

    def run(self):
        self.read_fasta()
        self.read_gtf()

    def order_priming_sites(self):
        pass

    def generate_cdna(self):
        pass

    def read_fasta(self):
        # Only the first record is read: the header line, then its sequence lines.
        with open(self.fasta) as fasta_file:
            fasta = fasta_file.readlines()
        self.fasta_id = fasta[0].strip().lstrip(">").strip()
        seq_lines = []
        for line in fasta[1:]:
            if line.startswith(">"):
                break  # stop at the next record instead of merging its header
            seq_lines.append(line.strip())
        self.fasta_seq = "".join(seq_lines)

    def read_gtf(self):
        with open(self.gtf) as gtf_file:
            gtf_lines = gtf_file.readlines()
        # Only the first 1000 lines are processed.
        for line in gtf_lines[:1000]:
            if line.strip() and not line.startswith("#"):
                temp_gtf = GTF_entry(line)
                temp_gtf.set_sequence(self.fasta_seq)
                self.prime_sites.append(temp_gtf)

    def write_fasta(self):
        pass

    def read_copy_numbers(self):
        with open(self.cpn) as cpn_file:
            cpn_lines = cpn_file.readlines()
        for line in cpn_lines:
            csv = [field.strip() for field in line.split(",")]
            trans_id = csv[0]
            # Skip blank lines and the "NewTranscriptID,ID,Count" header.
            if trans_id and trans_id != "NewTranscriptID":
                gene_id = csv[1]
                count = int(csv[2])
                # Keyed by original transcript ID, so later rows with the
                # same ID overwrite earlier ones.
                self.copy_numbers[gene_id] = count

    def return_output(self):
        return self.output_fasta, self.output_csv


class GTF_entry:
    def __init__(self, string):
        self.string = string
        self.values = self.string.rstrip("\n").split("\t")
        self.id = self.values[0]
        self.start = int(self.values[3])
        self.end = int(self.values[4])
        self.score = float(0.5)  # placeholder; self.values[5]
        self.sequence = "no sequence set"
        # GTF coordinates are 1-based and inclusive.
        self.length = self.end - self.start + 1

    def __repr__(self):
        return self.sequence[:10] + "..." + f" len={self.length} score={self.score}"

    def set_sequence(self, full_sequence):
        # Convert the 1-based inclusive interval to a Python slice.
        self.sequence = full_sequence[self.start - 1:self.end]


if __name__ == "__main__":
    import argparse
    pass
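
The `generate_cdna` method above is still a stub. As a rough illustration of what it might build on, the sketch below derives a cDNA from an RNA sequence and the end coordinate of a priming site, using the same base mapping as `translate`. It assumes that reverse transcription always extends to the 5' end of the transcript and that `site_end` is 1-based and inclusive; the function name `synthesize_cdna` is hypothetical.

```python
# Same RNA-to-DNA complement mapping as in translate().
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def synthesize_cdna(rna, site_end):
    """Return the cDNA for a primer whose site ends at `site_end` (1-based, inclusive)."""
    # Template: from the 5' end of the RNA up to the end of the priming site.
    template = rna[:site_end]
    # The template is read 3'->5', so the cDNA is its reverse complement.
    return "".join(COMPLEMENT[base] for base in reversed(template))


print(synthesize_cdna("GAUAGCUAGA", 6))  # prints GCTATC
```

A fuller version would also have to decide which priming site is used for each transcript copy, for example by weighting sites with their binding probability.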
images/Screen_Shot_Git_2_Bastian_.png
images/Screen_Shot_Git_Bastian_.png
images/Screen_Shot_Markdown_Bastian.png
images/student3_github.png
images/student3_markdown.png
