Input file
- Author Owner
Input: Table
- Sample name
- Flag for single-end/paired-end
- Location of fastq files in format
  - mate1: location
  - mate2: location
- Fragment
  - mean: value
  - standard deviation: value
- Adaptors
  - 5p: sequence
  - 3p: sequence
- Flag for reverse-complementing sequences
Edited by BIOPZ-Gypas Foivos
- BIOPZ-Gypas Foivos changed title from Input to Input file
- BIOPZ-Gypas Foivos changed the description
- BIOPZ-Katsantoni Maria changed milestone to %Preprocessing
- BIOPZ-Gypas Foivos assigned to @boersch
- Developer
This is the updated template for storing metadata for NGS data: https://labkey.scicore.unibas.ch/labkey/Zavolan%20Group/Test_aboersch/list-grid.view?listId=9
These are the column names of the fields relevant for further processing (see the detailed description of each column in LabKey):
- Path_Fastq_Files
- Sample_Name
- Replicate_Name
- Single_Paired
- Mate1_File
- Mate2_File
- Mate1_Direction
- Mate2_Direction
- Mate1_Adapter
- Mate2_Adapter
- Fragment_Length_Mean
- Fragment_Length_SD
- Quality_Control_Flag
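As a rough illustration only (not part of the LabKey template), a table exported with these columns could be checked before further processing. The sketch below assumes a tab-separated export; the function names `missing_columns` and `read_metadata` are hypothetical, and the example values are made up.

```python
import csv
import io

# Column names copied from the list above.
# Assumption: the table is exported from LabKey as a tab-separated file.
REQUIRED_COLUMNS = [
    "Path_Fastq_Files", "Sample_Name", "Replicate_Name", "Single_Paired",
    "Mate1_File", "Mate2_File", "Mate1_Direction", "Mate2_Direction",
    "Mate1_Adapter", "Mate2_Adapter", "Fragment_Length_Mean",
    "Fragment_Length_SD", "Quality_Control_Flag",
]


def missing_columns(header):
    """Return the required column names that are absent from `header`."""
    present = set(header)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def read_metadata(handle):
    """Read a tab-separated metadata table and return its rows as dicts.

    Raises ValueError if any required column is missing.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = missing_columns(reader.fieldnames or [])
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return list(reader)


if __name__ == "__main__":
    # Tiny in-memory example with placeholder values.
    header = "\t".join(REQUIRED_COLUMNS)
    row = "\t".join(["x"] * len(REQUIRED_COLUMNS))
    rows = read_metadata(io.StringIO(header + "\n" + row + "\n"))
    print(len(rows))
```

Failing early on a missing column keeps a typo in the table header from surfacing later as a confusing error in a downstream step.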
- Maintainer
Thanks @boersch. I think we might also need a 5' and a 3' adapter for each mate. So I recommend changing Mate1_Adapter and Mate2_Adapter to:
- Mate1_5p_Adapter
- Mate1_3p_Adapter
- Mate2_5p_Adapter
- Mate2_3p_Adapter
- Developer
Thanks for the feedback @gypas. I have also asked Maria and Mihaela for feedback. I will then implement all comments at once.
- Maintainer
We also need organism and read length if available (not fragment length).
- BIOPZ-Gypas Foivos added Doing label
- Developer
Actually, these fields are already in the table. The field for the read length is called "Cycles". There are two fields for the organism:
- "Organism" containing the Latin name of the organism, e.g. Mus musculus
- "TaxonID" containing the unique taxon ID, e.g. 10090 for the house mouse
Edited by BIOPZ-Börsch Anastasiya
- Maintainer
So is there any metadata table available? Maybe Shreemoyee's? Even if it contains only paired-end or only single-end samples, that is OK. I just need it to start testing.
- Developer
OK, I will prepare it now. I will let you know.
- Developer
Here is a test input file: https://labkey.scicore.unibas.ch/labkey/Zavolan%20Group/Test_aboersch/list-grid.view?listId=9
- Maintainer
Features needed by Snakemake:
samples_tables.tsv features
| name | details |
| --- | --- |
| fq1 | fastq file path mate1 |
| fq2 | fastq file path mate2 |
| fq1_3p | 3' end adapter in mate1 |
| fq1_5p | 5' end adapter in mate1 |
| fq2_3p | 3' end adapter in mate2 |
| fq2_5p | 5' end adapter in mate2 |
| fq1_polya | (A or G or T or C or N stretches) |
| fq2_polya | (A or G or T or C or N stretches) |
| organism | (?) |
| index_size | (--sjdbOverhang option of STAR index) |
| multimappers | Number of allowed multimappers in STAR |
| soft_clip | (--alignEndsType argument of STAR) |
| pass_mode | (allowed values from STAR --twopassMode) |
| gtf_filtered | (gtf filtered for nc RNAs) |
| libtype | (keyword in salmon for mode --libtype argument) |
| kallisto_directionality | (fr or rf) |
| mean | (number, provided by experiment) |
| sd | (number, provided by experiment) |
| genome | (fasta file) |
| gtf | (gtf file) |
| tr_fasta_filtered | (transcriptome fasta) |
| kmer | (suggested value 31) |
| seqmode | (single_end or paired_end) |

config values

| feature | details |
| --- | --- |
| star_indexes | path to the index of STAR |
| kallisto_indexes | path to the kallisto needed indexes |
| local_log | |
| output_dir | |

Edited by BIOPZ-Katsantoni Maria
- Maintainer
@katsanto Please take a look at issue #36
The path is:
/scicore/home/zavolan/GROUP/resources/homo_sapiens/ENSEMBL_99/
- BIOPZ-Gypas Foivos mentioned in merge request !7 (merged)
- Maintainer
@katsanto Changed to:
/scicore/home/zavolan/GROUP/resources/ENSEMBL_99/homo_sapiens
- Alex Kanitz changed title from Input file to LabKey input file
- BIOPZ-Katsantoni Maria mentioned in issue #39 (closed)