Workflow for the analysis of the synthetic data

This workflow automates the analysis of the synthetic data. It takes the inputs and produces the intermediate outputs listed below:

Inputs (I#)

  • I1. Path to genome sequence file
  • I3. Path to gene expression table (csv)
  • I14. Output directory
  • I16. Pattern specifying the reads file name for an individual cell

Intermediate outputs

  • O1. Sampled transcript structures (gtf)
  • O11. Files with read-to-genome alignments (bam)
  • O12. Path to estimates of gene expression
  • O13. Plot with mean vs. standard deviation for individual genes
  • O14. Path to table with mean vs. standard deviation for individual genes (csv)
  • O15. Plot with initial vs. inferred expression levels for all genes
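As an illustration of what the table in O14 might contain, here is a minimal sketch that computes the mean and standard deviation of each gene. It assumes the statistics are taken across cells, that the estimates are available as a mapping from gene to per-cell values, and that the population standard deviation is wanted; none of this is specified above. The function names and the column names `gene`, `mean`, `sd` are hypothetical.

```python
import csv
import io
import statistics


def mean_sd_table(expression):
    """Return (gene, mean, sd) rows, sorted by gene name.

    `expression` maps a gene identifier to a list of expression values,
    one value per cell (assumed layout, not taken from the workflow spec).
    """
    rows = []
    for gene, values in sorted(expression.items()):
        # pstdev is the population standard deviation; swapping in
        # statistics.stdev would give the sample standard deviation.
        rows.append((gene, statistics.mean(values), statistics.pstdev(values)))
    return rows


def write_csv(rows, handle):
    """Write the rows as csv with a header line (hypothetical column names)."""
    writer = csv.writer(handle)
    writer.writerow(["gene", "mean", "sd"])
    writer.writerows(rows)


if __name__ == "__main__":
    toy = {"geneA": [1, 2, 3], "geneB": [4, 4, 4]}  # made-up toy values
    buffer = io.StringIO()
    write_csv(mean_sd_table(toy), buffer)
    print(buffer.getvalue())
```

The same rows could feed the plot in O13, with the mean on one axis and the standard deviation on the other.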

The analysis proceeds in the following steps:

  • Mapping reads to genome (#19)
    • Inputs I1, I14, I16
    • Outputs O11
  • Quantifying gene expression (#11)
    • Inputs O1, O11
    • Outputs O12
  • MA plot (#12)
    • Inputs O12
    • Outputs O13, O14
  • Estimating accuracy of gene expression inference
    • Inputs I3, O14
    • Outputs O16
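The dependencies between the steps above can be sketched as a small graph from which a valid execution order is derived. This is only an illustration, not the workflow's actual implementation: the step names are hypothetical, and the sketch assumes that the mapping step produces the alignment files (O11), which the quantification step then consumes. Items with no producer in this list (for example O1) are treated as already available.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Step name -> (inputs, outputs), transcribed from the list above.
# Assumption: the mapping step produces the alignments (O11).
STEPS = {
    "map_reads": ({"I1", "I14", "I16"}, {"O11"}),
    "quantify_expression": ({"O1", "O11"}, {"O12"}),
    "ma_plot": ({"O12"}, {"O13", "O14"}),
    "estimate_accuracy": ({"I3", "O14"}, {"O16"}),
}


def execution_order(steps):
    """Return the step names so that every producer runs before its consumers."""
    # Map each output item to the step that produces it.
    producer = {out: name for name, (_, outs) in steps.items() for out in outs}
    # For each step, collect the steps whose outputs it needs.
    graph = {
        name: {producer[item] for item in ins if item in producer}
        for name, (ins, _) in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in execution_order(STEPS):
        print(step)
```

A workflow manager would derive the same ordering from the declared inputs and outputs of each rule, so keeping the identifiers consistent between steps is what makes the automation possible.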
Edited by MihaelaZavolan