  1. Jun 12, 2020
  2. Apr 27, 2020
    • Refactor LabKey to Snakemake script · 556f1e12
      Alex Kanitz authored
      - clean up command line interface
        - improve descriptions
        - add consistent structure
        - remove or merge superfluous CLI arguments
        - set defaults
        - update test calls
        - update docs
        - when importing data from LabKey, table is saved to 'samples.tsv.labkey' in same directory as Snakemake sample table
      - allow user to specify environment variables and relative paths in input table and on CLI
        - relative paths in the input table are interpreted with respect to the directory containing the input table
        - relative paths on the CLI are interpreted with respect to the current working directory; this achieves portability for tests but is discouraged in production because the behavior is hard to predict from the user's perspective; consequently, a warning is issued
      - set STAR index size to read length - 1
      - remove `gtf_filtered` and `tr_fasta_filtered` and update Snakefiles and test sample tables accordingly
      - rename some MultiQC report-related parameters and update Snakefiles and test config files accordingly
      - add logging
      - add docstrings to module and all functions
      - add typing definitions to all functions
      - restructure and comment code to improve readability
      - linters `flake8` and `mypy` pass
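The path handling described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `resolve_path`, `resolve_table_path` and `resolve_cli_path` are hypothetical, and it assumes that the working-directory rule applies to paths given on the CLI.

```python
import os
import warnings
from pathlib import Path


def resolve_path(value: str, base_dir: Path, warn: bool = False) -> Path:
    """Expand environment variables, then anchor relative paths at base_dir.

    Hypothetical helper; not taken from the script in the commit.
    """
    expanded = Path(os.path.expandvars(value))
    if expanded.is_absolute():
        return expanded
    if warn:
        # Relative CLI paths depend on where the script is started from.
        warnings.warn(f"relative path '{value}' resolved against '{base_dir}'")
    return base_dir / expanded


def resolve_table_path(value: str, table: Path) -> Path:
    """Paths read from the input table are relative to the table's directory."""
    return resolve_path(value, table.parent)


def resolve_cli_path(value: str) -> Path:
    """Paths given on the CLI are relative to the current working directory."""
    return resolve_path(value, Path.cwd(), warn=True)
```

With this sketch, a table at `/proj/in/samples.tsv` that lists `reads/a.fq` would resolve to `/proj/in/reads/a.fq`, while the same string passed on the CLI would resolve against the current working directory and emit a warning.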
    • Major refactoring · 6cf28511
      BIOPZ-Katsantoni Maria authored and Alex Kanitz committed
      * Sequencing mode-related changes:
        * allowed sequencing modes in Snakemake input table changed from `paired_end` and `single_end` to `pe` and `se`, respectively
        * remove sequencing mode from output paths for each rule
        * corresponding wildcards removed entirely from all rules that do not depend on sequencing mode (currently all rules that are defined in the main `Snakefile` in the project root directory)
        * where absolutely necessary, sequencing mode is added as part of output file or directory instead
        * remove dependency on sequencing mode from the `FastQC` rule; it now runs separately for each strand
      * Changes related to MultiQC and output file/directory structure
        * moving and renaming outputs for MultiQC is no longer required
        * code to create MultiQC custom config externalized into script `scripts/rhea_multiqc_config.py`
        * add MultiQC output files with deterministic output to md5 sum checks performed during execution of `tests/test_integration_workflow/test.{local,slurm}.sh`
        * output filenames for each rule now follow this general structure: `samples/{sample_name}/{rule}/{output_file}`
        * change log directory structure to match results directory structure
      * Miscellaneous changes
        * consistent, PEP8-compliant formatting in most parts, including Snakemake files, where allowed
        * remove rule `extract_decoys_salmon`; equivalent file `chrName.txt` produced by `star_index` is used instead
        * add rule `start` which copies sample data to the results directory and enforces uniform naming
        * refactoring of ALFA rules and modification of the CI/CD test to ensure compatibility
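The output layout `samples/{sample_name}/{rule}/{output_file}` and the matching log layout could be expressed along these lines. This is only a sketch: the helper names, the root directories `results` and `logs`, and the `LEGACY_MODES` translation table are assumptions made for illustration and do not come from the Snakefiles.

```python
from pathlib import PurePosixPath

# Old sequencing mode labels and their replacements, as named in the commit message.
LEGACY_MODES = {"paired_end": "pe", "single_end": "se"}


def normalize_mode(mode: str) -> str:
    """Translate an old mode label to the new one; pass new labels through."""
    mode = LEGACY_MODES.get(mode, mode)
    if mode not in ("pe", "se"):
        raise ValueError(f"unknown sequencing mode: {mode}")
    return mode


def output_path(sample_name: str, rule: str, output_file: str,
                root: str = "results") -> PurePosixPath:
    """Build samples/{sample_name}/{rule}/{output_file} below an assumed root."""
    return PurePosixPath(root) / "samples" / sample_name / rule / output_file


def log_path(sample_name: str, rule: str, log_file: str,
             root: str = "logs") -> PurePosixPath:
    """Mirror the results layout for log files (the root name is an assumption)."""
    return output_path(sample_name, rule, log_file, root=root)
```

Keeping both layouts behind one function means that a change to the results structure carries over to the logs without a second edit.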
  3. Feb 18, 2020
    • run tests in verbose mode · 0d95577e
      Alex Kanitz authored
      - trap call functionalized through cleanup() function
      - function added to all test scripts
      - function prints out exit status of last command before trap
      - flag `--verbose` added to Snakemake calls in all test scripts
      - script tests renamed to follow naming convention 'test_script_<script_name>_<script_run_mode>'
  4. Feb 15, 2020
    • get Snakemake input from LabKey API · eea0206f
      BIOPZ-Katsantoni Maria authored and Alex Kanitz committed
      - add script that prepares Snakemake input files 'samples.tsv' and 'config.yaml' from LabKey table
      - script either connects to API directly (with '--remote' and related options) or processes a tab-separated LabKey dump file
      - add tests for both use cases
      - common input files for tests now in 'tests/input_files'
      - update all other tests to account for new file locations
      - update documentation
  5. Feb 14, 2020
  6. Feb 07, 2020
  7. Feb 04, 2020
    • add documentation · 1ef8b6af
      Alex Kanitz authored
      `README.md` file describes
      - aim and background of the project (including the workflow DAG representation)
      - how to install requirements (including setting up a `conda` environment for the project)
      - how to execute the workflow integration test
      - how to run the workflow on your own samples (including how to auto-generate required params from LabKey metadata)
      
      Additional minor changes:
      - minor changes in various test and related files, including updates of paths
      - root directory now includes subdirectory `runs/` for a user's workflow runs (contents not version-controlled)
  8. Feb 03, 2020
    • generate Snakemake inputs from LabKey data table · cd541afe
      BIOPZ-Katsantoni Maria authored and Alex Kanitz committed
      Adds script `scripts/labkey_to_snakemake.py` which
      - maps LabKey table fields to Snakemake parameters
      - assembles required parameters from the table data
      - infers required parameters from the input data
      - produces files `config.yaml` and `samples.tsv` required by the Snakemake pipeline
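The mapping step listed above might look roughly like the following. This is a hedged sketch rather than the content of `scripts/labkey_to_snakemake.py`: the LabKey column names, the Snakemake column names, and the function `labkey_to_samples` are all hypothetical. The inferred `index_size` column borrows the "read length - 1" rule that a later commit message in this log states for the STAR index size. Only `samples.tsv` is covered; `config.yaml` is left out.

```python
import csv
import io
from typing import TextIO

# Hypothetical mapping from LabKey column names to Snakemake column names.
FIELD_MAP = {
    "Sample Name": "sample",
    "Path to FASTQ file": "fq1",
    "Read length": "read_length",
}


def labkey_to_samples(labkey_tsv: TextIO, samples_tsv: TextIO) -> None:
    """Convert a tab-separated LabKey dump into a Snakemake sample table."""
    reader = csv.DictReader(labkey_tsv, delimiter="\t")
    fieldnames = list(FIELD_MAP.values()) + ["index_size"]
    writer = csv.DictWriter(
        samples_tsv, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Map: rename the LabKey fields to the Snakemake parameter names.
        out = {new: row[old] for old, new in FIELD_MAP.items()}
        # Infer: derive a parameter that is not present in the table itself.
        out["index_size"] = int(out["read_length"]) - 1
        writer.writerow(out)


# Usage with in-memory files standing in for the dump and the output table.
source = io.StringIO("Sample Name\tPath to FASTQ file\tRead length\nA\t/d/a.fq\t101\n")
target = io.StringIO()
labkey_to_samples(source, target)
```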
      
      A self-contained integration test for the script is located at `tests/test_scripts_labkey_to_snakemake` (execute script `test.sh`) and was added to the CI/CD pipeline.
      
      Note that intervening changes to the `master` branch were merged into this branch to avoid conflicts during merging.
      
      Closes #39