  1. Jun 12, 2020
  2. Jun 10, 2020
  3. Apr 27, 2020
    • Refactor LabKey to Snakemake script · 556f1e12
      Alex Kanitz authored
      - clean up command line interface
        - improve descriptions
        - add consistent structure
        - remove or merge superfluous CLI arguments
        - set defaults
        - update test calls
        - update docs
        - when importing data from LabKey, table is saved to 'samples.tsv.labkey' in same directory as Snakemake sample table
      - allow user to specify environment variables and relative paths in input table and on CLI
        - relative paths in the input table are interpreted with respect to the directory containing the input table
        - relative paths on the CLI are interpreted with respect to the current working directory; this makes the tests portable but is discouraged in production because the behavior is hard to predict from the user's perspective; consequently, a warning is issued
      - set STAR index size to read length - 1
      - remove `gtf_filtered` and `tr_fasta_filtered` and update Snakefiles and test sample tables accordingly
      - rename some MultiQC report-related parameters and update Snakefiles and test config files accordingly
      - add logging
      - add docstrings to module and all functions
      - add typing definitions to all functions
      - restructure and comment code to improve readability
      - linters `flake8` and `mypy` pass
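The path-handling rules in this commit can be sketched as follows. This is a minimal illustration and not the code from the refactored script: the function names `resolve_path`, `resolve_table_path` and `resolve_cli_path` are hypothetical, and the working-directory rule is assumed to apply to relative paths given on the CLI.

```python
import os
import warnings
from pathlib import Path


def resolve_path(value: str, anchor_dir: Path) -> Path:
    """Expand environment variables, then resolve relative paths against anchor_dir."""
    expanded = Path(os.path.expandvars(value))
    if expanded.is_absolute():
        return expanded
    return (anchor_dir / expanded).resolve()


def resolve_table_path(value: str, table_file: Path) -> Path:
    """Paths read from the input table are relative to the table's directory."""
    return resolve_path(value, Path(table_file).resolve().parent)


def resolve_cli_path(value: str) -> Path:
    """Paths given on the CLI are relative to the current working directory."""
    if not Path(os.path.expandvars(value)).is_absolute():
        # Assumed wording; the commit only states that a warning is raised.
        warnings.warn(
            f"relative path '{value}' is resolved against the current "
            "working directory"
        )
    return resolve_path(value, Path.cwd())
```

Keeping the anchor directory as an explicit argument makes the difference between the two rules visible at the call site.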
    • Major refactoring · 6cf28511
      BIOPZ-Katsantoni Maria authored and Alex Kanitz committed
      * Sequencing mode-related changes:
        * allowed sequencing modes in Snakemake input table changed from `paired_end` and `single_end` to `pe` and `se`, respectively
        * remove sequencing mode from output paths for each rule
        * corresponding wildcards removed entirely from all rules that do not depend on sequencing mode (currently all rules that are defined in the main `Snakefile` in the project root directory)
        * where absolutely necessary, sequencing mode is added as part of the output file or directory name instead
        * remove dependency on sequencing mode from the `FastQC` rule; it now runs separately for each strand
      * Changes related to MultiQC and output file/directory structure
        * moving and renaming outputs for MultiQC is no longer required
        * code to create MultiQC custom config externalized into script `scripts/rhea_multiqc_config.py`
        * add MultiQC output files with deterministic output to md5 sum checks performed during execution of `tests/test_integration_workflow/test.{local,slurm}.sh`
        * output filenames for each rule now follow this general structure: `samples/{sample_name}/{rule}/{output_file}`
        * log directory structure changed to match results directory structure
      * Miscellaneous changes
        * consistent, PEP8-compliant formatting in most parts, including Snakemake files, where allowed
        * remove rule `extract_decoys_salmon`; equivalent file `chrName.txt` produced by `star_index` is used instead
        * add rule `start` which copies sample data to the results directory and enforces uniform naming
        * refactoring of ALFA rules and modification of the CI/CD test to ensure compatibility
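The renamed sequencing modes and the new output layout can be illustrated with a short sketch. The helper names `normalize_mode` and `output_path`, as well as the sample, rule and file names in the usage note, are hypothetical; mapping old labels to new ones is an assumption made for illustration, since the commit only states which values are now allowed.

```python
from pathlib import Path

# Old -> new sequencing mode labels, as named in the commit message.
MODE_ALIASES = {"paired_end": "pe", "single_end": "se"}


def normalize_mode(mode: str) -> str:
    """Map an old label to its short form and reject anything else."""
    short = MODE_ALIASES.get(mode, mode)
    if short not in MODE_ALIASES.values():
        raise ValueError(f"unknown sequencing mode: {mode!r}")
    return short


def output_path(sample_name: str, rule: str, output_file: str) -> Path:
    """Build a path of the form samples/{sample_name}/{rule}/{output_file}."""
    return Path("samples") / sample_name / rule / output_file
```

For example, `output_path("sample_A", "fastqc", "report.html")` yields `samples/sample_A/fastqc/report.html` on POSIX systems.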
    • Add rules for bigWig creation · 907082c3
      CJHerrmann authored and Alex Kanitz committed
  4. Mar 25, 2020
  5. Mar 24, 2020
  6. Mar 22, 2020
  7. Mar 21, 2020
  8. Mar 20, 2020
  9. Mar 19, 2020
  10. Mar 12, 2020
  11. Mar 06, 2020
  12. Feb 24, 2020
  13. Feb 21, 2020
  14. Feb 20, 2020
    • create log directories in Snakefile · 5e1ec85e
      Alex Kanitz authored
      - log and, if workflow is executed on cluster, cluster log directories are explicitly created in `Snakefile`
      - location of main log directory can be configured in `config.yaml` (field `log_dir`, previously: `local_log`; requires change in script `labkey_to_snakemake.py` as well as subworkflows as field name is hard-coded there)
      - location of cluster log directory can be configured in `cluster.json` (in field `__default__` -> `out`)
      - `config.yaml` and `cluster.json` in `tests/input_files` are set such that a directory `logs/` is created in the directory where Snakemake is run (i.e., the directory of each test); cluster logs are stored in a subdirectory `logs/cluster`
      - removes instructions to explicitly create log directories from docs and all test scripts
      - cleans up main `Snakefile` (apart from Snakemake-specific syntax, now passes `flake8` linter test)
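A minimal sketch of the directory creation described in this commit is given below. It is not the code from the `Snakefile`: the function name `create_log_dirs` and the `on_cluster` flag are hypothetical, and the value of `__default__` -> `out` is assumed to be a directory path.

```python
import os
from typing import Any, Dict, List


def create_log_dirs(
    config: Dict[str, Any],
    cluster_config: Dict[str, Any],
    on_cluster: bool,
) -> List[str]:
    """Create the main log directory and, on a cluster, the cluster log directory."""
    # Field `log_dir` from config.yaml (previously `local_log`).
    dirs = [config["log_dir"]]
    if on_cluster:
        # Field `__default__` -> `out` from cluster.json.
        dirs.append(cluster_config["__default__"]["out"])
    for directory in dirs:
        # exist_ok avoids a failure when the workflow is rerun.
        os.makedirs(directory, exist_ok=True)
    return dirs
```

With the test settings described above, this would correspond to `logs/` as the main log directory and `logs/cluster` as the cluster log directory.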
  15. Feb 17, 2020
    • add TIN score calculation · c538fe8b
      BIOPZ-Bak Maciej authored and Alex Kanitz committed
      - add rule for input preparation (GTF to BED12)
      - add rule for TIN score calculation
      - update rule graph and DAG image
      - update Slurm cluster config
  16. Feb 15, 2020
    • get Snakemake input from LabKey API · eea0206f
      BIOPZ-Katsantoni Maria authored and Alex Kanitz committed
      - add script that prepares Snakemake input files 'samples.tsv' and 'config.yaml' from LabKey table
      - script either connects to API directly (with '--remote' and related options) or processes a tab-separated LabKey dump file
      - add tests for both use cases
      - common input files for tests now in 'tests/input_files'
      - update all other tests to account for new file locations
      - update documentation
  17. Feb 14, 2020
  18. Feb 08, 2020
  19. Feb 07, 2020
    • fix various small issues · 17818f4a
      Alex Kanitz authored
      - remove log files and add '.snakemake' directories to '.gitignore'
      - update wrong link in 'README.md'
      - delete superfluous script documentation 'scripts/labkey_api.md'
      - add Snakemake-specific file extension '.smk' to subworkflows
      - remove non-deterministic workflow output from md5 sums
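The reason for dropping non-deterministic output from the md5 sums can be illustrated with a small checksum comparison. The repository's tests are shell scripts, so this Python sketch only mirrors the idea; the names `md5_of` and `mismatches` and the shape of the `expected` mapping are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatches(expected: Dict[str, str], root: Path) -> List[str]:
    """Return the relative paths whose current MD5 differs from the expected one."""
    return [
        rel for rel, checksum in expected.items()
        if md5_of(root / rel) != checksum
    ]
```

A file whose bytes differ between runs would appear in the result of `mismatches` on every run, which is why only deterministic outputs belong in the expected mapping.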
  20. Jan 24, 2020
    • update test input files · 4e99b664
      BIOPZ-Gypas Foivos authored and Alex Kanitz committed
      - replace corrupt input files
      - add script `run_test.sh` in the snakemake directory for running the test locally
      - add script to CI/CD pipeline
      - fix typo in `paired_end.snakefile`
      - set fake adapter `XXXXXXX` in samples.tsv for samples whose adapters should not be trimmed
  21. Dec 20, 2019
  22. Dec 13, 2019
  23. Feb 19, 2019