Skip to content
Snippets Groups Projects

wrapper for zarp execution

Closed BIOPZ-Katsantoni Maria requested to merge snakemake_call_wrapper into dev

wrapper in the prepare_inputs.py for executing zarp additional options added: --execution_env (allowed values local or slurm or none for now) --snakefile_path path to the snake file --bindings paths to be included in the singularity --cluster_json path to the cluster.json (used in the slurm run)

Fixed also a test_zarp_wrapper test (this needs to be refactored and be merged with existing tests) Fixes #130 (closed)

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
518 '--cores=4',
519 '--printshellcmds',
520 '--rerun-incomplete',
521 '--use-singularity',
522 f'--singularity-args=--bind {bindings}',
523 '--verbose']
524 subprocess.check_call(
525 command,
526 stdout=sys.stdout,
527 stderr=sys.stderr)
528 except subprocess.CalledProcessError:
529 pass
530 except OSError:
531 pass
532
533 elif execution_env == "slurm":
  • Would be better to define a basic call with all the list items (CLI options and values) that are common to both/all execution environments and then add insert only the specific ones into that list depending on which execution environment the user wants.

  • Please register or sign in to reply
  • Alex Kanitz
  • 499 the environment chosen by the user if no environment is
    500 chosen zarp workflow is not triggered
    501
    502 :param execution_env: environment where zarp will be executed;
    503 for now support only local and slurm
    504
    505 : config: config file to be executed
    506 : samples: samples table for which zarp is triggered
    507
    508 :returns: tuple of ALFA strand name suffixes for two coverage tracks of a
    509 paired-end sequencing library
    510 """
    511 report_path = os.path.join(result_dir, "snakemake_report.html")
    512 if execution_env == "local":
    513 try:
    514 command = [
  • Alex Kanitz
  • 509 paired-end sequencing library
    510 """
    511 report_path = os.path.join(result_dir, "snakemake_report.html")
    512 if execution_env == "local":
    513 try:
    514 command = [
    515 'snakemake',
    516 f'--snakefile={snakefile_path}',
    517 f'--configfile={config}',
    518 '--cores=4',
    519 '--printshellcmds',
    520 '--rerun-incomplete',
    521 '--use-singularity',
    522 f'--singularity-args=--bind {bindings}',
    523 '--verbose']
    524 subprocess.check_call(
  • Alex Kanitz
  • 513 try:
    514 command = [
    515 'snakemake',
    516 f'--snakefile={snakefile_path}',
    517 f'--configfile={config}',
    518 '--cores=4',
    519 '--printshellcmds',
    520 '--rerun-incomplete',
    521 '--use-singularity',
    522 f'--singularity-args=--bind {bindings}',
    523 '--verbose']
    524 subprocess.check_call(
    525 command,
    526 stdout=sys.stdout,
    527 stderr=sys.stderr)
    528 except subprocess.CalledProcessError:
    • Catching an error and ignoring it is dangerous, catching it and raising it is pointless. I think you should at least log the errors, otherwise the user will think everything's alright. Better is to re-raise the error with some custom error message. You could use the raise from pattern (introduced somewhere in Python 3):

      try:
          [code]
      except Exception as e:
          raise WhateverExceptionYouLike("Something went wrong and Snakemake wasn't properly executed. See above for a more detailed error message") from e

      Alternatively you could just not use try - catch so that the error is automatically raised (program will stop with an error traceback). Or if you don't want to bother the user with a traceback, you could just use the above code without from e. But note that the user would not have any chance to find out what exactly went wrong, so that's not so very nice.

      If you tell me what behavior you would like if something goes wrong there, perhaps I could give you better advice.

      Edited by Alex Kanitz
    • Please register or sign in to reply
  • Alex Kanitz
  • 562 pass
    563 try:
    564 report_command = [
    565 'snakemake',
    566 f'--snakefile={snakefile_path}',
    567 f'--configfile={config}',
    568 f'--report={report_path}']
    569 subprocess.check_call(
    570 report_command,
    571 stdout=sys.stdout,
    572 stderr=sys.stderr)
    573 except subprocess.CalledProcessError:
    574 pass
    575 except OSError:
    576 pass
    577 return
  • Alex Kanitz
  • 493 result_dir: str,
    494 cluster_json: str,
    495 snakefile_path: str,
    496 bindings):
    497 """
    498 trigger zarp snakemake workflow execution, according to
    499 the environment chosen by the user if no environment is
    500 chosen zarp workflow is not triggered
    501
    502 :param execution_env: environment where zarp will be executed;
    503 for now support only local and slurm
    504
    505 : config: config file to be executed
    506 : samples: samples table for which zarp is triggered
    507
    508 :returns: tuple of ALFA strand name suffixes for two coverage tracks of a
  • Alex Kanitz
  • 451 487 return os.path.normpath(path)
    452 488
    453 489
    490 def run_zarp(
    491 execution_env: str,
    492 config: str,
    493 result_dir: str,
    494 cluster_json: str,
    495 snakefile_path: str,
    496 bindings):
  • Alex Kanitz
  • 491 execution_env: str,
    492 config: str,
    493 result_dir: str,
    494 cluster_json: str,
    495 snakefile_path: str,
    496 bindings):
    497 """
    498 trigger zarp snakemake workflow execution, according to
    499 the environment chosen by the user if no environment is
    500 chosen zarp workflow is not triggered
    501
    502 :param execution_env: environment where zarp will be executed;
    503 for now support only local and slurm
    504
    505 : config: config file to be executed
    506 : samples: samples table for which zarp is triggered
    • I don't see this in the function signature. Also, all except config and execution_env are missing. For defining parameters, return values, raised errors etc., I would recommend one of three styles (just learned about that when I set up the API docs for one of my projects):

      Picking any of these will allow parsers to automatically create docs later on. I decided for going with the Google format, as it's most readable to me.

      Also, have a look at the general Python conventions on docstrings, defined in PEP 257.

      Note that docstrings are often the only help for users who want to do a bit more than just follow the one or two example use cases that are usually described in the docs. It's also the second best source of truth, apart from looking at the code itself, and using automatic building of docs can check, to some extent, that what's written in the docstring matches with the class/method/function definition, so there's some way to make sure that docstrings are up-to-date (at least better than just regular docs where you can write anything and never touch it again).

    • Please register or sign in to reply
  • Alex Kanitz
  • Alex Kanitz
  • 55 56 metavar="STR",
    56 57 )
    57 58
    59 zarp = parser.add_argument_group("ZARP execution")
    60 zarp.add_argument(
    61 "--execution_env",
    62 choices=['local', 'slurm', None],
    63 default=None,
    64 help="zarp execution wrapper",
  • Alex Kanitz
  • 70 default=None,
    71 help="slurm specific resources config for the zarp rules",
    72 metavar="STR",
    73 )
    74 zarp.add_argument(
    75 "--snakefile_path",
    76 type=str,
    77 default=None,
    78 help="path of the snakefile to be executed",
    79 metavar="STR",
    80 )
    81 zarp.add_argument(
    82 "--bindings",
    83 type=str,
    84 default=None,
    85 help="bindings for singularity",
    • Again, would be good to describe this in a little more detail: what is expected here. Bindings is a really general term in programming. Users who are new to Singularity (and Snakemake on Singularity) will not have a clue what this refers to.

    • Please register or sign in to reply
  • Alex Kanitz
  • 55 56 metavar="STR",
    56 57 )
    57 58
    59 zarp = parser.add_argument_group("ZARP execution")
    60 zarp.add_argument(
    61 "--execution_env",
    62 choices=['local', 'slurm', None],
    63 default=None,
    64 help="zarp execution wrapper",
    65 metavar="STR",
    66 )
    67 zarp.add_argument(
    68 "--cluster_json",
    69 type=str,
    70 default=None,
    71 help="slurm specific resources config for the zarp rules",
    • You print the help when the user specifies slurm as an execution environment but does not provide a cluster_json - but the help says nothing about it being required in that case.

    • Please register or sign in to reply
  • Alex Kanitz
  • 55 56 metavar="STR",
    56 57 )
    57 58
    59 zarp = parser.add_argument_group("ZARP execution")
    60 zarp.add_argument(
    61 "--execution_env",
  • Alex Kanitz
  • Alex Kanitz
  • 10 rm -rf results
    11 cd $user_dir
    12 echo "Exit status: $rc"
    13 }
    14 trap cleanup EXIT
    15
    16 # Set up test environment
    17 set -eo pipefail # ensures that script exits at first command that exits with non-zero status
    18 set -u # ensures that script exits when unset variables are used
    19 set -x # facilitates debugging by printing out executed commands
    20 user_dir=$PWD
    21 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
    22 cd $script_dir/
    23
    24 # Run tests
    25 python "../../scripts/prepare_inputs.py" \
    • I just recently renamed it from labkey_to_snakemake.py to prepare_inputs.py because LabKey wasn't the only use case. Now preparing inputs isn't either because the script can also execute Snakemake. Why not use the chance to already rename it to zarp.py? :upside_down:

    • Please register or sign in to reply
  • I reeeeeally think we need to start writing unit tests for the different functions. Do you want to be the guinea pig and write one for the new function we implemented? In any case, I'll create an issue for writing the rest of them (I suppose you and me can split up as we were the ones working most on this script). Let's use pytest if you don't mind. Although it's easy to mix different test suites.

  • mentioned in issue #136 (closed)

  • We will use the https://github.com/zavolanlab/zarp-cli from now on. We will not merge this.

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading