Create scripts for merging and plotting TIN scores

added Doing label

changed the description

Sounds good. I gave you "developer" status on the TIN score calculation repo, so you can add your scripts, add tests, and update the Dockerfile and CI accordingly.

@kanitz Please take a look at the structure of that repo. I am about to add 2 scripts under src and 3 TSV files under tests. I will adjust README, CI yaml and contributors.

What about requirements? Why do we have two versions of this file? Is the script in Python 2 or 3 in the end? It matters for me since I guess I will have to write my scripts in the same Python version as the TIN calculation...
We have many Dockerfiles, should I update all three of them?

Hi @bakma, sure, add what you need.

As for requirements: the original script was written in Python2. After linting and refactoring I think it also works in Python3, but I haven't tested it extensively - so no guarantees. I've left both requirements files in, in case somebody is interested in trying/using it in Python3. But as I said, you'd have to try it out on a real library and compare the result to what's expected from Python2. Might be worth it actually, given that Python2 is now deprecated...
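The Python2 vs Python3 comparison suggested above could be sketched roughly as follows. This is only an illustration, not part of the repo: it assumes each run produces a two-column listing of transcript ID and TIN score, and the helper names `read_scores` and `compare_scores` are hypothetical. A small tolerance is used because Python2 prints floats with 12 significant digits by default, while Python3 prints the shortest exact representation, so the text outputs may differ even when the computed values agree.

```python
import math


def read_scores(lines):
    """Parse 'transcript_id<whitespace>score' lines into a dict (assumed layout)."""
    scores = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        try:
            scores[fields[0]] = float(fields[1])
        except ValueError:
            continue  # skip lines whose second column is not a number (e.g. a header)
    return scores


def compare_scores(expected, observed, rel_tol=1e-9):
    """Return (IDs present on only one side, IDs whose scores differ beyond rel_tol)."""
    missing = sorted(set(expected) ^ set(observed))
    differing = sorted(
        tid
        for tid in set(expected) & set(observed)
        if not math.isclose(expected[tid], observed[tid], rel_tol=rel_tol)
    )
    return missing, differing
```

Both runs would be read with `read_scores` (for example from open file handles) and passed to `compare_scores`; two empty lists would indicate that the outputs agree within the tolerance.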

As for Dockerfiles: Nah, just change the one that we're eventually gonna use. That should probably be the "slim" one. And please, try to keep it slim :)

OK, I will come back to this as soon as we update the tool repo with TIN scores.

added 1 deleted label

changed milestone to %v0.1.0 release

Hi Maciek, I've removed the milestone as I don't think having the plots for TIN scores is critical for the first version, which should ideally be out by next Friday (March 20). If by then you have the TIN plots included as well - great. If not, they will be included in one of the next versions. I'll send an email about this later to put a stop to adding issues for v0.1.0 unless they are critical (bugs).

removed milestone

Dear @kanitz @zavolan ,

"If by then you have the TIN plots included as well - great"

But I have included the scripts already. I am waiting for you to merge to master and build new Dockerfiles: https://git.scicore.unibas.ch/zavolan_group/tools/tin_score_calculation/-/merge_requests/6

I am already working on this, and the branch I am developing for MultiQC is already designed to expect an external boxplot, so it makes no sense to throw this away. I am working on these issues in parallel, and it would require even more work from me to revert to a version without TIN boxplots. I am adding the milestone again.

changed milestone to %v0.1.0 release

@bakma: Great then - the point is that from a management point of view it's not a requirement to have this in the milestone. If it is a requirement for you, that's a different story (you can't expect us to always know what your internal dependencies are; you need to manage that yourself). Nobody said it should not be in, it's just not necessary. The important thing is that we will have MultiQC in. If anybody can just keep on adding whatever they think is important to a milestone, it kinda defeats the purpose of milestones. But of course, any issue that is included by then makes the release better.

removed milestone

Finished with 6775b679

closed

reopened

I was testing the pipeline and found that calculated TIN scores are located not in the actual output file, but in the log (see here: /scicore/home/zavolan/boersch/ALS_project/rnaseqpipeline/logs_test/paired_end/MN4_d28_FUS_KO_A). The log file calculate_TIN_scores.log looks like this:

@ 2020-03-14 01:08:37: Get BAM file(s) ...
Total 1 BAM file(s):
        results_test/paired_end/MN4_d28_FUS_KO_A/map_genome/MN4_d28_FUS_KO_A_Aligned.sortedByCoord.out.bam
@ 2020-03-14 01:08:37: Processing results_test/paired_end/MN4_d28_FUS_KO_A/map_genome/MN4_d28_FUS_KO_A_Aligned.sortedByCoord.out.bam
ENST00000341423 32.0899306564
ENST00000341421 54.1742287367
ENST00000262613 83.8899595946
ENST00000642498 70.1699578924
ENST00000341426 76.0188843058
ENST00000618887 72.8451647485
ENST00000642496 73.540602066
ENST00000336949 56.7563470459
ENST00000642491 57.5678189436
ENST00000642492 76.3132712404
ENST00000434344 59.4474547727
....

The actual file with TIN scores is empty, plots are also empty. See files TIN_scores_boxplot.pdf, TIN_scores_boxplot.png and TIN_scores_merged.tsv here: /scicore/home/zavolan/boersch/ALS_project/rnaseqpipeline/results_test

Thanks a lot for the catch!

There was a minor bug in Ralf's script: it used sys.stderr instead of sys.stdout at the end. We did not notice it previously since running the script in the terminal with no stream redirection printed everything nicely to the command line. Then, in this pipeline, we had been redirecting both streams to the same file (&>), which masked the problem in the same way. It was easy to overlook... This is a perfect example of why it is important to keep the streams separate, though :)
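To illustrate the kind of mix-up described above, here is a minimal sketch (not the actual TIN script; the child code and the `run_child` helper are hypothetical). Results are written to sys.stdout and progress messages to sys.stderr, and the two streams are captured separately, so a result accidentally sent to stderr would show up as an empty stdout right away.

```python
import subprocess
import sys

# Hypothetical stand-in for a script that reports progress and prints results.
CHILD = r'''
import sys
sys.stderr.write("Processing sample.bam\n")            # diagnostics go to stderr
sys.stdout.write("ENST00000341423\t32.0899306564\n")   # results go to stdout
'''


def run_child():
    """Run the child process and return (stdout, stderr) as separate strings."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout, proc.stderr


if __name__ == "__main__":
    out, err = run_child()
    print("results:", repr(out))
    print("log:", repr(err))
```

On the command line the equivalent is to redirect the streams to different files (for example `> scores.tsv 2> run.log`, file names chosen here only for illustration) rather than combining them with `&>`.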

I have already corrected it and created a feature branch in the TIN repo of ours. I have created a merge request:
https://git.scicore.unibas.ch/zavolan_group/tools/tin_score_calculation/-/merge_requests/8

We will have to ask @kanitz to merge and re-build the dockerfiles once again, sorry, Alex ;)

This repo will not be affected; however, you might need to rebuild the environment to pull an updated Singularity image. Please let me know if the problem is not fixed.

Closed by: https://git.scicore.unibas.ch/zavolan_group/tools/tin_score_calculation/-/merge_requests/8

closed
