Maciek mentioned that the titles equal the names of the directories MultiQC gets the results from, so we should change those. This issue came up because of ambiguities in the ALFA plot names (Unique and UniqueMultiple; nobody was sure what they meant), so we have to change these to more intuitive names.
We could have a reformatting option to replace underscores. Make sure the names of the directories are meaningful and understandable in the context of the MultiQC report.
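A minimal sketch of what such a reformatting option could look like, assuming the section title is derived from a plain directory-name string. The helper `prettify_title` and its `replace_underscores` parameter are hypothetical; they are not part of MultiQC or of the existing plugin.

```python
def prettify_title(dirname: str, replace_underscores: bool = True) -> str:
    """Turn a directory name into a report title (hypothetical helper)."""
    if not replace_underscores:
        # Option switched off: propagate the directory name unchanged.
        return dirname
    # Split on underscores, drop empty pieces (from leading, trailing or
    # repeated underscores) and join the rest with single spaces.
    return " ".join(part for part in dirname.split("_") if part)
```

With the option switched off the name is passed through untouched, so the default behaviour of "directory name in, title out" would stay available.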
As I have just checked with @burri0000 :
It seems that the only place in the MultiQC report where we would like to change section names are the subsections of ALFA.
Could someone please take a look at the attached report, generated by running on the test data, and confirm: multiqc_summary.zip @katsanto @herrmchr ?
If that is the case, we just need to change two hard-coded words for output directories within the pipeline: Unique and UniqueMultiple. Please let me know what you would prefer instead and I can quickly fix it, but only after the current Merge Requests are in.
I think titles are fine, just for ALFA they are obscure. They could be changed to "UniqueMappers" and "MultimappersIncluded" or something like that (e.g. if spaces are allowed, then split the two words).
This is not as easy as I expected: the words Unique and UniqueMultiple are hardcoded into STAR when running inputAlignmentsFromBAM. So it seems we put these words as wildcards in the pipeline because we have to adjust to what the tool outputs. Therefore I cannot simply exchange them for others. Fixing this issue would require some more modifications.
Information flows as follows:
STAR contributes specific identifiers to names of its output files.
zarp creates subdirectories (under the per-sample ALFA directory in results) whose names correspond to these specific identifiers.
The ALFA plotting MultiQC plugin (as a general design) parses the names of these subdirectories and uses them as parts of the sub-section names in the final report.
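The last step of this flow can be sketched as follows. This is an illustration only, not the plugin's actual code: the function `section_name`, the "ALFA: " prefix, and the example paths are all assumptions made for the sketch.

```python
from pathlib import PurePosixPath


def section_name(alfa_output: str, prefix: str = "ALFA") -> str:
    """Build a sub-section name from the directory holding an ALFA output file.

    Hypothetical helper: the name of the parent directory is propagated
    unchanged into the section name.
    """
    folder = PurePosixPath(alfa_output).parent.name
    return f"{prefix}: {folder}"


# Illustrative paths only; the real layout of the results directory may differ.
paths = [
    "results/sample1/ALFA/Unique/sample1.ALFA_feature_counts.tsv",
    "results/sample1/ALFA/UniqueMultiple/sample1.ALFA_feature_counts.tsv",
]
names = [section_name(p) for p in paths]
```

Under these assumptions the identifiers coming from STAR, Unique and UniqueMultiple, end up verbatim in the report, which is exactly the behaviour under discussion.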
I guess we need to decide on what to adjust here...
We replace, in the {folder} placeholder, UniqueMultiple with MultimappersIncluded (as UniqueMultiple is confusing) and Unique with UniqueMappers, as @zavolan suggested. That way the changes do not need to be applied in the zarp workflow.
@kanitz @bakma should I make a branch with these changes in the repository, or can somebody else take care of it, as I haven't worked with this repo so far?
I am not sure if I understand you: do you propose to replace a placeholder {folder} with hard-coded directory names?
If so then this is a bad idea - it breaks the generality of the whole plugin and restricts it so that it works only on directories with these specific names, nothing else. Poor plugin that would be.
Hmmm... OK that would work.
But still, in my opinion this should not be a general principle for the plugin (names should be propagated as they are if there is no clear reason to adjust them mid-way).
However, we could add a boolean flag to the plugin (maybe --zarp?) to indicate that we do want to introduce modifications and would like to have output names adjusted specifically so that they fit the report of our zarp pipeline. If the flag is set to True, we can then test: if the name is X, make it Y; if the name is A, make it B.
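A minimal sketch of this conditional renaming, assuming the plugin has the directory name available as a string. The names `ZARP_RENAMES` and `display_name` are hypothetical, and how a --zarp flag would be wired into the MultiQC command line is deliberately left out; the replacement words are the ones suggested earlier in this thread.

```python
# Hypothetical mapping, applied only when the zarp-specific flag is set.
ZARP_RENAMES = {
    "Unique": "UniqueMappers",
    "UniqueMultiple": "MultimappersIncluded",
}


def display_name(folder: str, zarp: bool = False) -> str:
    """Return the name to show in the report for a given directory name."""
    if zarp:
        # Rename known directories; unknown ones fall through unchanged.
        return ZARP_RENAMES.get(folder, folder)
    # Default: whatever comes in as a directory name comes out unchanged.
    return folder
```

For example, `display_name("UniqueMultiple", zarp=True)` gives "MultimappersIncluded", while `display_name("UniqueMultiple")` still gives "UniqueMultiple", so the general behaviour of the plugin is preserved for everyone who does not set the flag.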
@katsanto Could you please create a branch and then a PR here: https://github.com/zavolanlab/multiqc-plugins ? We can move the rest of the discussion there. Once the PR is in, I will release another version of our multiqc-plugins container, and here we will just need to substitute the DockerHub URLs in the Snakefile.
I do not see why we should not change the names for the plugin if they make more sense to be called sth else. The whole idea is that the folder names are not always appropriate as section titles. And in any case, the particular place where I suggested the change is in the alfa module anyway. Wouldn't it be more useful for the user to see MultimappersIncluded instead of UniqueMultiple in the title? Isn't this name also automatic from ALFA? The only issue is if they change the naming convention.
I do not see why we should not change the names for the plugin if they make more sense to be called sth else
Because this is a general plugin for the whole community, not only for us and not only for zarp. You may think that one word sounds better than the other, but someone else on the other side of the globe might disagree. To avoid subjective opinions on "names", the plugin has a clean design: whatever comes in as a directory name comes out as a section name. Hard-coding your selected words to overwrite that logic is a poor software design decision.
The whole idea is that the folder names are not always appropriate as section titles.
It is on the user's end to provide his/her data in such directories that he/she would like to see as titles in the report.
Wouldn't it be more useful for the user to see MultimappersIncluded instead of UniqueMultiple in the title?
Yes it would! That is why in this comment I described a solution very close to what you proposed; it is just that the plugin's parsing mechanism is conditional, so that it behaves "as expected" normally, but you could provide a special flag which turns on the name-overwrite process (and thus gives what you would like to see in the report for zarp).
Isn't this name also automatic from ALFA?
These names are automatically assigned way back in STAR, as I described here. They have just been propagated downstream since then.