Commit 99bc8547 authored by Niels Schlusser
Added license statement to readme

parent c556d243
No related branches found
No related tags found
No related merge requests found
The preprocessing procedure for the ClinVar variants downloads the ClinVar vcf f…
The learning scripts for all three use cases work the same way: a few parameters have to be specified in the middle of each script:

- maximum 5'UTR length in nt,
- length of the CDS region to be considered in nt,
- name of the 5'UTR input column in the input file,
- name of the CDS column in the input file,
- names of the non-sequential feature columns in the input file,
- number of non-sequential features the pretrained model was trained on (usually 5; transfer learning only),
- name of the output column,
- path to the data set,
- path to the directory where the scalers are saved,
- path for saving the trained model,
- path to the pretrained model (transfer learning only).

All these scripts can be run in normal mode (training and testing), with the suffix 'predict' after `python3 <scriptname>` for prediction and scatterplot creation only, or with the suffix 'train' for training only.
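As a minimal sketch of how such a script could organize its user-set parameters and map the optional command-line suffix to a run mode (all names and values below are illustrative, not the repository's actual identifiers):

```python
import sys

# Hypothetical parameter block, mirroring the values the learning
# scripts ask the user to set in the middle of the script:
PARAMS = {
    "max_utr_length_nt": 100,          # maximum 5'UTR length in nt
    "cds_region_length_nt": 99,        # length of the CDS region considered
    "utr_column": "utr_sequence",      # 5'UTR input column name
    "cds_column": "cds_sequence",      # CDS column name
    "feature_columns": ["num_exons"],  # non-sequential feature columns
    "n_pretrained_features": 5,        # transfer learning only
    "output_column": "log_te",
    "dataset_path": "data/dataset.tsv",
    "scaler_dir": "scalers/",
    "model_path": "models/model.h5",
    "pretrained_model_path": "models/pretrained.h5",  # transfer learning only
}

def select_mode(argv):
    """Map the optional command-line suffix to a run mode:
    no suffix -> training and testing, 'train' -> training only,
    'predict' -> prediction and scatterplot creation only."""
    if len(argv) > 1 and argv[1] == "train":
        return "train_only"
    if len(argv) > 1 and argv[1] == "predict":
        return "predict_only"
    return "train_and_test"

if __name__ == "__main__":
    print(select_mode(sys.argv))
```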
The transfer-learning directory also contains a script for making end-to-end predictions and a script for predicting the log TE of the entire ClinVar data set.
For end-to-end prediction, only the input sequences (UTR and CDS) and, if necessary, the number of exons per transcript need to be provided in a tsv file; all other non-sequential features are computed by the script.
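A sketch of what such an input tsv could look like and how it parses (the column names `utr`, `cds`, and `num_exons` are assumptions for illustration, not the script's actual headers):

```python
import csv
import io

# Hypothetical end-to-end prediction input: one transcript per row,
# with UTR and CDS sequences plus the number of exons.
tsv_text = (
    "utr\tcds\tnum_exons\n"
    "ACGTACGT\tATGGCCAAA\t4\n"
    "GGCCTTAA\tATGTTTCCC\t7\n"
)

# Read the tab-separated file into a list of dicts keyed by header.
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

for row in rows:
    print(row["utr"], row["cds"], row["num_exons"])
```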
This code is published under the MIT license.