diff --git a/README.md b/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..7dfeeb0cd03722cd48fb688a4c396f3d0d4da396
--- /dev/null
+++ b/README.md
@@ -0,0 +1,80 @@
+# Clinical and Bacterial Determinants of Unfavorable Tuberculosis Treatment Outcomes: An Observational Study in Georgia
+
+## Supplementary Material
+
+This repository contains the supplementary material for the study entitled
+**"Clinical and bacterial determinants of unfavorable tuberculosis treatment outcomes: an observational study in Georgia"**,
+available as a non-peer-reviewed pre-print in *medRxiv*: [DOI: 10.1101/2025.01.20.25320828v1](https://www.medrxiv.org/content/10.1101/2025.01.20.25320828v1).
+
+The supplementary materials provided here include:
+
+- The complete dataset.
+- A step-by-step description of all analyses conducted using R Markdown documents.
+- Supplementary results.
+
+## Repository Structure
+
+### 1. `data/`
+This folder contains the data used in the study.
+
+- `georgia_all_cases.tsv` – The complete dataset of tuberculosis cases used in the analysis.
+- `toutcomes_all.treefile` – The phylogenetic tree file, inferred with IQ-TREE2, including *M. canettii* as outgroup.
+
+---
+### 2. `analysis/`
+Contains R Markdown scripts and supporting files for the analyses.
+
+#### Key files:
+- `toutcomes_analysis.Rmd` – The main script describing and performing the analysis of the study.
+- `GWAS_analysis.Rmd` – Script analyzing the GWAS output files produced by pyseer.
+- `toutcomes_phylogeny.Rmd` – Script for plotting the phylogenetic tree shown in Figure 1.
+
+#### Subfolder: `GWAS/`
+This subdirectory holds data and results related to GWAS analyses.
+
+**Input files:**
+- `georgia_toutcomes.covariates` – Covariate data used in GWAS models (treatment outcome).
+- `georgia_toutcomes.pheno` – Phenotype data of tuberculosis patients (treatment outcome).
+- `georgia_toutcomes.sublineages` – Data on genetic sublineages used in the study (treatment outcome).
+
+**Results files (in `GWAS/results/`):**
+- `gtoutcomes.pyseer.mixed_effects.results.txt` – Results from the mixed effects model on treatment outcomes.
+- `gtoutcomes.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis results for treatment outcomes.
+- `cavity.pyseer.mixed_effects.results.txt` – Results for the analysis of cavitary disease.
+- `cavity.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for cavitary disease.
+- `infiltration.pyseer.mixed_effects.results.txt` – Results for infiltration disease manifestation.
+- `infiltration.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for infiltration.
+- `disemination.pyseer.mixed_effects.results.txt` – Results for disease dissemination.
+- `disemination.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for dissemination.
+- `Rv1462_nonsyn_90freq_mutations.tsv` – List of samples with fixed mutations in the *Rv1462 (sufD)* gene.
+
+#### Subfolder: `disease_manifestation/`
+Contains pyseer input data files used in GWAS models (disease manifestation).
+
+- `georgia_cavity.pheno` – Data on cavitary disease cases.
+- `georgia_infiltrate.pheno` – Data on infiltration cases.
+- `georgia_dissemination.pheno` – Data on dissemination cases.
+- `georgia_disman.covariates` – Covariate data.
+- `georgia_disman.sublineages` – Sublineage data.
+---
+
+### 3. `results/`
+This folder contains results for the different analyses. For models,
+it contains ROC-AUC and PR-AUC plots, forest plots, and tables with the results of the
+associations. Additionally, it contains:
+
+- `EDA/summary_univariate_table.tsv` – Table with unadjusted odds ratios for all variables.
+- `feature_selection/` – Results of feature selection.
+- `phylogeny/` – The phylogeny shown in Figure 1.
+- `imputed_toutcomes_all.tsv` – Final dataset with **imputed values** for TTP and BMI used for statistical analysis.
+---
+
+## Citation
+
+If you use this repository, please cite our pre-print, available in *medRxiv*: [DOI: 10.1101/2025.01.20.25320828v1](https://www.medrxiv.org/content/10.1101/2025.01.20.25320828v1)
+
+---
+
+For any questions or issues related to this repository, please contact [Galo A. Goig](mailto:galo.goig@swisstph.ch).
+
+---