# Clinical and Bacterial Determinants of Unfavorable Tuberculosis Treatment Outcomes: An Observational Study in Georgia
## Supplementary Material
This repository contains the supplementary material for the study entitled
**"Clinical and bacterial determinants of unfavorable tuberculosis treatment outcomes: an observational study in Georgia"**
available as a non-peer-reviewed pre-print on *medRxiv* ([DOI: 10.1101/2025.01.20.25320828v1](https://www.medrxiv.org/content/10.1101/2025.01.20.25320828v1)).
The supplementary materials provided here include:
- The complete dataset.
- A step-by-step description of all analyses conducted using R Markdown documents.
- Supplementary results.
## Repository Structure
### 1. `data/`
This folder contains the data used in the study.
- `georgia_all_cases.tsv` – The complete dataset of tuberculosis cases used in the analysis.
- `toutcomes_all.treefile` – The phylogenetic tree file, inferred with IQ-TREE2, including *M. canettii* as outgroup.
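
The two files above can be inspected without the R environment. The following is a minimal sketch in Python (standard library only), assuming the `.tsv` file has a header row and the `.treefile` is a plain Newick string, as IQ-TREE normally writes it. The helper names `read_tsv` and `newick_tip_labels` are hypothetical and are not part of the repository.

```python
import csv
import re


def read_tsv(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def newick_tip_labels(newick_text):
    """Return the tip labels found in a Newick string.

    Tip labels follow "(" or ",". Internal node labels and support
    values follow ")" and are therefore skipped. Quoted labels that
    contain Newick punctuation are not handled by this simple pattern.
    """
    matches = re.findall(r"[(,]\s*([^(),:;]+)", newick_text)
    return [label.strip() for label in matches]
```

For example, `read_tsv("data/georgia_all_cases.tsv")[0].keys()` lists the column names of the dataset, and passing the contents of `data/toutcomes_all.treefile` to `newick_tip_labels` gives the names of the samples in the tree, including the outgroup.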
---
### 2. `analysis/`
Contains R Markdown scripts and supporting files for the analyses.
#### Key files:
- `toutcomes_analysis.Rmd` – The main script describing and performing the analysis of the study.
- `GWAS_analysis.Rmd` – Script containing the analysis of the GWAS output files produced by pyseer.
- `toutcomes_phylogeny.Rmd` – Script for plotting the phylogenetic tree shown in Figure 1.
#### Subfolder: `GWAS/`
This subdirectory holds data and results related to GWAS analyses.
**Input files:**
- `georgia_toutcomes.covariates` – Covariate data used in GWAS models (treatment outcome).
- `georgia_toutcomes.pheno` – Phenotype data of tuberculosis patients (treatment outcome).
- `georgia_toutcomes.sublineages` – Data on genetic sublineages used in the study (treatment outcome).
**Results files (in `GWAS/results/`):**
- `gtoutcomes.pyseer.mixed_effects.results.txt` – Results from mixed effects model on treatment outcomes.
- `gtoutcomes.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis results for treatment outcomes.
- `cavity.pyseer.mixed_effects.results.txt` – Results for the analysis of cavitary disease.
- `cavity.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for cavitary disease.
- `infiltration.pyseer.mixed_effects.results.txt` – Results for infiltration disease manifestation.
- `infiltration.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for infiltration.
- `disemination.pyseer.mixed_effects.results.txt` – Results for disease dissemination.
- `disemination.pyseer.mixed_effects.BURDEN.results.txt` – Burden analysis for dissemination.
- `Rv1462_nonsyn_90freq_mutations.tsv` – List of samples with fixed mutations in the *Rv1462 (sufD)* gene.
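
The results files listed above can be screened outside of `GWAS_analysis.Rmd` with a short script. The sketch below is illustrative only: it assumes the files are tab-separated with a header row and that the p-value column is named `lrt-pvalue`, as in pyseer's usual mixed-model output. The Bonferroni threshold is a generic choice made for this example and may differ from the threshold used in the study. The function names are hypothetical.

```python
import csv


def read_pyseer_results(path):
    """Read a tab-separated pyseer results file into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def significant_variants(rows, alpha=0.05, pvalue_column="lrt-pvalue"):
    """Return rows passing a Bonferroni-corrected threshold, sorted by p-value.

    The column name is an assumption; rows without a numeric p-value
    in that column are ignored and do not count as tests.
    """
    tested = []
    for row in rows:
        try:
            tested.append((float(row[pvalue_column]), row))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable p-value
    if not tested:
        return []
    threshold = alpha / len(tested)  # Bonferroni correction
    hits = [(p, row) for p, row in tested if p < threshold]
    hits.sort(key=lambda item: item[0])
    return [row for _, row in hits]
```

A typical call would be `significant_variants(read_pyseer_results("analysis/GWAS/results/cavity.pyseer.mixed_effects.results.txt"))`, where the path follows the folder layout described in this README.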
#### Subfolder: `disease_manifestation/`
Contains pyseer input data files used in GWAS models (disease manifestation).
- `georgia_cavity.pheno` – Data on cavitary disease cases.
- `georgia_infiltrate.pheno` – Data on infiltration cases.
- `georgia_dissemination.pheno` – Data on dissemination cases.
- `georgia_disman.covariates` – Covariate data.
- `georgia_disman.sublineages` – Sublineage data.
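
As a quick sanity check on the phenotype files above, the number of samples per phenotype value can be tallied. This sketch assumes the layout that pyseer expects for phenotype files by default: tab-separated, one header row, sample names in the first column and the phenotype in the last column, with `NA` marking missing values. The actual files may differ, and `count_phenotypes` is a hypothetical helper.

```python
import csv
from collections import Counter


def count_phenotypes(path):
    """Count samples per phenotype value in a pyseer-style phenotype file.

    Assumes a header row and the phenotype in the last column;
    empty values and "NA" are treated as missing and skipped.
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        for fields in reader:
            if not fields:
                continue  # ignore blank lines
            value = fields[-1].strip()
            if value and value != "NA":
                counts[value] += 1
    return counts
```

Calling `count_phenotypes` on `georgia_cavity.pheno` would then return the number of samples for each phenotype value, for example cases and controls if the phenotype is coded as a binary variable.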
---
### 3. `results/`
This folder contains the results of the different analyses. For the models,
it contains ROC-AUC and PR-AUC plots, forest plots, and tables with the results of the
associations. Additionally, it contains:
- `EDA/summary_univariate_table.tsv` – Table with unadjusted odds ratios for all variables.
- `feature_selection/` – Results of feature selection.
- `phylogeny/` – The phylogeny shown in Figure 1.
- `imputed_toutcomes_all.tsv` – Final dataset with **imputed values** for TTP and BMI used for statistical analysis.
---
## Citation
If you use this repository, please cite our pre-print available in *medRxiv*. [DOI: 10.1101/2025.01.20.25320828v1](https://www.medrxiv.org/content/10.1101/2025.01.20.25320828v1)
---
For any questions or issues related to this repository, please contact [Galo A. Goig](mailto:galo.goig@swisstph.ch).