# Modelling of Spongilla lacustris proteome with functional annotations

Main links:
- [Project in ModelArchive (MA)](https://modelarchive.org/doi/10.5452/ma-coffe-slac) (incl. background on the project itself)
- [Jira-story](https://jira.biozentrum.unibas.ch/browse/SCHWED-5596)

Setup:
- Monomer predictions made with ColabFold (running AlphaFold), without links to sequence databases
- Input files received from the project authors:
  - one tarball for the PDB files
  - one tarball for the JSON files
  - a CSV file with title and description for each protein
  - FASTA file with all sequences (used for sanity checks)
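
A minimal sketch of the kind of sanity check the FASTA file allows: it verifies that every protein listed in the CSV file also has a sequence in the FASTA file. The column name `id`, the function names, and the assumption that the first word of each FASTA header is the protein identifier are hypothetical and not taken from the actual input files.

```python
import csv
import io


def read_fasta(text):
    """Parse FASTA text into {identifier: sequence}.

    Assumption: the identifier is the first word of each header line.
    """
    chunks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            current = words[0] if words else ""
            if current in chunks:
                raise ValueError(f"duplicate FASTA identifier: {current}")
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def missing_ids(csv_text, sequences, id_column="id"):
    """Return identifiers listed in the CSV that have no FASTA sequence.

    The csv module handles quoted multiline fields, which matters here
    because the description column spans several lines.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[id_column] for row in reader if row[id_column] not in sequences]
```

With both files read into strings, an empty list from `missing_ids` would mean that every CSV entry has a matching sequence.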

Special features here:
- The description is a long multiline text that includes output from functional annotation
- Expected to be the first set of models that we convert to ModelCIF and fully import into ModelArchive
- Includes generic code for handling the ColabFold setup based on config.json
- Includes test code for converting ModelCIF into the content displayed in ModelArchive
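
As an illustration of what handling the ColabFold setup based on config.json could look like, the sketch below reads a few settings and turns them into a short protocol description. It is not the code from translate2modelcif.py. The key names (`version`, `model_type`, `msa_mode`, `num_models`, `num_recycles`, `use_templates`) are assumptions about the config.json layout, both function names are hypothetical, and the values in the example are placeholders.

```python
import json


def summarise_colabfold_config(config_text):
    """Collect a few settings from the text of a ColabFold config.json.

    All key names are assumptions; keys that are absent yield None.
    """
    cfg = json.loads(config_text)
    keys = ("version", "model_type", "msa_mode", "num_models", "num_recycles")
    summary = {key: cfg.get(key) for key in keys}
    # Templates are treated as "not used" when the key is absent.
    summary["use_templates"] = bool(cfg.get("use_templates", False))
    return summary


def describe_protocol(summary):
    """Turn the summary into one human-readable sentence."""
    version = summary["version"] or "unknown version"
    templates = "with" if summary["use_templates"] else "without"
    return (
        f"Model generated using ColabFold ({version}), "
        f"model type {summary['model_type']}, "
        f"MSA mode {summary['msa_mode']}, {templates} templates."
    )


# Placeholder values for illustration only, not taken from the project.
example = json.dumps(
    {"version": "1.0.0", "model_type": "example_model", "msa_mode": "example_msa"}
)
print(describe_protocol(summarise_colabfold_config(example)))
```

Reading settings with `dict.get` keeps the sketch tolerant of config files written by different ColabFold versions, where some keys may be missing.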

Content:
- translate2modelcif.py: script that performs the conversion; compatible with the Docker setup from [ma-wilkins-import](https://git.scicore.unibas.ch/schwede/ma-wilkins-import/-/tree/6bbd6fa7ec53e1a0971fba40c96fa971d1022f74) (the script is based on code from there)
- tests folder with
  - custom Docker setup used locally (Mac; with extra libraries for testing) and on the work machine (managed CentOS with an old Docker version)
  - test_modelCIF_MA.py to convert ModelCIF into the content displayed in ModelArchive (requires the gemmi library)
  - test.ipynb and .html for tests performed during development