Modelling of African Swine Fever proteome from USDA

Main links:

Setup:

  • Using AlphaFold for monomer predictions with default CASP14 setup (no PAE, no pTM, templates used and relaxation enabled)
    • 196 models done with the default setup; 1 model (QP509L) done with the AF Colab notebook and a separate GROMACS relaxation step
  • Input received from USDA:
    • PDB files for top ranked relaxed model
    • CSV file with crosslinks (UniProt and NCBI), title, description and original filename

Special features here:

  • Somewhat generic code for AlphaFold modeling step and sequence DBs used (can distinguish full_dbs and reduced_dbs and template search)
  • pLDDT extracted from B-factors (simplest setup, since no other QA scores are available anyway)
  • Model file names do not include the AlphaFold model number (hence that information is in the CSV file)
  • Crosslinks to UniProt and NCBI (with sanity checks on both)
  • Dealing with entries which cover only a subset of the reference sequence (CP2475L.. for UniProt A0A2X0THU5)
  • Special case (QP509L) with GROMACS model relaxation step (pLDDT fetched from separate file)
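
Two of the features above (pLDDT taken from B-factors, and entries covering only part of a reference sequence) can be illustrated with a minimal sketch. This is not taken from translate2modelcif.py: the function names are hypothetical, the code assumes standard fixed-width PDB columns with one pLDDT value per residue read from its CA atom, and it assumes the model sequence is an exact substring of the reference sequence.

```python
def plddt_from_pdb(pdb_text):
    """Return {(chain, residue_number): pLDDT} from the B-factor column.

    Assumption: AlphaFold writes the same pLDDT to every atom of a
    residue, so reading it from the CA atom is sufficient.
    """
    plddt = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-based): atom name 13-16, chain 22,
        # residue number 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            plddt[(chain, resnum)] = float(line[60:66])
    return plddt


def locate_in_reference(model_seq, ref_seq):
    """Return the 1-based inclusive (start, end) of model_seq in ref_seq.

    Returns None if the model sequence is not an exact substring of the
    reference, which could serve as a simple sanity check.
    """
    idx = ref_seq.find(model_seq)
    if idx < 0:
        return None
    return idx + 1, idx + len(model_seq)
```

For a model covering residues 3 to 5 of a seven-residue toy reference, `locate_in_reference("CDE", "ABCDEFG")` returns `(3, 5)`.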

Content:

  • translate2modelcif.py: script to do the conversion, based on the CoFFE-sponge-proteins project (identical Docker setup used)
  • tests folder with
    • test_modelCIF_MA.py to convert ModelCIF to content displayed in ModelArchive (needs gemmi library)
    • test.ipynb and .html for tests performed during development