Modelling of African Swine Fever proteome from USDA
Link to project in ModelArchive (incl. background on project itself)
Setup:
- Using AlphaFold for monomer predictions with default CASP14 setup (no PAE, no pTM, templates used and relaxation enabled)
- 196 models done with default setup, 1 model (QP509L) done with AF colab notebook and separate GROMACS relaxation step
- Input from them:
  - PDB files for top ranked relaxed model
  - CSV file with crosslinks (UniProt and NCBI), title, description and original filename
Special features here:
- Somewhat generic code for AlphaFold modeling step and sequence DBs used (can distinguish full_dbs and reduced_dbs and template search)
- pLDDT extracted from b-factors (simplest setup since no other QA scores anyway)
- Model file names did not contain information on AlphaFold model number (hence info in CSV file)
- Crosslinks to UniProt and NCBI (with sanity checks on both)
- Dealing with entries which cover subset of reference sequence (CP2475L.. for UniProt A0A2X0THU5)
- Special case (QP509L) with GROMACS model relaxation step (pLDDT fetched from separate file)
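
Two of the features above (pLDDT taken from b-factors, and entries covering only a subset of the reference sequence) can be illustrated with a minimal sketch. This is not the code from translate2modelcif.py; the function names `plddt_from_pdb` and `locate_in_reference` are hypothetical, and the exact-substring match is an assumption about how a subset could be located. It relies only on the fixed-column PDB format, where AlphaFold writes the per-residue pLDDT into the B-factor field (columns 61-66).

```python
def plddt_from_pdb(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from the B-factor column.

    Only CA atoms of ATOM records are used, since AlphaFold assigns the same
    pLDDT value to every atom of a residue.
    """
    plddt = {}
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16 of the fixed-width PDB record.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]              # column 22: chain ID
            res_num = int(line[22:26])    # columns 23-26: residue number
            plddt[(chain, res_num)] = float(line[60:66])  # columns 61-66
    return plddt


def locate_in_reference(model_seq, ref_seq):
    """Return the 1-based (start, end) range of model_seq within ref_seq.

    Assumes the modelled sequence is an exact, contiguous substring of the
    reference sequence; returns None if it is not found.
    """
    idx = ref_seq.find(model_seq)
    if idx < 0:
        return None
    return idx + 1, idx + len(model_seq)
```

A check of this kind can double as a sanity check on the crosslinks: if `locate_in_reference` returns None, the model sequence does not match the referenced entry as an exact substring.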
Content:
- translate2modelcif.py : script to do conversion based on CoFFE-sponge-proteins project (identical Docker setup used)