Added cancer-PPI-domains project (SCHWED-5735)

Commit 631f4044 authored 2 years ago by Gerardo Tauriello
--- a/projects/cancer-PPI-domains/README.md
+++ b/projects/cancer-PPI-domains/README.md
+# Modelling of Spongilla lacustris proteome with functional annotations
+
+[Link to project in ModelArchive](https://modelarchive.org/doi/10.5452/ma-t3vr3) (incl. background on project itself)
+
+Setup:
+- Domains of interacting proteins extracted from full-length proteins (sequences from UniProtKB)
+- Models generated from the domain sequences, which can map discontinuously to the full-length sequence
+- Same protocol used as in [model set for core eukaryotic protein complexes](https://www.modelarchive.org/doi/10.5452/ma-bak-cepc)
+  - Paired multiple sequence alignment (MSA) generated for each dimer
+  - Models built with AlphaFold ("model 3" parameters; pTM monomer version) using a 200-residue gap between the two chains, without templates and without model relaxation
+- Input received from the authors:
+  - one zip file with all the PDB files (no B-factor values, residue numbers matching position in UniProtKB sequence)
+  - one zip file with all the extra files (1 FASTA file for the alignment, 1 npz file with pLDDT, PAE and contact probabilities)
+  - a CSV file with description and UniProtKB links for each protein
+
+Special features here:
+- Custom MSA generation with intermediate result in accompanying data
+- PAE and contact probabilities only kept for inter-chain residue-pairs
+- Author-provided residue numbers kept as auth_seq_num
+- Mapping to most recent UniProtKB sequence generated, checked and stored as fasta files (ModelCIF file only has covered range with respect to the originally used sequence)
+
+Content:
+- translate2modelcif.py : script to do conversion; compatible with Docker setup from [ma-wilkins-import](https://git.scicore.unibas.ch/schwede/ma-wilkins-import/-/tree/6bbd6fa7ec53e1a0971fba40c96fa971d1022f74) (and script based on code there)
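
Note on the README above: it states that PAE and contact probabilities are kept only for inter-chain residue pairs. The following is a minimal sketch of that idea, not the code from translate2modelcif.py. It assumes a square residue-by-residue matrix in which chain A occupies the first `len_a` rows and columns and chain B the rest; the function name `inter_chain_blocks` and the toy matrix are hypothetical. Plain nested lists keep the sketch dependency-free; the same slicing would apply to arrays loaded from an npz file.

```python
def inter_chain_blocks(matrix, len_a):
    """Return the two inter-chain blocks of a square per-residue-pair matrix.

    Assumption (not from the README): chain A occupies indices [0, len_a)
    and chain B occupies indices [len_a, L) on both axes.
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("expected a square matrix")
    if not 0 < len_a < size:
        raise ValueError("len_a must leave both chains non-empty")
    # Rows from chain A, columns from chain B.
    a_to_b = [row[len_a:] for row in matrix[:len_a]]
    # Rows from chain B, columns from chain A.
    b_to_a = [row[:len_a] for row in matrix[len_a:]]
    return a_to_b, b_to_a


# Toy 5x5 matrix: chain A has 2 residues, chain B has 3 residues.
toy = [[5 * i + j for j in range(5)] for i in range(5)]
a_to_b, b_to_a = inter_chain_blocks(toy, len_a=2)
```

Both blocks are returned because PAE is not symmetric, so the A-to-B and B-to-A values generally differ; the intra-chain blocks are simply discarded.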
--- a/projects/cancer-PPI-domains/translate2modelcif.py
+++ b/projects/cancer-PPI-domains/translate2modelcif.py