Commit ffae1181 authored by Studer Gabriel

modelling: add templates, alignments and sequence profiles to benchmark set

parent 9320b710
@@ -2,21 +2,25 @@ Scripts and data to reproduce the homology modelling accuracy benchmark.
The whole benchmark is a three-step process:
1. Benchmark generation
1. Fetch Benchmark Set
2. Modelling with ProMod3 and MODELLER
3. Evaluation
1. Benchmark generation
Fetch Benchmark Set
First you need to download raw modelling evaluation data from cameo3d.org.
However, for the template selection you need access to a SWISS-MODEL instance
registered to CAMEO. We document the steps we performed but suggest skipping
this step and sticking to the data we provide.
We suggest using the already provided benchmark set, because fetching it
requires a running SWISS-MODEL instance, which is not publicly available. The
steps are nevertheless documented here for internal reference.
- download raw modelling evaluation data from CAMEO. Data for the last three
- Download raw modelling evaluation data from CAMEO. Data for the last three
months is available here: https://www.cameo3d.org/sp/3-months/
- adapt the **eval_dir**, **start_date** and **end_date** variables in
fetch_cameo_targets.py and run the script.
Adapt the **eval_dir**, **start_date** and **end_date** variables in
fetch_cameo_targets.py and run the script:
`python fetch_cameo_targets.py`
- Templates, alignments and sequence profiles are fetched from SWISS-MODEL.
Adapt the **sm_project_dir** variable in fetch_templates.py and run the script
using the SWISS-MODEL executable:
`sm fetch_templates.py`
......
>target
MTITAL-PTGLYAEVLSFYGHQMQKLDGRDFAGYAATFTEDGEFRHSPSLPA-----------AHTRAGITAVLEDFHRKFDARKIQRRHWFDHTALSQASDGSITATSYCLVLTVHADVKAPEFGPSCLVHDVLVRGADGELLLRSRHVTHDHVFPA
>pdb_id=3eby, chain=A, assembly_id=1, offset=1 atoms
-----EVAQVAQSAIDDFNAAYGLCLDDDRLEQWPTLFVDDCLYQ-VIARENVDNGLPAAVMYCDSKGMLADRVVALRKAN----HFNRHLIGRAVITGVEGDQVSAEASYVVFQTRNDGE-TRIYNAGKYVDRFDL-SGGTVRLKSRTCIYDTL---
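The alignment files added in this commit are two-record FASTA files: a `>target` row and a template row whose header carries comma-separated `key=value` fields. The following is a minimal sketch of how such a file could be read, not the code used by the benchmark scripts. The function names `read_fasta`, `parse_template_header` and `aligned_identity` are hypothetical, and the sketch does not interpret what the `offset` field or the trailing `atoms` token mean; it only splits them.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_template_header(header):
    """Split e.g. 'pdb_id=3eby, chain=A, assembly_id=1, offset=1 atoms' into a dict.

    Only the first whitespace-separated token of each value is kept, so the
    trailing 'atoms' token is dropped (assumption: it is not part of the value).
    """
    fields = {}
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        fields[key] = value.split()[0] if value else ""
    return fields


def aligned_identity(target, template):
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(a, b) for a, b in zip(target, template) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy input: the first ten columns of the alignment shown above.
    example = (
        ">target\n"
        "MTITAL-PTG\n"
        ">pdb_id=3eby, chain=A, assembly_id=1, offset=1 atoms\n"
        "-----EVAQV\n"
    )
    (t_head, t_seq), (m_head, m_seq) = read_fasta(example)
    print(parse_template_header(m_head))
    print(aligned_identity(t_seq, m_seq))
```

Both rows of an alignment have the same length, so `zip` walks the columns directly; identity is computed only over columns where both rows contain a residue.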
HHsearch 1.5
NAME target
FAM
FILE query_hhblits
COM /scicore/soft/apps/HH-suite/2.0.16-foss-2018b/bin/hhmake -i <104 characters> -o <104 characters>
DATE Sat May 30 06:22:16 2020
LENG 146 match states, 146 columns in multiple alignment
FILT 52 out of 54 sequences passed filter (-id 90 -cov 0 -qid 0 -qsc -20.00 -diff 100)
NEFF 4.1
SEQ
>ss_pred PSIPRED predicted secondary structure
CCCCCCCCCHHHHHHHHHHHHHHHCCCCCHHHHHHHHCCCCEEECCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCEEEEEECEEEEECCCCEEEEEEEEE
EEEECCCCCCCEEEEEEEEEEEEEECCCCCEEEEEEEECCCCCCCC
>ss_conf PSIPRED confidence values
9866677502799999999986531699836677530688245149998864563889999988654433058712555410467763898299999999
9984589764435327899679995559817897888416786579
>Consensus
MtxxxxxxxxyaeVqqfYArQmxxlDxgrxexwAxTFTxDGvFxhxxxxePxxGrxaixaxxrxxxxxxaaxgxxrRHwxxmlxvxxxxdgsvxxxxYal
vvxTxxggxxxxlxxsxxxxDxLVrxxxgxwrvrxRxVxrDxxxxx
>target
MTITALPTGLYAEVLSFYGHQMQKLDGRDFAGYAATFTEDGEFRHSPSLPAAHTRAGITAVLEDFHRKFDARKIQRRHWFDHTALSQASDGSITATSYCL
VLTVHADVKAPEFGPSCLVHDVLVRGADGELLLRSRHVTHDHVFPA
>gi|8926184|gb|AAF81722.1| EncG [Streptomyces maritimus]
ESALTQaPHKPeetaalFAEIQQFYGRHMRAMDEGRVEDWTGDFMPDAVFA-TNARpEPQRGRAEIARNAEAAARQLQEKGILRRHCVTTLELQQASGLT
LLAKTYALITKTVVGGR-SELEFVCTCEDVLVR-NEGRWFIRHRQVFRDDL---
>gi|305861167|gb|ADM72826.1| putative cyclase [Streptomyces aureofaciens]
MCDErghergGRRTTCrrrlYAEVQQFYARQMNLLEGeaADPDAWAETFTEDAVFE-SAGEpEPQTGREDLRETVRAGVNHIAAQGLDFRHWFGMVGVEQ
RPDDTLRTRYYALAMATPQGGP-LKIRGSLLCHDDLVRGEDGGWLVRRRSLDADGR---
>gi|183981579|ref|YP_001849870.1| hypothetical protein MMAR_1563 [Mycobacterium marinum M]gi|183174905|gb|ACC40015.1| conserved hypothetical protein [Mycobacterium marinum M]
IPSA------nrlrlfswprgkalhsclYAEVQQFYARQMRLL--dlRDIGSHAKTFTKDAT-fEH-I-pGaGPVSTRAAIEKALLKWD-SRAGDAIQRR
HWLSMID-lASRSAGVIEASAYLIEINTRPGTK-PAIAQSCVIHDKL-tR-VSGQLLISSRKVLEDG----
>gi|282862818|ref|ZP_06271879.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Streptomyces sp. SirexAA-E]gi|282562504|gb|EFB68045.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Streptomyces sp. SirexAA-E]
MNPNAPapaASARLAEIHQLYHLQSHLIDGGRAAEWAATFTADGSFT-SPSYpAPVTGTAALTAFAERFAADCAADGVVRRHVVSNVALTgDDGAGTVRV
EAYLQIVATPGGGT-PHTERFTTLTDRVVH-DGSGWRIVARVVRRDGA---
>gi|84619199|emb|CAJ42323.1| cyclase [Streptomyces steffisburgensis]
-----MSTAAdhrtfekiYAEVQQFYARHIQLMDEGRAEEAALTFTEDASLLSPPKIaEPIRGRLKLGAGLRKVADELDAEGVRYRRCHTMMSVEPRPDG
QVFVRAYVQVIRTRRGGE-STLHAMCVCEDLLAR-EDGELKVHRRVVTRDDS---
>gi|209863913|gb|ACI88858.1| Aln2 [Streptomyces sp. CM020]
----------YVEVTQFYARQMHRMDGDDFGGFAATFVAGAEFR-LAGGTVLTGPEAIEAGARAAAGRF--DGAQPRHWFDMMTVEEADDGTVSTSYYAT
VTVTSAQGA-VLVEPTCFVRDTLVR-VSGVLRSRSRVIERDDLV--
>gi|336178861|ref|YP_004584236.1| nuclear transport factor 2 [Frankia symbiont of Datisca glomerata]gi|334859841|gb|AEH10315.1| nuclear transport factor 2 [Frankia symbiont of Datisca glomerata]
MTVAHElDAAARQEITELYAQYTHAFDDNSPEDLADLFTDDGIFV-RDDAEPVHGRAALAELVR----GVAARGAGSRHLVSSVVVEP-SATGASGSAYV
QVISIDADTV--RLVVIGRYHDEFAR-SEGRWRFRSRRFT-------
>gi|300788974|ref|YP_003769265.1| hypothetical protein AMED_7146 [Amycolatopsis mediterranei U32]gi|299798488|gb|ADJ48863.1| conserved hypothetical protein [Amycolatopsis mediterranei U32]gi|340530606|gb|AEK45811.1| hypothetical protein RAM_36700 [Amycolatopsis mediterranei S699]
MPTATPgSTAQGgaevvAEVRHAIAAYCQALDDGRVDDLVSLFTPDGVSA-LPGMDPVEGHDALRELYR----GVTAQGGTTRHVVVNTAVSTGDGNQVS
AVSDLVFLSHGENGW--QIALAGRYDDVLRR-HDGHWLFAGRSLT-------
>gi|312196375|ref|YP_004016436.1| aromatic-ring-hydroxylating dioxygenase subunit beta [Frankia sp. EuI1c]gi|311227711|gb|ADP80566.1| aromatic-ring-hydroxylating dioxygenase beta subunit [Frankia sp. EuI1c]
------------mtagevgladqfAIHMTLARYCHRCDDADFEGLVALFTADAVFT-Y-GNRSAHGSAELLAFFR----DTQSRPe-QRgKHLTVNEVYE
PDGdrg-DRVLAASDFVFLRFASGRL--VPAIAGRYHDQFVR-VDGEWRIARREVL-------
#
NULL 3706 5728 4211 4064 4839 3729 4763 4308 4069 3323 5509 4640 4464 4937 4285 4423 3815 3783 6325 4665
HMM A C D E F G H I K L M N P Q R S T V W Y
M->M M->I M->D I->M I->I D->M D->D Neff Neff_I Neff_D
0 * * 0 * 0 * * * *
M 1 * * * 4745 * * * 4827 * * 109 * * * * * * * * * 1
0 * * * * * * 4237 0 0
T 2 4470 4996 5520 * * 4834 * * * * * 5086 3384 * 5167 1643 1346 * * * 2
0 * * * * * * 4237 0 0
I 3 2357 * 3397 3699 * 3843 5510 4322 * * * 4945 4119 5297 5167 3285 2612 3612 * * 3
113 3728 * 1765 503 * * 4237 1143 0
T 4 1898 * 4615 3562 * * 4564 5263 5167 4138 * 5086 4430 4461 5724 3835 2247 * * 4945 4
572 4235 1868 2139 372 * * 4237 1070 0
A 5 1964 * * 2921 * 3737 4178 * * * * * * * 4554 4606 1675 4955 * 4253 5
256 * 2620 * * * 0 3920 0 1717
L 6 4871 * 5117 3480 * 3505 5015 * * 3324 2778 * 3104 2852 4261 * 3320 3724 * * 6
1601 577 * 442 1923 1805 487 3868 2194 2015
P 7 2605 * 2953 * * 4120 * * * * * * 1944 5104 4677 2215 4363 * 4133 * 7
0 * * * * * 0 3964 0 1717
T 8 2161 * 5167 5377 * 4206 4680 * * 4252 * * 4062 * * 4067 1314 4133 * * 8
0 * * * * * 0 3964 0 1717
G 9 1309 * 3626 4023 * 4102 5104 4180 4680 * * * * 4771 * 5011 3344 3236 * * 9
0 * * * * * 0 3964 0 1717
L 10 2753 4677 4821 * * 5193 * * * 3110 * * 1533 4537 3692 4179 4168 4133 * * 10
1360 826 4432 2979 196 0 * 3964 2836 1717
Y 11 * * * * 5147 3615 6012 5759 * 5474 5768 * * * 3892 * * 3319 * 626 11
123 4719 4518 2322 322 * 0 4139 1034 1049
A 12 1025 * * * * 5924 5015 * * 3484 * * * 3814 * 5794 2794 2853 * * 12
0 * * 3585 126 * 0 4030 1063 1191
E 13 2880 * 5142 950 * 4787 * * * 4814 * * * 3043 3470 4837 * * * * 13
0 * * * * 0 * 4125 0 1191
V 14 * * * * * * * 1394 * 5778 * * * * * * * 734 * * 14
0 * * * * * * 4329 0 0
L 15 * * * 5069 * * 2440 * * 3164 * * * 996 3319 * 3783 * * * 15
0 * * * * * * 4329 0 0
S 16 3985 * 4393 4954 * * 2764 * * * 4783 * * 749 5056 4367 * * * * 16
0 * * * * * * 4329 0 0
F 17 5068 * * * 590 * 5731 * * 2472 * * * * * * 3865 4713 * * 17
0 * * * * * * 4329 0 0
Y 18 * * * * * * 5926 4009 * 3748 * * * 5731 * * * * * 272 18
0 * * * * * * 4329 0 0
G 19 370 * * * * 2286 5580 * * * * * * * * * * * * * 19
0 * * * * * * 4329 0 0
H 20 2647 * 5362 * * 5392 3487 * * 5580 * * * 4954 622 * * * * * 20
0 * * * * * * 4329 0 0
Q 21 * * * * * * 2715 * * * * * * 480 * * * * * 2936 21
0 * * * * * * 4329 0 0
M 22 4139 3925 * * * * * 5155 * * 756 * * * * 2777 3514 5403 * * 22
0 * * * * * * 4342 0 0
Q 23 5377 * * * * 4813 1568 * * * * 5536 * 1551 2226 5235 * * * * 23
0 * * * * * * 4342 0 0
K 24 1597 4856 * * 5880 * 3706 * 5620 1170 * * * * 4023 * * * * 6049 24
0 * * * * * * 4342 0 0
L 25 * 4789 * * 4962 * * 2777 * 707 3277 * * 4718 * * * 4960 * * 25
348 * 2222 * * * * 4342 0 0
D 26 * * 40 5199 * * * * * * * * * * * * * * * * 26
0 * * * * * 0 4133 0 1717
G 27 3569 * 1980 2431 * 2107 5081 * * * * * * * * 2346 5763 * * * 27
40 5199 * 1000 1000 0 * 4133 1797 1717
R 28 2964 6179 5302 * * 630 * * * 5266 5403 4962 * * 3639 * * 5343 * * 28
0 * * * * * * 4342 0 0
D 29 5054 * 2107 3472 5874 * 4735 * 4764 * * 5880 * 4898 1078 4962 * * * * 29
0 * * * * * * 4342 0 0
F 30 1194 * * * 2405 4944 * 5303 * 5266 * * 3362 5457 * * 3565 3898 * 5746 30
39 5235 * 1000 1000 * * 4342 1025 0
A 31 2576 * 2008 1071 * 4302 5880 * * * * * * 5457 * 5853 * * * * 31
0 * * * * * * 4342 0 0
G 32 2782 * 1983 1513 * 2796 * 6089 * * * * * * 6057 4511 4889 * * * 32
0 * * * * * * 4342 0 0
Y 33 5155 * * * 3438 * 5303 4960 * 3352 * * * * * * * 4718 908 2706 33
0 * * * * * * 4342 0 0
A 34 301 * * * * * * * * * * * * * * * 5284 2621 * * 34
0 * * * * * * 4342 0 0
A 35 1228 * 2657 3802 * 3298 * * 5303 5155 * * * 4242 3246 5076 * * * * 35
0 * * * * * * 4342 0 0
T 36 * * 4534 * * * * * * 2879 6052 * * * * 5403 354 * * * 36
0 * * * * * * 4342 0 0
F 37 * * * * 56 * * * * * * * * * * * * * * 4718 37
0 * * * * * * 4342 0 0
T 38 4107 * * * * * * * * 4960 5284 * * * * * 311 3683 * * 38
0 * * * * * * 4342 0 0
E 39 2228 * 3468 1399 * 5403 6110 * 5303 * * * 2185 * * 4897 * * * * 39
0 * * * * * * 4342 0 0
D 40 * * 109 5343 * 5302 * * * * * 5457 * * * * * * * * 40
0 * * * * * * 4342 0 0
G 41 1830 * * * * 476 * * * * * * * * * * * * * * 41
0 * * * * * * 4342 0 0
E 42 * * * 2142 * * * 4962 * 5746 * * * * * 3378 3163 1051 4960 * 42
348 * 2222 * * * * 4342 0 0
F 43 4701 * * * 288 * * * * 3206 * * * * * 4873 * * * * 43
0 * * 0 * 0 * 4133 1717 1717
R 44 2370 * 3202 3558 * * 3424 * 5865 4073 * 5910 5377 3697 3365 3665 3726 3757 * * 44
1870 * 461 * * * * 4342 0 0
H 45 * * * * * * 220 * * * * * * * * 2822 * * * * 45
1773 * 500 * * 30 5586 2751 0 3326
S 46 2949 * * 4404 4691 * * * * 2703 * * 3229 3994 4168 1687 5004 5509 * 4562 46
108 * 3793 * * 98 3934 4136 0 1770
P 47 2818 * 4811 6265 * * 6012 5207 * * * 2691 1309 * 5326 3143 3831 * * * 47
374 * 2131 * * 549 1660 4247 0 1208
S 48 2652 * 4050 * * 1819 5934 5516 4905 4921 * * * * * 1615 4421 5688 * * 48
0 * * 0 * 150 3338 4109 1717 1800
L 49 3121 * 6144 5488 4320 2426 4487 5125 * 4771 5063 4767 * 5180 2758 5300 5049 4151 * 2750 49
2129 374 * 0 * * 0 4322 3395 1016
P 50 3092 * 2616 1003 * 5268 * * * * * * 3125 5875 4767 * 5258 * * * 50
27 5768 * 0 * 0 * 4322 1000 1016
A 51 2996 * * * * 5377 * * * * * * 417 * 4631 4789 * 5302 * * 51
0 * * * * * * 4342 0 0
A 52 1832 * 5235 * 6072 * * 3623 * 3111 * * * 3961 * 4129 6024 1539 * * 52
0 * * * * * * 4342 0 0
H 53 4503 * * 3608 * * 3075 4483 * 5755 * * * 5865 1638 4036 3017 2566 * * 53
0 * * * * * * 4342 0 0
T 54 * * * * * 260 * * * * * * * * * 5377 2826 * * * 54
0 * * * * * * 4342 0 0
R 55 * * * * 5457 * 2848 * * * * * 5302 * 601 4294 3819 4976 * * 55
20 * 6193 * * * * 4342 0 0
A 56 1284 * 2351 2652 * * * * 5855 5133 * * 3416 6028 * * 4251 5220 * * 56
0 * * * * * 0 4347 0 1000
G 57 826 * 4069 3211 * 2427 * * 5133 6025 * * * * 5580 5831 * * * * 57
0 * * * * * 0 4347 0 1000
I 58 * * * * * * * 1022 * 1183 5258 * * * 5357 * * 5925 * * 58
0 * * * * * 0 4347 0 1000
T 59 1334 * * 3582 * 5133 * 5759 * 3483 * * * * 2348 * 3216 3656 * * 59
0 * * * * 0 * 4347 0 1000
A 60 829 * 4604 2635 * 5235 * * 4587 * * * * 4859 3354 5840 5880 * * * 60
0 * * * * * * 4342 0 0
V 61 2025 * 5746 2965 2457 2862 * * * 4018 * 5284 * * * 2982 4219 5620 * * 61
0 * * * * * * 4342 0 0
L 62 1427 * * * 3753 * * 5403 * 2062 5266 * * * * 4992 * 2544 * 4017 62
0 * * * * * * 4342 0 0
E 63 3898 * 5853 2698 * * 5865 * 4718 5303 6079 4880 * * 847 4960 * 5280 * 5746 63
265 * 2573 * * * * 4342 0 0
D 64 1972 * 3248 3088 * 4350 * * 2872 5798 * 5807 * 4828 2246 4839 5651 * * * 64
0 * * * * * 0 3820 0 1541
F 65 2126 * * * 1383 3698 5223 * * * * 5747 * * 4078 4704 3739 4874 5028 5164 65
0 * * * * * 0 3820 0 1541
H 66 1762 * 5028 * 5649 * 1954 * 5655 4366 * 4469 4667 5164 5782 4786 4848 3315 * 5262 66
45 * 5028 * * * 0 3820 0 1541
R 67 1490 * 3456 2298 * 4961 5318 * 4273 * * 4177 5045 5598 4161 4791 4559 * * * 67
238 2716 * 0 * 0 * 3772 1376 1639
K 68 2499 * 3499 4525 * 2615 5536 5235 4054 6409 * 5853 5840 4551 1798 4545 * * * * 68
0 * * * * * * 4342 0 0
F 69 3113 3888 * * 2408 * 4317 5536 * 1876 * * 4944 * 5303 * 4258 4018 4718 3747 69
214 * 2858 * * * * 4342 0 0
D 70 913 * 3519 4454 * 5808 * * * * * * * 3814 4026 4824 3776 3691 * * 70
0 * * * * 1853 468 4156 0 1437
A 71 1019 * 3979 2684 * 4085 * * * 5216 * 4854 * * 5690 3430 4805 5714 * * 71
22 6051 * 1585 585 0 * 4248 1000 1298
R 72 2807 * 2015 2985 * 5235 * * 4617 * * * 4173 3406 2637 3938 5836 5457 * * 72
0 * * * * * * 4342 0 0
K 73 5303 * 4251 5874 * 941 * * 4493 * * 5788 2080 * 4404 * 4718 * * * 73
77 4789 5956 0 * * * 4342 1063 0
I 74 3544 * * 2139 * 4960 * 2672 * 3579 * * * 4623 5979 * 4868 1627 * * 74
0 * * 1585 585 0 * 4224 1000 1121
Q 75 4869 * 3486 * * 4962 * 5056 * 5284 * * 4718 1286 3761 6311 3518 2581 * * 75
47 * 4960 * * * * 4342 0 0
R 76 * * * * 4654 * 2897 5156 * 3730 * * 5292 4196 1038 4935 4881 4039 * 5147 76
55 4736 * 0 * 0 * 4232 1063 1049
R 77 * * * * * * 5885 * 4789 4718 * * * * 138 * * * * * 77
0 * * * * * * 4342 0 0
H 78 * * 5956 * * * 65 * * * * * * * 5155 * * * * * 78
0 * * * * * * 4342 0 0
W 79 * 4218 * * * * * * * 3043 5782 * * * * * * 2081 896 4924 79
0 * * * * * * 4342 0 0
F 80 * * * * 1820 * 4218 3381 * 1970 * * 5836 * * * 4789 1953 * * 80
0 * * * * * * 4342 0 0
D 81 6089 * 3802 * * 1664 * * * * * 2932 * * * 2450 2196 3925 * * 81
0 * * * * * * 4342 0 0
H 82 * * * * * * 3487 * * 4944 985 2088 * 5985 * 4962 3788 5836 * * 82
0 * * * * * * 4342 0 0
T 83 6052 * * 4789 5836 * * 3068 * 1224 4226 * 4944 * * 6024 3314 2457 * * 83
0 * * * * * * 4342 0 0
A 84 2188 * 2585 3787 5626 4950 6179 * * 4960 * 5942 * 4568 * 4069 3074 2279 * * 84
348 * 2222 * * * * 4342 0 0
L 85 4173 * * * * * * 3666 * 2609 * * * * 5628 * * 644 * 4582 85
0 * * 330 2289 0 * 4133 1717 1717
S 86 3883 * 3035 1656 5403 * * 6179 * 6079 * * * 4134 2769 3606 3513 4847 4944 * 86
52 5588 6110 0 * * * 4342 1005 0
Q 87 4254 * 4232 3184 * * * * * 6169 * 6017 1244 3543 3961 3367 5063 5102 5221 * 87
87 5242 4952 1447 659 0 * 4346 1048 1000
A 88 2588 6368 4085 6174 3925 3146 * * 6052 4725 * * * 2283 2698 5202 4941 4091 4678 * 88
0 * * * * 0 * 4237 0 1044
S 89 3520 * 1929 4020 * 2763 * * 6110 * * * 2159 * 4429 3273 4237 * * * 89
53 4789 * 1585 585 * * 4342 1063 0
D 90 3605 * 590 4129 * 3204 * * * * * 5746 4360 * * * 5622 * * * 90
0 * * * * 0 * 4219 0 1063
G 91 * * 2876 4194 * 705 6072 * 4960 5284 * 5076 * * * 4611 4256 * * * 91
38 * 5266 * * * * 4342 0 0
S 92 3220 * * 3509 * 4924 * * * * * * * 3665 4763 1729 1725 4261 * * 92
0 * * * * 0 * 4303 0 1025
I 93 2831 * * * * * * 3634 * 1547 * * * * * * * 1341 5266 5985 93
0 * * * * * * 4342 0 0
T 94 5457 * 5755 3441 3824 * 4464 * 4960 3727 * * * 3956 1849 3083 2725 4971 * * 94
0 * * * * * * 4342 0 0
A 95 1448 * * * * 4962 * * * * * * * * * 3210 2393 1722 * * 95
0 * * * * * * 4342 0 0
T 96 3920 * 5740 3432 * 4884 4281 * 5302 5723 * * * * 1651 3279 2286 3908 * * 96
0 * * * * * * 4294 0 0
S 97 1904 3730 * * 3901 5477 * * * 5731 * * * * * 1576 4212 5097 * 2936 97
0 * * * * * * 4294 0 0
Y 98 * * 2586 * * * * * * 6057 * * * 6089 * * * * * 315 98
0 * * * * * * 4294 0 0
C 99 1155 5614 * * 4810 * * * * 2083 * * * * * 5250 4278 2465 * * 99
0 * * * * * * 4294 0 0
L 100 4734 * * * 5197 * * 5258 * 1105 * * * 1899 * 6057 4127 3269 * * 100
0 * * * * * * 4294 0 0
V 101 3758 * * 5258 3375 * * 2041 * 4258 * * 5197 * * * 5395 1129 * * 101
0 * * * * * * 4294 0 0
L 102 5395 * * * 6242 * * 2623 * 1758 4120 * * * * * 3266 1617 * 5838 102
0 * * * * * * 4294 0 0
T 103 2640 * * 4034 * 3580 * * 5302 3885 * 4137 * 5342 2439 3249 3592 2758 * * 103
0 * * * * * * 4294 0 0
V 104 * * * * 4810 * 5106 4361 4979 * * * * * 4734 4983 495 3711 * * 104
0 * * * * * * 4294 0 0
H 105 3774 * 4105 5040 * 3346 4849 * * * * * 1779 * 1776 4534 * 3665 * * 105
0 * * * * * * 4294 0 0
A 106 2716 * * 2750 * 5600 * 4826 5757 5202 * * 2504 3851 2252 4810 * 3657 * 5197 106
0 * * * * * * 4294 0 0
D 107 4601 * 3065 * * 523 * * 4979 * * 5106 * 5318 * 4734 5643 * * * 107
0 * * * * * * 4294 0 0
V 108 4315 * * 3105 * 525 * * * * * * * * 4810 * 4109 4471 * * 108
27 * 5740 * * * * 4294 0 0
K 109 3475 * 3357 2956 * * * * 2896 4789 * * 3158 4415 3422 4118 3813 4963 3319 * 109
3371 * 147 * * * 0 4293 0 1000
A 110 1746 * * 2340 * * * * * 2446 * * * 2681 * * * 2600 * * 110
0 * * * * 294 2439 1816 0 3800
P 111 3526 * * * * * * * * 2724 * * 1052 * * 2363 5346 4041 * * 111
0 * * * * 0 * 3816 0 1541
E 112 2867 * 4377 2304 * * 5600 * 4744 5318 5342 * * 4576 1892 4979 4431 3118 * * 112
0 * * * * * * 4294 0 0
F 113 * * * * 4032 * * 2034 * 1420 * * 4810 * * * 5600 1917 * * 113
0 * * * * * * 4294 0 0
G 114 2902 * 5056 3361 4819 3340 2616 * * 3859 * * * 4734 2537 * 4146 3482 5891 * 114
0 * * * * * * 4294 0 0
P 115 3297 * * * 4501 4341 * 4293 * 2467 5152 * 2252 5258 2166 5923 * 3849 * * 115
0 * * * * * * 4294 0 0
S 116 3674 * * * 4114 * 5197 3702 * 5723 3580 * * * * 1108 4284 2818 * * 116
0 * * * * * * 4294 0 0
C 117 5963 1337 * * 5250 2437 * * * 5501 * * * * * 5923 1987 3532 * * 117
0 * * * * * * 4294 0 0
L 118 * * * * 4330 * * * * 4085 * * 5740 * 2373 4513 1751 1562 * * 118
0 * * * * * * 4294 0 0
V 119 5731 1264 * * * * * 4541 * 3783 * * * * * * * 1819 * 2593 119
0 * * * * * * 4294 0 0
H 120 4162 * 2636 2852 5197 5845 2249 * * * * 5854 * 6044 2503 5643 2795 5893 * * 120
0 * * * * * * 4294 0 0
D 121 * * 0 * * * * * * * * * * * * * * * * * 121
0 * * * * * * 4294 0 0
V 122 * * 4753 2997 * * 3304 6117 5258 4554 * * * 4810 3117 * 3708 1231 * * 122
0 * * * * * * 4294 0 0
L 123 * * * * 3890 * * 5854 * 281 * * * * * * * 3439 * * 123
364 * 2163 * * * * 4294 0 0
V 124 3787 * * * * * * * * 5133 * * * * 3074 * * 358 * * 124
27 * 5751 0 * 0 * 4125 1717 1717
R 125 4340 * 5121 4621 * * 4734 * * * * * 4232 4968 542 5863 * 5084 * 5319 125
3238 * 162 * * * 0 4297 0 1000
G 126 * * 2457 2296 * 703 * * * * * * * * * * * * * * 126
0 * * * * 0 * 2355 0 3755
A 127 3397 * 2725 1755 * 4109 3597 * * 5731 * 5302 * * 6067 4974 3615 2792 * * 127
0 * * * * * * 4294 0 0
D 128 4380 * 1427 3754 * 1531 * * * * * * 4452 * 5757 3868 * 5197 * * 128
0 * * * * * * 4294 0 0
G 129 * * 2520 3356 * 569 * * * * * * 6242 5643 * 5600 * * * * 129
0 * * * * * * 4294 0 0
E 130 3990 * 5643 1868 * 1963 5106 * * * * * * 3798 2141 5845 5923 5318 * * 130
0 * * * * * * 4294 0 0
L 131 * * * * * * * 4638 * 1476 * * * * * * * 5097 808 * 131
0 * * * * * * 4294 0 0
L 132 5891 * * * 5302 * 5731 * 3137 1648 5757 * * 5883 1087 * * * * * 132
0 * * * * * * 4294 0 0
L 133 * * * * 3433 * * 2306 * 3487 * 4922 * * * 5318 2911 1235 * * 133
0 * * * * * * 4294 0 0
R 134 2490 5197 * 4979 * 5395 4551 * 3488 4939 * * * * 1000 4206 * 5600 * * 134
0 * * * * * * 4294 0 0
S 135 4866 * 4993 3580 5342 5106 2268 * * * * * * * 1920 1621 * * * * 135
0 * * * * * * 4294 0 0
R 136 * * * * * * * * * * * * * * 0 * * * * * 136
0 * * * * * * 4294 0 0
H 137 5854 * 3765 3371 * * 5614 * 3500 6337 * * * 3464 2884 4290 2941 2283 5740 4063 137
0 * * * * * * 4294 0 0
V 138 * * * * 4974 * * 3310 * 2562 * * * * * * 4734 598 * * 138
0 * * * * * * 4294 0 0
T 139 4556 * 4667 3468 5302 4941 3698 5838 * 3669 * * * * 2807 3676 1613 4979 * 5894 139
0 * * * * * * 4294 0 0
H 140 4043 * * 5010 * * 1879 * 6031 * * * 5151 5551 858 * * * * 5642 140
0 * * * * * * 3816 0 0
D 141 * * 32 5508 * * * * * * * * * * * * * * * * 141
0 * * * * * * 3816 0 0
H 142 5961 * 1421 * * 1727 3967 * 5060 * * 5442 4630 4314 5551 4836 * * * 4485 142
0 * * * * * * 3816 0 0
V 143 2854 * * 4374 4936 * * 4206 * 1815 * * * 3109 3987 3777 3770 3093 * * 143
0 * * * * * * 3516 0 0
F 144 * * * * 947 * * * * * * * * * * * * 1055 * * 144
0 * * * * * * 1528 0 0
P 145 * * * * * * * * * * * * 0 * * * * * * * 145
0 * * * * * * 1000 0 0
A 146 0 * * * * * * * * * * * * * * * * * * * 146
0 * * 0 * * * 1000 0 0
//
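The sequence profile above is an HH-suite HHM file: keyword lines (`NAME`, `LENG`, `FILT`, `NEFF`, ...) precede the `SEQ` block, and the per-column numbers follow the `#` line. Below is a minimal sketch of reading the keyword header, assuming the layout shown in this file. The helper names `parse_hhm_header`, `summarize_header` and `hhm_value_to_prob` are hypothetical. The conversion in `hhm_value_to_prob` assumes the convention described in the HH-suite documentation, where a stored value v stands for a probability of 2^(-v/1000) and `*` stands for zero; treat it as an assumption to check against the HH-suite version in use.

```python
def parse_hhm_header(text):
    """Collect the keyword lines that precede the SEQ block of an HHM file."""
    header = {}
    for line in text.splitlines():
        if line.startswith("SEQ"):
            break
        parts = line.split(None, 1)
        if not parts:
            continue
        # Keyword is the first token, the rest of the line is its value.
        header[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return header


def summarize_header(header):
    """Extract a few numeric fields from the parsed header."""
    filt = header["FILT"].split()  # e.g. '52 out of 54 sequences passed filter ...'
    return {
        "match_states": int(header["LENG"].split()[0]),
        "neff": float(header["NEFF"]),
        "passed": int(filt[0]),
        "total": int(filt[3]),
    }


def hhm_value_to_prob(token):
    """Convert one HHM number to a probability (assumed: p = 2 ** (-v / 1000))."""
    if token == "*":
        return 0.0
    return 2.0 ** (-int(token) / 1000.0)


if __name__ == "__main__":
    example = (
        "HHsearch 1.5\n"
        "NAME  target\n"
        "FAM\n"
        "LENG  146 match states, 146 columns in multiple alignment\n"
        "FILT  52 out of 54 sequences passed filter (-id 90 -cov 0 -qid 0 -qsc -20.00 -diff 100)\n"
        "NEFF  4.1\n"
        "SEQ\n"
    )
    print(summarize_header(parse_hhm_header(example)))
    print(hhm_value_to_prob("1000"))
```

Stopping at the `SEQ` line keeps the parser away from the embedded alignment and the numeric block, which need a separate, column-aware reader.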
>target
MATANVAGAGGSGSEPTRIAILGKEDIIVDHGIWLNFVAHDLLQTLPSSTYVLITDTNLYTTYVPPFQAVFEAAAPRDVRLLTYAIPPGEYSKSRETKAEIEDWMLSHACTRDTVIIALGGGVIGDMIGYVAATFMRGVRFVQVPTTLLAMVDSSIGGKTAIDTPMGKNLIGAFWQPRRIYIDLAFLETLPVREFINGMAEVIKTAAIWNETEFTALEENAAAILEAVRSKASSPAARLAPIRHILKRIVLGSARVKAEVVSADEREGGLRNLLNFGHSIGHAYEAILAPQVLHGECVAIGMVKEAELARYLGVLRPSAVARLTKLIASYDLPTSVHDKRIAKLSAGKECPVDVLLQKMAVDKKNEGRKKKIVLLSAIGKTYEKKATVVDDRAIRLVLSPSVRVTPGVPKGLSVTVTPPGSKSISNRALVLAALGEGTTRIHGLLHSDDVQYMLAAIEQLHGADFSWEDAGEILVVTGKGGKLQASKEPLYLGNAGTASRFLTSVVALCAPSAVSSTVLTGNARMKVRPIGALVDALRANGVGVKYLEKEKSLPVEVDAAGGFAGGVIELAATVSSQYVSSILMAAPYAHQPVTLRLVGGKPISQPYIDMTIAMMASFGIKVERSAEDPNTYLIPKGVYKNPPEYVVESDASSATYPLAVAAITGTTCTIPNIGSESLQGDARFAVEVLRPMGCAVEQTATSTTVTGPPIGTLKAIPHVDMEPMTDAFLTAAVLAAVADGTTQITGIANQRVKECNRIAAMKDQLAKFGVQCNELEDGIEVIGKPYQELRNPVEGIYCYDDHRVAMSHSVLSTISPHPVLILERECTAKTWPGWWDILSQFFKVQLDGEEDPTKRTTQSTQQVRKGTDRSIFIVGMRGAGKSTAGRWMSELLKRPLVDLDAELERREGMTIPEIIRGERGWEGFRQAELELLQDVIKNQSKGYIFSCGGGIVETEAARKLLIDYHKNGGPVLLVHRDTDQVVEYLMRDKTRPAYSENIREVYERRKPWFYECSNLQYHSPHEDGSEALLQPPADFARFVKLIAGQSTHLEDVRAKKHSFFVSLTVPNVADALDIIPRVVVGSDAVELRVDLLESYE-------PEFVARQVALLRAAAQVPIVYTVRTQSQGGKFPDEDYDLALRLYQTGLRSGVEYLDLEMTMPDHILQAVTDAKGFTSIIASHHDPQCKLSWKSGS-WIPFYNKALQYGDVIKLVGVAREMADNFALTNFKAKMLAAHDNKPMIALNMGTAGKLSRVLNGFLTPVSHPALPSKAAPGQLSATEIRQALSLIGEIEPKSFYLFGKPISASRSPALHNTLFYKTGLPHHYSRFETDEASKALESLIRSPDFGGASVTIPLKLDIMPLLDSATDAARTIGAVNTIIPQTRDGSTTTLVGDNTDWRGMVHALLH-SSGSGSVVQRTAAPRGAAMVVGSGGTARAAIYALHDLGFAPIWIVARSEERVAELVRGFD-GYDLRRMTSPH---QGKDN-MPSVVISTIPATQPIDPSMREVIVEVLKHGHPSAEGKVLLEMAYQPPRTPLMTLAEDQGWRTVGGLEVLAAQGWYQFQLWTGITPLYEEARAAVMGEDSVELEHHHHHH
>pdb_id=5swv, chain=A, assembly_id=1, offset=3 atoms
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------QKFKTKKRSTFLTLNYPRIEDALPTLRDVTVGCDAIEVRVDYLKDPKSSNGISSLDFVAEQISLLRCSTTLPIIFTIRTISQGGLFPNDKEEEAKELMLSAMRYGCDFVDVELGWSSETINILYQHKGYTKLIMSWHDLSGTWSWARPHEWMQKVELASSYADVIKLVGMANNLNDNLELEEFRTRITNSM-DIPLILFNMGRFGQLSRILNKFMTPVTHPLLPSKAAPGQLTVKQLNEARVLIGEILPEKFFLFGKPIKHSRSPILHSTAYELLGLPHTYEAFETDTVDE-VQKVLNLPDFGGANVTIPYKLSVMKFMDELSDEARFFGAVNTIIPIRI-GDKLVLRGDNTDWRGIYDTFANALDGVS---L----RDTNGLVIGAGGTSRAAIYSLHRLGVSRIYLLNRTLANSYRVQDVFPPDYNIHIIDSDNIPSEELSSVTLSAVVSTIPADIELPEKVASVIKALLAN---KADGGVFLDMAYKPLHTPLMAVASDLEWKCCNGLEALVRQGLASFHLWTGMTAPFDAVYQKVI--------------
>target
GSMDQPAGLQVDYVFRGVEHAVRVMVSGQVLELEVEDRMTADQWRGEFDAGFIEDLTHKTGNFKQFNIFCHMLESALTQSSESVTLDLLTYTDLESLRNRKMGGRPGSLAPRSAQLNSKRYLILIYSVEFDRIHYPLPLPYQGKP
>pdb_id=1lbj, chain=A, assembly_id=1, offset=1 atoms
-------------------------------------------------------------------------------------VPIFTYGELQRMQEK---------------------------------------------
HHsearch 1.5
NAME target
FAM
FILE query_hhblits
COM /scicore/soft/apps/HH-suite/2.0.16-goolf-1.7.20-Boost-1.53.0-Python-2.7.11/bin/hhmake -i <104 characters> -o <104 characters>
DATE Sat Apr 25 05:56:29 2020
LENG 145 match states, 145 columns in multiple alignment
FILT 41 out of 43 sequences passed filter (-id 90 -cov 0 -qid 0 -qsc -20.00 -diff 100)
NEFF 3.6
SEQ
>ss_pred PSIPRED predicted secondary structure
CCCCCCCEEEEEEEECCEEEEEEEEECCCEEEEEEEECCCCCEEECCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHHCCCCCEEEEEECCHHHHHHHHHC
CCCCCCCCCCCCCCCCCCCEEEEEEEECCCCEECCCCCCCCCCCC
>ss_conf PSIPRED confidence values
9998985079999983812799998549458999996478760412378576888765406741079999999998528986189985267769998723
489999988887656787318999985013223056565668999
>Consensus
xxmxxxxxxxxxxxFxgxeyxvxvxxxxxxLxvevexkxxxxxWxxxFxaxYIEdlTxKTGnfKxFxvFxxMLxsAlxxxsxxvsldlLTyxDLExLrxx
kxgxxxxsxxxxsxxxxxKRYLILTYxvEfDRvHYPLpLxxxgxP
>target
GSMDQPAGLQVDYVFRGVEHAVRVMVSGQVLELEVEDRMTADQWRGEFDAGFIEDLTHKTGNFKQFNIFCHMLESALTQSSESVTLDLLTYTDLESLRNR
KMGGRPGSLAPRSAQLNSKRYLILIYSVEFDRIHYPLPLPYQGKP
>gi|124783730|gb|ABN14938.1| variable flagellar number protein [Taenia asiatica]
------------FSFSCREHFVAFAnDRGKALKVEVRDLCSDEEWRGNFSPTYIEELTKKTGNFKRFDVFLTMLSSLLCQRTKSLSIDFVSYGDLENFQK
TRMKRNSLRFPVGKESSVNRRYLILAYISDFDRTFYPLPLTYFG--
>gi|145518169|ref|XP_001444962.1| hypothetical protein [Paramecium tetraurelia strain d4-2]gi|124412395|emb|CAK77565.1| unnamed protein product [Paramecium tetraurelia]
-------QLETDIILQGMEYVISMQASDHLLYIELESKYEPQIWKNTYTIDYIEELTRKTGNPKKFNIFLSMLQTALQKTNENVLFEILTYQDLENLKQQ
KSQDQ--SNLSRTSSnnqKVNKRYLILSYQTDFEKIHYPLALNYE---
>gi|308161996|gb|EFO64425.1| Hypothetical protein GLP15_909 [Giardia lamblia P15]
------------------------------FSVHLENDETGEHWFNAFTPEYIDEMTRKTGNQRRYDVFTSMLASALEKTSsrrpnsqtpLDLSLDLLFV
EDLERMRDANGGVHSDSVDatIQSAANKSRCFLIVAYAVEFERTFYPLPLNKSGPP
>gi|326435363|gb|EGD80933.1| hypothetical protein PTSG_01515 [Salpingoeca sp. ATCC 50818]
----------------------------------IEHTVTTEVWANTFSREYIEQLTHKAGSFKELSVFYRMLAKGIdRQSPEVVSVALYDQHDLQQLQK
tRRLGE---SVAANGTE-AHNKYLVLTYRTEYDRVQYPLPLLYAGEP
>gi|123379005|ref|XP_001298262.1| hypothetical protein [Trichomonas vaginalis G3]gi|121878763|gb|EAX85332.1| hypothetical protein TVAG_022530 [Trichomonas vaginalis G3]
-----------EAVFNIRDHHYRIGIEtrsSKSFGVDLLDEDIWERWSSDFNASYIKEISRKAGAEKRISIFWRMLQTAIEGTSNEISFDVLSTADVNSL
QTK--------LNPnyKSKDTEDKRYIILTQTTEFEKINFPLSLK-----
>gi|313232608|emb|CBY19278.1| unnamed protein product [Oikopleura dioica]
--------------------------STSVDGFRVRVRLERQSWANSFSPQYISELTKKTGCHKEFDTFITLLAKAITGKEnTECSLQILTPNDLESLRS
GSNSGRQPTAHMR--PPGDRRYLLLTVHNNYEKNHFPLSL------
>gi|325118600|emb|CBZ54151.1| conserved hypothetical protein [Neospora caninum Liverpool]
-------------------------------------------------------MTRRTGRFASVDAFWELLLRAVRrrageaieksrdGEDEDVTLNL
WDVTDLEALRRSAGGEKPSSPA----SSEDKHFLILTHSAGSRRVHYPLALHRQS--
#
NULL 3706 5728 4211 4064 4839 3729 4763 4308 4069 3323 5509 4640 4464 4937 4285 4423 3815 3783 6325 4665
HMM A C D E F G H I K L M N P Q R S T V W Y
M->M M->I M->D I->M I->I D->M D->D Neff Neff_I Neff_D
0 * * 0 * 0 * * * *
G 1 * * * * * 0 * * * * * * * * * * * * * * 1
0 * * * * * * 1000 0 0
S 2 * * * * * * * * * * * * * * * 0 * * * * 2
0 * * * * * * 1501 0 0
M 3 * * * * * * * * * * 299 * * * * 2419 * * * * 3
0 * * * * * * 2321 0 0
D 4 * * 2215 1591 * 2419 * * * * * 2748 * * 3097 * * * * * 4
0 * * 3170 170 * * 2321 1000 0
Q 5 1718 * 3296 1380 * * * * * * * * * 3340 * * * 3167 * * 5
0 * * * * * * 2321 0 0
P 6 * * 3100 * * 3167 * * * * * * 577 * * * 3296 * * * 6
0 * * * * * * 2321 0 0
A 7 1807 * * 3296 2748 * * * * * * 3097 * * 3074 * 3167 * * 3100 7
0 * * * * * * 2321 0 0
G 8 * 3432 * * 3637 2120 * * * * * * * 1610 * * 3442 3495 * 3495 8
0 * * * * * * 2623 0 0