Commit d19e3230 authored by Studer Gabriel

ligand scoring: docu update

parent f5a601fb
@@ -395,141 +395,133 @@ Details on the usage (output of ``ost compare-ligand-structures --help``):
.. code-block:: console
usage: ost compare-ligand-structures [-h] -m MODEL [-ml [MODEL_LIGANDS ...]] usage: ost compare-ligand-structures [-h] -m MODEL [-ml [MODEL_LIGANDS ...]]
-r REFERENCE [-rl [REFERENCE_LIGANDS ...]] -r REFERENCE
[-o OUTPUT] [-mf {pdb,mmcif,cif}] [-rl [REFERENCE_LIGANDS ...]] [-o OUTPUT]
[-rf {pdb,mmcif,cif}] [-ft] [-rna] [-ec] [-sm] [-mf {pdb,cif,mmcif}]
[-gcm] [-c CHAIN_MAPPING [CHAIN_MAPPING ...]] [-rf {pdb,cif,mmcif}] [-mb MODEL_BIOUNIT]
[-ra] [--lddt-pli] [--rmsd] [--radius RADIUS] [-rb REFERENCE_BIOUNIT] [-ft] [-rna]
[--lddt-pli-radius LDDT_PLI_RADIUS] [-sm] [-cd COVERAGE_DELTA] [-u]
[--lddt-lp-radius LDDT_LP_RADIUS] [-v VERBOSITY] [--lddt-pli]
[-v VERBOSITY] [--n-max-naive N_MAX_NAIVE] [--lddt-pli-radius LDDT_PLI_RADIUS]
[--lddt-pli-amc] [--rmsd]
Evaluate model with non-polymer/small molecule ligands against reference. [--radius RADIUS]
[--lddt-lp-radius LDDT_LP_RADIUS] [-fbs]
Example: ost compare-ligand-structures \
-m model.pdb \ Evaluate model with non-polymer/small molecule ligands against reference.
-ml ligand.sdf \
-r reference.cif \ Example: ost compare-ligand-structures \
--lddt-pli --rmsd -m model.pdb \
-ml ligand.sdf \
Structures of polymer entities (proteins and nucleotides) can be given in PDB -r reference.cif \
or mmCIF format. If the structure is given in mmCIF format, only the asymmetric --lddt-pli --rmsd
unit (AU) is used for scoring.
Structures of polymer entities (proteins and nucleotides) can be given in PDB
Ligands can be given as path to SDF files containing the ligand for both model or mmCIF format.
(--model-ligands/-ml) and reference (--reference-ligands/-rl). If omitted,
ligands will be detected in the model and reference structures. For structures Ligands can be given as path to SDF files containing the ligand for both model
given in mmCIF format, this is based on the annotation as "non polymer entity" (--model-ligands/-ml) and reference (--reference-ligands/-rl). If omitted,
(i.e. ligands in the _pdbx_entity_nonpoly mmCIF category) and works reliably. ligands will be detected in the model and reference structures. For structures
For structures given in PDB format, this is based on the HET records and is given in mmCIF format, this is based on the annotation as "non polymer entity"
normally not what you want. You should always give ligands as SDF for (i.e. ligands in the _pdbx_entity_nonpoly mmCIF category) and works reliably.
structures in PDB format. For structures given in legacy PDB format, this is based on the HET records
which is usually only set properly on files downloaded from the PDB (and even
Polymer/oligomeric ligands (saccharides, peptides, nucleotides) are not then, this is not always the case). This is normally not what you want. You
supported. should always give ligands as SDF for structures in legacy PDB format.
Only minimal cleanup steps are performed (remove hydrogens, and for structures Polymer/oligomeric ligands (saccharides, peptides, nucleotides) are not
of polymers only, remove unknown atoms and cleanup element column). supported.
Ligands in mmCIF and PDB files must comply with the PDB component dictionary Only minimal cleanup steps are performed (remove hydrogens and deuteriums,
definition, and have properly named residues and atoms, in order for and for structures of polymers only, remove unknown atoms and cleanup element
ligand connectivity to be loaded correctly. Ligands loaded from SDF files column).
are exempt from this restriction, meaning any arbitrary ligand can be assessed.
Ligands in mmCIF and PDB files must comply with the PDB component dictionary
Output is written in JSON format (default: out.json). In case of no additional definition, and have properly named residues and atoms, in order for
options, this is a dictionary with three keys: ligand connectivity to be loaded correctly. Ligands loaded from SDF files
are exempt from this restriction, meaning any arbitrary ligand can be assessed.
* "model_ligands": A list of ligands in the model. If ligands were provided
explicitly with --model-ligands, elements of the list will be the paths to Output is written in JSON format (default: out.json). In case of no additional
the ligand SDF file(s). Otherwise, they will be the chain name, residue options, this is a dictionary with three keys:
number and insertion code of the ligand, separated by a dot.
* "reference_ligands": A list of ligands in the reference. If ligands were * "model_ligands": A list of ligands in the model. If ligands were provided
provided explicitly with --reference-ligands, elements of the list will be explicitly with --model-ligands, elements of the list will be the paths to
the paths to the ligand SDF file(s). Otherwise, they will be the chain name, the ligand SDF file(s). Otherwise, they will be the chain name, residue
residue number and insertion code of the ligand, separated by a dot. number and insertion code of the ligand, separated by a dot.
* "status": SUCCESS if everything ran through. In case of failure, the only * "reference_ligands": A list of ligands in the reference. If ligands were
content of the JSON output will be "status" set to FAILURE and an provided explicitly with --reference-ligands, elements of the list will be
additional key: "traceback". the paths to the ligand SDF file(s). Otherwise, they will be the chain name,
residue number and insertion code of the ligand, separated by a dot.
Each score is opt-in and, be enabled with optional arguments and is added * "status": SUCCESS if everything ran through. In case of failure, the only
to the output. Keys correspond to the values in "model_ligands" above. content of the JSON output will be "status" set to FAILURE and an
Unassigned ligands are reported with a message in additional key: "traceback".
"unassigned_model_ligands" and "unassigned_reference_ligands".
Each score is opt-in and must be enabled with optional arguments. The scores
options: perform a model/reference ligand assignment and report a score for each assigned
-h, --help show this help message and exit model ligand. Optionally, unassigned model ligands are reported with a null
-m MODEL, --mdl MODEL, --model MODEL score and a reason why no assignment has been performed (--unassigned/-u).
Path to model file.
-ml [MODEL_LIGANDS ...], --mdl-ligands [MODEL_LIGANDS ...], options:
--model-ligands [MODEL_LIGANDS ...] -h, --help show this help message and exit
Path to model ligand files. -m MODEL, --mdl MODEL, --model MODEL
-r REFERENCE, --ref REFERENCE, --reference REFERENCE Path to model file.
Path to reference file. -ml [MODEL_LIGANDS ...], --mdl-ligands [MODEL_LIGANDS ...], --model-ligands [MODEL_LIGANDS ...]
-rl [REFERENCE_LIGANDS ...], --ref-ligands [REFERENCE_LIGANDS ...], Path to model ligand files.
--reference-ligands [REFERENCE_LIGANDS ...] -r REFERENCE, --ref REFERENCE, --reference REFERENCE
Path to reference ligand files. Path to reference file.
-o OUTPUT, --out OUTPUT, --output OUTPUT -rl [REFERENCE_LIGANDS ...], --ref-ligands [REFERENCE_LIGANDS ...], --reference-ligands [REFERENCE_LIGANDS ...]
Output file name. The output will be saved as a JSON Path to reference ligand files.
file. default: out.json -o OUTPUT, --out OUTPUT, --output OUTPUT
-mf {pdb,mmcif,cif}, --mdl-format {pdb,mmcif,cif}, Output file name. The output will be saved as a JSON
--model-format {pdb,mmcif,cif} file. default: out.json
Format of model file. Inferred from path if not -mf {pdb,cif,mmcif}, --mdl-format {pdb,cif,mmcif}, --model-format {pdb,cif,mmcif}
given. Format of model file. pdb reads pdb but also pdb.gz,
-rf {pdb,mmcif,cif}, --reference-format {pdb,mmcif,cif}, same applies to cif/mmcif. Inferred from filepath if
--ref-format {pdb,mmcif,cif} not given.
Format of reference file. Inferred from path if not -rf {pdb,cif,mmcif}, --reference-format {pdb,cif,mmcif}, --ref-format {pdb,cif,mmcif}
given. Format of reference file. pdb reads pdb but also
-ft, --fault-tolerant pdb.gz, same applies to cif/mmcif. Inferred from
Fault tolerant parsing. filepath if not given.
-rna, --residue-number-alignment -mb MODEL_BIOUNIT, --model-biounit MODEL_BIOUNIT
Make alignment based on residue number instead of Only has an effect if model is in mmcif format. By
using a global BLOSUM62-based alignment (NUC44 for default, the asymmetric unit (AU) is used for scoring.
nucleotides). If there are biounits defined in the mmcif file, you
-ec, --enforce-consistency can specify the ID (as a string) of the one which
Enforce consistency of residue names between the should be used.
reference binding site and the model. By default -rb REFERENCE_BIOUNIT, --reference-biounit REFERENCE_BIOUNIT
residue name discrepancies are reported but the Only has an effect if reference is in mmcif format. By
program proceeds. If this is set to True, the program default, the asymmetric unit (AU) is used for scoring.
will fail with an error message if the residues names If there are biounits defined in the mmcif file, you
differ. Note: more binding site mappings may be can specify the ID (as a string) of the one which
explored during scoring, but only inconsistencies in should be used.
the selected mapping are reported. -ft, --fault-tolerant
-sm, --substructure-match Fault tolerant parsing.
Allow incomplete target ligands. -rna, --residue-number-alignment
-gcm, --global-chain-mapping Make alignment based on residue number instead of
Use a global chain mapping. using a global BLOSUM62-based alignment (NUC44 for
-c CHAIN_MAPPING [CHAIN_MAPPING ...], nucleotides).
--chain-mapping CHAIN_MAPPING [CHAIN_MAPPING ...] -sm, --substructure-match
Custom mapping of chains between the reference and Allow incomplete (ie partially resolved) target
the model. Each separate mapping consist of key:value ligands.
pairs where key is the chain name in reference and -cd COVERAGE_DELTA, --coverage-delta COVERAGE_DELTA
value is the chain name in model. Only has an effect Coverage delta for partial ligand assignment.
if global-chain-mapping flag is set. -u, --unassigned Report unassigned model ligands in the output together
-ra, --rmsd-assignment with assigned ligands, with a null score, and reason
Use RMSD for ligand assignment. for not being assigned.
-u, --unassigned Report unassigned model ligands in the output -v VERBOSITY, --verbosity VERBOSITY
together with assigned ligands, with a null score, Set verbosity level. Defaults to 3 (INFO).
and reason for not being assigned. --lddt-pli Compute lDDT-PLI score and store as key "lddt-pli".
--lddt-pli-radius LDDT_PLI_RADIUS
--lddt-pli Compute lDDT-PLI score and store as key "lddt-pli". lDDT inclusion radius for lDDT-PLI.
--rmsd Compute RMSD score and store as key "rmsd". --lddt-pli-amc Add model contacts (amc) when computing lDDT-PLI.
--radius RADIUS Inclusion radius for the binding site. Any residue --rmsd Compute RMSD score and store as key "rmsd".
with atoms within this distance of the ligand will --radius RADIUS Inclusion radius to extract reference binding site
be included in the binding site. that is used for RMSD computation. Any residue with
--lddt-pli-radius LDDT_PLI_RADIUS atoms within this distance of the ligand will be
lDDT inclusion radius for lDDT-PLI. included in the binding site.
--lddt-lp-radius LDDT_LP_RADIUS --lddt-lp-radius LDDT_LP_RADIUS
lDDT inclusion radius for lDDT-LP. lDDT inclusion radius for lDDT-LP.
-v VERBOSITY, --verbosity VERBOSITY -fbs, --full-bs-search
Set verbosity level. Defaults to 3 (INFO). Enumerate all potential binding sites in the model
--n-max-naive N_MAX_NAIVE when searching rigid superposition for RMSD
If number of chains in model and reference are computation
below or equal that number, the global chain
mapping will naively enumerate all possible
mappings. A heuristic is used otherwise.
Additional information about the scores and output values is available in
:meth:`rmsd_details <ost.mol.alg.ligand_scoring.LigandScorer.rmsd_details>` and
:meth:`lddt_pli_details <ost.mol.alg.ligand_scoring.LigandScorer.lddt_pli_details>`.
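
The JSON layout described in the help text can be checked with a few lines of Python. The following is a minimal sketch, not part of OpenStructure: the helper names ``summarize_result`` and ``summarize_file`` are hypothetical, and it assumes that "status" holds the literal strings SUCCESS or FAILURE and that "traceback" is a plain string, as the help text suggests. Only the three documented top-level keys are read, and the per-score keys are left untouched.

```python
import json


def summarize_result(result):
    """Return a one-line summary of a parsed compare-ligand-structures output.

    Hypothetical helper: relies only on the keys named in the help text
    ("status", "traceback", "model_ligands", "reference_ligands").
    """
    # On failure the help text says the output holds "status" and "traceback".
    if result.get("status") != "SUCCESS":
        return "FAILURE: " + result.get("traceback", "no traceback given")
    n_model = len(result.get("model_ligands", []))
    n_reference = len(result.get("reference_ligands", []))
    return f"SUCCESS: {n_model} model ligand(s), {n_reference} reference ligand(s)"


def summarize_file(path="out.json"):
    """Load the JSON output (default name taken from the help text) and summarize it."""
    with open(path) as handle:
        return summarize_result(json.load(handle))
```

Calling ``summarize_file()`` after a run with default options would read ``out.json`` from the working directory. Checking "status" first matters because a failed run is documented to contain no ligand lists.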