# ontology-toolkit
Maintain version and dependency info in RDF ontologies.
## Installation
### Easy Install
To install the most recent released version of the toolkit, use `pip install onto-tool`.
### Development Install
To experiment with unreleased features currently in development, clone this repo and navigate to the cloned directory. Run `python -m setup install`, which
will install the `onto_tool` command and all its dependencies into your environment.
```
$ onto_tool -h
usage: onto_tool [-h] [-k] [-v] {update,export,bundle,graphic} ...

Ontology toolkit.

positional arguments:
  {update,export,bundle,graphic}
                        sub-command help
    update              Update versions and dependencies
    export              Export ontology
    bundle              Bundle ontology for release
    graphic             Create PNG graphic and dot file from OWL files or SPARQL Endpoint

optional arguments:
  -h, --help            show this help message and exit
  -k, --insecure        Allow insecure server connections when using SSL
  -v, --version         Report onto-tool version and exit
```
## Sub-Commands
### Update
The `update` sub-command modifies ontology version and dependency information:
```
$ onto_tool update -h
usage: onto_tool update [-h] [-f {xml,turtle,nt} | -i] [--debug] [-o OUTPUT]
                        [-b [{all,strict}]] [--retain-definedBy]
                        [--versioned-definedBy] [-v SET_VERSION]
                        [--version-info [VERSION_INFO]]
                        [-d DEPENDENCY VERSION]
                        [ontology [ontology ...]]

positional arguments:
  ontology              Ontology file or directory containing OWL files

optional arguments:
  -h, --help            show this help message and exit
  -f {xml,turtle,nt}, --format {xml,turtle,nt}
                        Output format
  -i, --in-place        Overwrite each input file with update, preserving
                        format
  --debug               Emit verbose debug output
  -o OUTPUT, --output OUTPUT
                        Path to output file. Will be ignored if --in-place is
                        specified.
  -b [{all,strict}], --defined-by [{all,strict}]
                        Add rdfs:isDefinedBy to every resource defined. If the
                        (default) "strict" argument is provided, only
                        owl:Class, owl:ObjectProperty, owl:DatatypeProperty,
                        owl:AnnotationProperty and owl:Thing entities will be
                        annotated. If "all" is provided, every entity that has
                        any properties other than rdf:type will be annotated.
                        Will override any existing rdfs:isDefinedBy
                        annotations on the affected entities unless --retain-
                        definedBy is specified.
  -v SET_VERSION, --set-version SET_VERSION
                        Set the version of the defined ontology
  --version-info [VERSION_INFO]
                        Adjust versionInfo, defaults to "Version X.x.x"
  -d DEPENDENCY VERSION, --dependency-version DEPENDENCY VERSION
                        Update the import of DEPENDENCY to VERSION
```
### Export
The `export` sub-command transforms the ontology into the desired format and can remove version information, as required by tools such as TopBraid Composer.
```
usage: onto_tool export [-h] [-f {xml,turtle,nt} | -c CONTEXT] [--debug]
                        [-o OUTPUT] [-s] [-m IRI VERSION] [-b [{all,strict}]]
                        [--retain-definedBy] [--versioned-definedBy]
                        [ontology [ontology ...]]

positional arguments:
  ontology              Ontology file or directory containing OWL files

optional arguments:
  -h, --help            show this help message and exit
  -f {xml,turtle,nt}, --format {xml,turtle,nt}
                        Output format
  -c CONTEXT, --context CONTEXT
                        Export as N-Quads in CONTEXT.
  --debug               Emit verbose debug output
  -o OUTPUT, --output OUTPUT
                        Path to output file.
  -s, --strip-versions  Remove versions from imports.
  -m IRI VERSION, --merge IRI VERSION
                        Merge all inputs into a single ontology with the given
                        IRI and version
  -b [{all,strict}], --defined-by [{all,strict}]
                        Add rdfs:isDefinedBy to every resource defined. If the
                        (default) "strict" argument is provided, only
                        owl:Class, owl:ObjectProperty, owl:DatatypeProperty,
                        owl:AnnotationProperty and owl:Thing entities will be
                        annotated. If "all" is provided, every entity that has
                        any properties other than rdf:type will be annotated.
  --retain-definedBy    When merging ontologies, retain existing values of
                        rdfs:isDefinedBy
```
### Graphic
The `graphic` sub-command creates one of the following:
* a comprehensive diagram showing ontology modules with their classes, object properties and individuals,
  together with the path of imports, or (if the `--wee` option is selected) a simple diagram of the ontology
  import hierarchy, or
* a diagram of the use of classes and object and data properties in a triple store or local ontology files.

Graphics are exported as both a `png` file and a `dot` file. The `dot` file can be used with Graphviz or with web tools such as [Dot Viewer](http://www.semantechs.co.uk/turtle-editor-viewer/).
```
usage: onto_tool graphic [-h] [-e ENDPOINT] [--schema | --data]
                         [--single-ontology-graphs] [--debug] [-o OUTPUT]
                         [--show-shacl]
                         [--link-concentrator-threshold LINK_CONCENTRATOR_THRESHOLD]
                         [--instance-limit INSTANCE_LIMIT]
                         [--predicate-threshold PREDICATE_THRESHOLD]
                         [--include [INCLUDE [INCLUDE ...]] |
                         --include-pattern [INCLUDE_REGEX [INCLUDE_REGEX ...]]
                         | --exclude [EXCLUDE [EXCLUDE ...]] |
                         --exclude-pattern
                         [EXCLUDE_REGEX [EXCLUDE_REGEX ...]]] [-v VERSION]
                         [-w [WEE [WEE ...]]]
                         [--label-language LABEL_LANGUAGE]
                         [--hide [HIDE [HIDE ...]]] [--no-image] [-t TITLE]
                         [ontology [ontology ...]]

positional arguments:
  ontology              Ontology file, directory or name pattern

optional arguments:
  -h, --help            show this help message and exit
  -e ENDPOINT, --endpoint ENDPOINT
                        URI of SPARQL endpoint to use to gather data
  --schema              Generate ontology import graph (default)
  --data                Analyze instances for types and links
  --single-ontology-graphs
                        If specified in combination with --endpoint when
                        generating a schema graph, assume that every ontology
                        is in its own named graph in the triple store.
                        Otherwise rdfs:isDefinedBy will be used to locate
                        entities defined by each ontology.
  --debug               Emit verbose debug output
  -o OUTPUT, --output OUTPUT
                        Output directory for generated graphics
  --show-shacl          Attempts to discover which classes and properties have
                        corresponding SHACL shapes and colors them green on
                        the graph. This detection relies on the presence of
                        sh:targetClass targeting, and can be confused by
                        complex logical shapes or Advanced SHACL features such
                        as SPARQL queries.
  --link-concentrator-threshold LINK_CONCENTRATOR_THRESHOLD
                        When the number links originating from the same class
                        that share a single predicate exceed this threshold
                        (default 10), use more compact display. Setting the
                        value to 0 disables this behavior.
  -v VERSION, --version VERSION
                        Version to place in graphic
  -w [WEE [WEE ...]], --wee [WEE [WEE ...]]
                        For ontologies matching the patterns specified, only
                        render the name and import information. If no patterns
                        are specified, applies to all ontologies.
  --label-language LABEL_LANGUAGE
                        In case entities have labels in multiple languages,
                        select either the specified language (default: en) or
                        a non-lanugage label.
  --hide [HIDE [HIDE ...]]
                        When visualizing data, hide classes and properties
                        matching the regexpatterns specified with this option.
  --no-image            Do not generate PNG image, only .dot output.
  -t TITLE, --title TITLE
                        Title to use for graph. If not supplied, the repo URI
                        will be used if graphing an endpoint, or 'Gist' if
                        graphing local files.

Sampling Limits:
  --instance-limit INSTANCE_LIMIT
                        Specify a limit on how many triples to consider that
                        use any one predicate to find (default 500000). This
                        option may result in an incomplete version of the
                        diagram, missing certain links.
  --predicate-threshold PREDICATE_THRESHOLD
                        Ignore predicates which occur fewer than
                        PREDICATE_THRESHOLD times (default 10)

Filters (only one can be used):
  --include [INCLUDE [INCLUDE ...]]
                        If specified for --schema, only ontologies matching
                        the specified URIs will be shown in full detail. If
                        specified with --data, only triples in the named
                        graphs mentioned will be considered (this also
                        excludes any triples in the default graph).
  --include-pattern [INCLUDE_REGEX [INCLUDE_REGEX ...]]
                        If specified for --schema, only ontologies matching
                        the specified URI pattern will be shown in full
                        detail. If specified with --data, only triples in the
                        named graphs matching the pattern will be considered
                        (this also excludes any triples in the default graph).
                        For large graphs this option is significantly slower
                        than using --include.
  --exclude [EXCLUDE [EXCLUDE ...]]
                        If specified for --schema, ontologies matching the
                        specified URIs will be omitted from the graph. If
                        specified with --data, triples in the named graphs
                        mentioned will be excluded (this also excludes any
                        triples in the default graph).
  --exclude-pattern [EXCLUDE_REGEX [EXCLUDE_REGEX ...]]
                        If specified for --schema, ontologies matching the
                        specified URI pattern will be omitted from the graph.
                        If specified with --data, triples in the named graphs
                        matching the pattern will be ignored (this also
                        excludes any triples in the default graph). For large
                        graphs this option is significantly slower than using
                        --exclude.
```
### Bundle
The `bundle` sub-command supports creating an ontology deployment containing both RDF and non-RDF artifacts for delivery or web hosting.
```
$ onto_tool bundle -h
usage: onto_tool bundle [-h] [--debug] [-v VARIABLE VALUE] bundle

positional arguments:
  bundle                JSON or YAML bundle definition

optional arguments:
  -h, --help            show this help message and exit
  --debug               Emit verbose debug output
  -v VARIABLE VALUE, --variable VARIABLE VALUE
                        Set value of VARIABLE to VALUE
```
The bundle definition is either YAML or JSON, and contains the following sections:
#### Variable definition
```yaml
variables:
  name: "gist"
  version: "X.x.x"
  input: "."
  rdf-toolkit: "{input}/tools/rdf-toolkit.jar"
  output: "{name}{version}_webDownload"
```
Variables are initialized with the default values provided, but can be overridden via the `--variable` command line option.
Values can reference other values using the `{name}` template syntax.
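
The expansion of `{name}` references can be illustrated with a short Python sketch. It assumes `str.format`-style substitution, which is consistent with the brace syntax above but is not confirmed to be how `onto_tool` resolves variables. `resolve_variables` is a hypothetical helper, and `9.4.0` is just an example override value.

```python
def resolve_variables(defaults, overrides=None):
    """Expand {name} references between variables (hypothetical helper).

    `overrides` stands in for values passed via --variable on the command line.
    Literal braces inside variable values are not handled in this sketch.
    """
    values = dict(defaults)
    values.update(overrides or {})
    # Repeat the substitution so chains of references (a -> b -> c) settle,
    # and stop early once a pass changes nothing.
    for _ in range(len(values)):
        expanded = {key: text.format_map(values) for key, text in values.items()}
        if expanded == values:
            break
        values = expanded
    return values


defaults = {
    "name": "gist",
    "version": "X.x.x",
    "input": ".",
    "output": "{name}{version}_webDownload",
}
resolved = resolve_variables(defaults, {"version": "9.4.0"})
print(resolved["output"])  # prints: gist9.4.0_webDownload
```

Applying the override before expansion means that values derived from `version`, such as `output`, pick up the command-line value rather than the default.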
#### Tool definition
All tools require a `name` by which they are referenced in `transform` actions. Three different tool types are supported:
* Java tools (`type: "Java"`) require a path to the executable Jar file specified via the `jar` option,
  and a list of `arguments` that will be applied to each file processed.
  The `inputFile` and `outputFile` variables will be bound during execution, but other variables can be
  used to construct the arguments.
  ```yaml
  tools:
  - name: "serializer"
    type: "Java"
    jar: "{rdf-toolkit}"
    arguments:
      - "-tfmt"
      - "rdf-xml"
      - "-sdt"
      - "explicit"
      - "-dtd"
      - "-ibn"
      - "-s"
      - "{inputFile}"
      - "-t"
      - "{outputFile}"
  ```
* Shell tools (`type: "shell"`) execute a command specified via
  a list of `arguments` that will be applied to each file processed.
  The `inputFile` and `outputFile` variables will be bound during execution, but other variables can be
  used to construct the arguments.
  ```yaml
  tools:
  - name: "java_version"
    type: "shell"
    arguments:
      - "java"
      - "-version"
  ```
* SPARQL tools (`type: "sparql"`) apply a SPARQL Update query to each input file and serialize the resulting graph into the
  output file. RDF format is preserved unless overridden with the `format` option. If the query is specified
  inline, template substitution will be applied to it, so bundle variables can be used, but double braces
  (`{{` instead of `{`, `}}` instead of `}`) have to be used to escape actual braces.
  ```yaml
  - name: "add-language-en"
    type: "sparql"
    query: >
      prefix skos: <http://www.w3.org/2004/02/skos/core#>
      DELETE {{
        ?subject skos:prefLabel ?nolang .
      }}
      INSERT {{
        ?subject skos:prefLabel ?withlang
      }}
      where {{
        ?subject skos:prefLabel ?nolang .
        FILTER(lang(?nolang) = '')
        BIND(STRLANG(?nolang, '{lang}') as ?withlang)
      }}
  ```
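
The double-brace escaping works the same way as Python's `str.format`, so its effect can be previewed with the sketch below. That `onto_tool` uses `str.format` internally is an assumption here; the shortened query and the value `en` for the `lang` variable are illustrative only.

```python
# Inline query as it might appear in a bundle definition (shortened for the example).
query_template = (
    "INSERT {{ ?subject skos:prefLabel ?withlang }} "
    "WHERE {{ BIND(STRLANG(?nolang, '{lang}') AS ?withlang) }}"
)

# `lang` plays the role of a bundle variable; "en" is an example value.
rendered = query_template.format(lang="en")
print(rendered)
# prints: INSERT { ?subject skos:prefLabel ?withlang } WHERE { BIND(STRLANG(?nolang, 'en') AS ?withlang) }
```

Each `{{`/`}}` pair collapses to a single literal brace, while `{lang}` is replaced by the variable value.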
#### Actions
Actions are executed in the order they are listed. Each action must have an `action` attribute,
and any action can contain a `message` attribute, the contents of which will be
emitted as an `INFO`-level log message prior to the execution of the action.
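
As a rough mental model, the processing order described above resembles the loop below. This is a minimal sketch, not the tool's actual code: `run_actions`, the `handlers` mapping and the logger name are all hypothetical.

```python
import logging


def run_actions(actions, handlers, logger=logging.getLogger("bundle-sketch")):
    """Execute bundle actions in listed order (hypothetical dispatcher)."""
    for index, step in enumerate(actions):
        if "action" not in step:
            raise ValueError(f"action #{index} has no 'action' attribute")
        if "message" in step:
            # The message is logged before the action itself runs.
            logger.info(step["message"])
        handlers[step["action"]](step)


executed = []
handlers = {
    "mkdir": lambda step: executed.append(("mkdir", step["directory"])),
    "copy": lambda step: executed.append(("copy", step["source"])),
}
run_actions(
    [
        {"action": "mkdir", "directory": "out", "message": "Creating output directory"},
        {"action": "copy", "source": "README.md", "target": "out/README.md"},
    ],
    handlers,
)
print(executed)  # prints: [('mkdir', 'out'), ('copy', 'README.md')]
```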
##### Basic File Manipulation
- `mkdir`, which requires a `directory` attribute to specify the path of the directory to be created
  (only if it doesn't already exist)
- `copy`, which copies files into the bundle, and supports the following arguments:
  - `source`, `target`, `includes` and `excludes` - if neither `includes` nor `excludes` is present,
    `source` and `target` are both assumed to be file paths to a single file. If either `includes`
    or `excludes` is provided, `source` and `target` are
    assumed to be directories, and each member of the `includes`/`excludes` lists is treated as a glob
    pattern inside the `source` directory. If `includes` is not present, it's presumed to be `*`, and
    `excludes` is applied after `includes`.
  - `rename` - If provided, must contain `from` and `to` attributes. When specified, each file
    is renamed as it is copied, where `from` is treated as a Python regular expression
    applied to the base name of the source file, and `to` is the substitution string which
    replaces it in the name of the target file. Backreferences are available for capturing groups, e.g.
    ```yaml
    rename:
      from: "(.*)\\.owl"
      to: "\\g<1>{version}.owl"
    ```
    will add a version number to the base name of each `.owl` file. Further documentation on
    Python regular expression replace functionality can be found
    [here](https://docs.python.org/3/howto/regex.html#search-and-replace).
  - `replace` - If provided, must contain `from` and `to` attributes. When specified, each file
    is processed after being copied, and each instance of the `from` pattern is replaced
    with the `to` string in the file contents. Python regular expression syntax and backreferences are
    supported as shown in the `rename` documentation.
- `move`, which moves files according to the provided options, which are identical to the ones supported
  by `copy`.
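
The interplay of `includes`, `excludes` and `rename` can be tried out with the standard library alone. The sketch below assumes shell-style glob matching on base names and uses hypothetical helpers (`select_files`, `renamed`) and example file names; it mirrors the rules above rather than the tool's implementation.

```python
import fnmatch
import re


def select_files(names, includes=None, excludes=None):
    """Apply include globs first, then drop anything matching an exclude glob."""
    includes = includes or ["*"]  # a missing `includes` is presumed to be `*`
    excludes = excludes or []
    included = [n for n in names if any(fnmatch.fnmatchcase(n, p) for p in includes)]
    return [n for n in included if not any(fnmatch.fnmatchcase(n, p) for p in excludes)]


def renamed(name, pattern, replacement):
    """Rewrite a base name the way a `rename` block with from/to would."""
    return re.sub(pattern, replacement, name)


names = ["gistCore.owl", "gistDeprecated.owl", "README.md"]
picked = select_files(names, includes=["*.owl"], excludes=["*Deprecated*"])
# After YAML unescaping and {version} substitution (example version 9.4.0),
# the `to` string from the rename example reads \g<1>9.4.0.owl
targets = [renamed(n, r"(.*)\.owl", r"\g<1>9.4.0.owl") for n in picked]
print(picked, targets)  # prints: ['gistCore.owl'] ['gistCore9.4.0.owl']
```

The `\g<1>` form matters here: once a version such as `9.4.0` is substituted, a plain `\1` would become `\19` and be read as a reference to group 19.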
##### RDF Transformation
- `definedBy`, which inspects each input file to identify a single defined ontology, and then
  adds an `rdfs:isDefinedBy` property to every `owl:Class`, `owl:ObjectProperty`, `owl:DatatypeProperty`
  and `owl:AnnotationProperty` defined in the file referencing the identified ontology. Existing
  `rdfs:isDefinedBy` values are removed prior to the addition. Input and output file specification
  options are identical to those used by the `copy` action.
- `export`, which functions similarly to the command-line export functionality, gathering one or
  more input ontologies and exporting them as a single file, with some optional transformations,
  depending on the following specified options:
  - `source`, `target`, `includes` and `excludes` - treated identically to the `copy` operation
    described above, except `target` is always treated as a single file path.
  - `merge` - if provided, it must have two mandatory fields, `iri` and `version`. In this case, all
    ontologies declared in the input files are removed, and a single new ontology, specified by the
    `iri`, is created, using `version` to build `owl:versionInfo` and `owl:versionIRI`. Any imports on
    the removed ontologies which are not satisfied internally are transferred to the new ontology.
  - `definedBy` - has two possible values, `strict` and `all`. If provided, an `rdfs:isDefinedBy` is
    added to all non-blank node subjects in the exported RDF linking them to the ontology defined in the
    combined graph. If more than one ontology is defined, the export will fail. If `strict` is specified,
    only classes and properties will be annotated, whereas `all` does not filter by type.
  - `retainDefinedBy` - by default, `definedBy` will override any existing `rdfs:isDefinedBy` annotations,
    but if this option is provided, existing annotations will be left in place.
  - `format` - One of `turtle`, `xml`, or `nt` (N-Triples), specifies the output format for the export.
    The default output format is `turtle`.
  - `context` - If provided, generates an N-Quads export with the `context` argument as the name of the
    graph. When this option is present, the value of `format` is ignored.
  - `compress` - when this is `true`, the output is `gzip`-ed.
- `transform`, which applies the specified tool to a set of input files, and supports the following
  arguments:
  - `tool`, which references the `name` of a tool which must be defined in the `tools` section.
  - `source`, `target`, `includes` and `excludes`, which function just like they do for the `copy` and `move`
    actions, with each input and output path bound into the `inputFile` and `outputFile` variables
    before the tool arguments are interpreted.
  - `replace` and `rename`, which are applied after the tool invocation, and work as described above.
- `sparql` reads RDF files provided via the `source` and `includes`/`excludes` options and executes a SPARQL
  query on the resulting combined graph.
  * If the `query` option is a valid file path, the query is read from that file,
    otherwise the contents of the `query` option are interpreted as the query.
  * `SELECT` query results are stored in the file specified via `target` as a CSV.
  * RDF results from a `CONSTRUCT` query are
    stored as either Turtle, RDF/XML or N-Triples, depending on the `format` option (`turtle`, `xml`, or `nt`).
  * SPARQL Update queries executed on local files modify the in-memory graph, and the resulting
    graph is then serialized to the `target` in the specified format.
  * The default functionality is to combine all RDF sources specified via `includes`
    and execute queries on the resulting graph. However, if `eachFile: true` is added,
    all queries will be applied to each source file separately, and will produce a
    separate output file. In this case, `target` will be treated as a directory, and
    the `rename` option should be used when needed to construct the output file names. For example, the following
    action extracts the labels out of each RDF file into a separate CSV with matching names:
    ```yaml
    - action: 'sparql'
      message: "Multi-file processing with SELECT"
      eachFile: true
      source: '{input}'
      includes:
        - '*_ontology.ttl'
      target: "{output}/each/select"
      rename:
        from: "(.*)\\.ttl"
        to: "\\g<1>.csv"
      query: >
        prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        prefix skos: <http://www.w3.org/2004/02/skos/core#>
        select ?label
        WHERE {{
          ?s rdfs:label ?label .
        }} order by ?label
    ```
  * As an alternative to operating on local RDF specified via `source`, a query can
    be executed on a triple store by specifying an `endpoint`, which must
    contain a `query_uri`, and can optionally specify `user`/`password` which will
    authenticate via HTTP basic authentication. Update queries will modify the
    triple store directly, and a separate `update_uri` can be specified
    for databases which require it.
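
The "file path or inline text" rule for the `query` option can be sketched as follows. `load_query` is a hypothetical helper written for illustration; the way `onto_tool` actually decides whether a value is a path is not documented here.

```python
import os
import tempfile


def load_query(query_option):
    """Return the query text for a `query` option (hypothetical helper)."""
    # os.path.isfile() returns False rather than raising for strings that
    # cannot be a path, such as a long multi-line inline query.
    if os.path.isfile(query_option):
        with open(query_option, encoding="utf-8") as handle:
            return handle.read()
    return query_option


inline = "select ?label where { ?s ?p ?label }"
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "labels.rq")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(inline)
    from_file = load_query(path)

from_inline = load_query(inline)
print(from_file == from_inline)  # prints: True
```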
##### Utility Actions
- `markdown` transforms a `.md` file referenced in `source` into an HTML output specified in `target`.
- `graph` reads RDF files provided via the `source` and `includes`/`excludes` options and generates a graphical
representation of the ontology, as in the `graphic` sub-command described above. Both `.dot` and
`.png` outputs are written to the directory specified in the `target` option, and `title` and
`version` attributes configure the title on the generated graph. If `compact` is specified as
`True`, a concise graph including only ontology names and imports is generated.
##### Validation
The `verify` action reads RDF files provided via the `source` and `includes`/`excludes` options and performs validation on the
resulting combined graph. If the validation fails, the bundle process exits with a non-zero status and
does not execute subsequent actions. The type of verification performed depends on the
value of the `type` option:
* If `type` is `select`, one or more SPARQL `SELECT` queries are executed against the graph, and the
  first query to return a non-empty result will terminate the bundle. The results of the query will
  be output to the log, and also written as CSV to a file path specified by the `target` option, if
  provided. Queries can be specified in one of two ways (only one can be present):
  * If the `query` option is a valid file path, the query is read from that file,
    otherwise the contents of the `query` option are interpreted as the query, e.g.
    ```yaml
    query: >
      prefix skos: <http://www.w3.org/2004/02/skos/core#>
      select ?unlabeled where {{
        ?unlabeled a ?type .
        filter not exists {{ ?unlabeled skos:prefLabel ?label }}
      }}
    ```
  * If `queries` is provided, a list of queries will be built from the `source` and `includes`/`excludes`
    sub-options. The queries will be executed in the order specified. If `stopOnFail` is omitted or
    is `true`, the first query that produces a failing result will cause `verify` to abort. If
    `stopOnFail` is `false`, all queries will be executed regardless of failures, and the value
    of `target` is treated as a directory where the results of _each_ failing query will be written.
    ```yaml
    - action: 'verify'
      type: 'select'
      source: '{input}'
      includes:
        - 'verify_data.ttl'
      target: '{output}/verify_select_results'
      stopOnFail: false
      queries:
        source: '{input}'
        includes:
          - 'verify_*_select_query.rq'
    ```
* If `type` is `ask`, one or more SPARQL `ASK` queries will be executed. Queries are
  specified similarly to the `select` validation. Unless `stopOnFail` is set to `false`, the bundle
  terminates at the first query whose result does not match the required `expected` option.
  For example:
  ```yaml
  actions:
    - action: 'verify'
      type: 'ask'
      source: '{input}'
      includes:
        - 'verify_data.ttl'
      queries:
        source: '{input}'
        includes:
          - '*_ask_query.rq'
      expected: false
  ```
* If `type` is `shacl`, a SHACL shape graph will be constructed from the files specified via the `shapes`
  option (which must have a `source`, and optionally `includes`/`excludes`), with the bundle terminating only if
  any `sh:Violation` results are present, unless the `failOn` option specifies otherwise.
  The report is emitted to the log, and saved as Turtle to the path specified in the `target` option if it's provided.
  For example:
  ```yaml
  - action: 'verify'
    type: 'shacl'
    inference: 'rdfs'
    source: '{input}'
    includes:
      - 'verify_data.ttl'
    target: '{output}/verify_shacl_errors.ttl'
    failOn: "warning"
    shapes:
      source: '{input}/verify_shacl_shapes.ttl'
  ```
  If the `inference` option is provided, the reasoner will be run on the graph prior
  to applying the SHACL rules. The valid values are:
  * `rdfs`,
  * `owlrl`,
  * `both`, or
  * `none` (default).
* If `type` is `construct`, the queries are expected to `CONSTRUCT` a [SHACL ValidationReport](https://www.w3.org/TR/shacl/#validation-report).
  The validation is considered a failure if the resulting graph is non-empty. `target`,
  `stopOnFail` and `query`/`queries` are handled the same way as in `select` validation, and `failOn` is used to determine which
  violations will terminate execution.
* Validation can be performed against a SPARQL endpoint instead of local RDF
  data by specifying `endpoint` instead of `source`/`includes`. `endpoint` must
  contain a `query_uri`, and can optionally specify `user`/`password` which will
  authenticate via HTTP basic authentication. For example:
  ```yaml
  - action: 'verify'
    type: 'construct'
    endpoint:
      query_uri: 'https://my.endpoint.com/sparql'
      user: 'test-user'
      password: 'test-user'
    target: '{output}/verify_construct_results'
    stopOnFail: false
    query: '{input}/verify_via_construct.rq'
  ```
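
The effect of `stopOnFail` on a `select` verification with several queries can be summarized with a small simulation. This is a sketch under stated assumptions: `failing_queries` is a hypothetical helper, the query names and result rows are invented, and real query execution is replaced by precomputed results.

```python
def failing_queries(results_by_query, stop_on_fail=True):
    """Collect the queries whose SELECT result is non-empty (hypothetical helper).

    `results_by_query` maps a query name to the rows it returned, in execution order.
    """
    failures = {}
    for name, rows in results_by_query.items():
        if rows:  # any returned row counts as a validation failure
            failures[name] = rows
            if stop_on_fail:
                break  # later queries are never run
    return failures


results = {
    "verify_labels_select_query.rq": [],
    "verify_types_select_query.rq": [("ex:a",)],
    "verify_domains_select_query.rq": [("ex:b",), ("ex:c",)],
}
first_only = list(failing_queries(results))
all_failures = list(failing_queries(results, stop_on_fail=False))
print(first_only)    # prints: ['verify_types_select_query.rq']
print(all_failures)  # prints: ['verify_types_select_query.rq', 'verify_domains_select_query.rq']
```

With `stopOnFail: false` every failing query contributes its own result set, which is why `target` is treated as a directory in that mode.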
Raw data
{
"_id": null,
"home_page": "https://github.com/semanticarts/ontology-toolkit",
"name": "onto-tool",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "",
"author": "Boris Pelakh",
"author_email": "boris.pelakh@semanticarts.com",
"download_url": "https://files.pythonhosted.org/packages/fa/87/a999de517a96ef88c29db453740a380d4372485c1b75045a2ab9415bbaae/onto_tool-1.8.0.tar.gz",
"platform": null,
"description": "# ontology-toolkit\n\nMaintain version and dependency info in RDF ontologies.\n\n## Installation\n\n## Easy Install \n\nTo install the most recent released version of the toolkit use `pip install onto-tool`. \n\n## Development Install\n\nTo experiment with unreleased features currently in development, clone this repo and navigate to the installed directory. Run `python -m setup install`, which\nwill install the `onto_tool` command and all its dependencies into your environment.\n\n```\n$ onto_tool -h\nusage: onto_tool [-h] [-k] [-v] {update,export,bundle,graphic} ...\n\nOntology toolkit.\n\npositional arguments:\n {update,export,bundle,graphic}\n sub-command help\n update Update versions and dependencies\n export Export ontology\n bundle Bundle ontology for release\n graphic Create PNG graphic and dot file from OWL files or SPARQL Endpoint\n\noptional arguments:\n -h, --help show this help message and exit\n -k, --insecure Allow insecure server connections when using SSL\n -v, --version Report onto-tool version and exit\n```\n\n## Sub-Commands\n\n### Update\n\nThe `update` sub-command modifies ontology version and dependency information\n```\n$ onto_tool update -h\nusage: onto_tool update [-h] [-f {xml,turtle,nt} | -i] [--debug] [-o OUTPUT]\n [-b [{all,strict}]] [--retain-definedBy]\n [--versioned-definedBy] [-v SET_VERSION]\n [--version-info [VERSION_INFO]]\n [-d DEPENDENCY VERSION]\n [ontology [ontology ...]]\n\npositional arguments:\n ontology Ontology file or directory containing OWL files\n\noptional arguments:\n -h, --help show this help message and exit\n -f {xml,turtle,nt}, --format {xml,turtle,nt}\n Output format\n -i, --in-place Overwrite each input file with update, preserving\n format\n --debug Emit verbose debug output\n -o OUTPUT, --output OUTPUT\n Path to output file. Will be ignored if --in-place is\n specified.\n -b [{all,strict}], --defined-by [{all,strict}]\n Add rdfs:isDefinedBy to every resource defined. 
If the\n (default) \"strict\" argument is provided, only\n owl:Class, owl:ObjectProperty, owl:DatatypeProperty,\n owl:AnnotationProperty and owl:Thing entities will be\n annotated. If \"all\" is provided, every entity that has\n any properties other than rdf:type will be annotated.\n Will override any existing rdfs:isDefinedBy\n annotations on the affected entities unless --retain-\n definedBy is specified.\n -v SET_VERSION, --set-version SET_VERSION\n Set the version of the defined ontology\n --version-info [VERSION_INFO]\n Adjust versionInfo, defaults to \"Version X.x.x\"\n -d DEPENDENCY VERSION, --dependency-version DEPENDENCY VERSION\n Update the import of DEPENDENCY to VERSION\n```\n\n### Export\n\nThe `export` sub-command will transform the ontology into the desired format, and remove version information, as required by tools such as Top Braid Composer.\n```\nusage: onto_tool export [-h] [-f {xml,turtle,nt} | -c CONTEXT] [--debug]\n [-o OUTPUT] [-s] [-m IRI VERSION] [-b [{all,strict}]]\n [--retain-definedBy] [--versioned-definedBy]\n [ontology [ontology ...]]\n\npositional arguments:\n ontology Ontology file or directory containing OWL files\n\noptional arguments:\n -h, --help show this help message and exit\n -f {xml,turtle,nt}, --format {xml,turtle,nt}\n Output format\n -c CONTEXT, --context CONTEXT\n Export as N-Quads in CONTEXT.\n --debug Emit verbose debug output\n -o OUTPUT, --output OUTPUT\n Path to output file.\n -s, --strip-versions Remove versions from imports.\n -m IRI VERSION, --merge IRI VERSION\n Merge all inputs into a single ontology with the given\n IRI and version\n -b [{all,strict}], --defined-by [{all,strict}]\n Add rdfs:isDefinedBy to every resource defined. If the\n (default) \"strict\" argument is provided, only\n owl:Class, owl:ObjectProperty, owl:DatatypeProperty,\n owl:AnnotationProperty and owl:Thing entities will be\n annotated. 
If \"all\" is provided, every entity that has\n any properties other than rdf:type will be annotated.\n --retain-definedBy When merging ontologies, retain existing values of\n rdfs:isDefinedBy\n```\n\n### Graphic\n\nThe `graphic` sub-command will create either \n* a comprehensive diagram showing ontology modules together with classes, object properties and individuals\n together with the path of imports, or (if the 'wee' option is selected) a simple diagram of the ontology\n import hierarchy, or\n* a diagram of the use of classes and object and data properties in a triple store or local ontology files.\n \nGraphics are exported both as ```png``` files and also as a ```dot``` file. This ```dot``` file can be used with Graphviz or with web tools such as [Dot Viewer](http://www.semantechs.co.uk/turtle-editor-viewer/)\n\n```\nusage: onto_tool graphic [-h] [-e ENDPOINT] [--schema | --data]\n [--single-ontology-graphs] [--debug] [-o OUTPUT]\n [--show-shacl]\n [--link-concentrator-threshold LINK_CONCENTRATOR_THRESHOLD]\n [--instance-limit INSTANCE_LIMIT]\n [--predicate-threshold PREDICATE_THRESHOLD]\n [--include [INCLUDE [INCLUDE ...]] |\n --include-pattern [INCLUDE_REGEX [INCLUDE_REGEX ...]]\n | --exclude [EXCLUDE [EXCLUDE ...]] |\n --exclude-pattern\n [EXCLUDE_REGEX [EXCLUDE_REGEX ...]]] [-v VERSION]\n [-w [WEE [WEE ...]]]\n [--label-language LABEL_LANGUAGE]\n [--hide [HIDE [HIDE ...]]] [--no-image] [-t TITLE]\n [ontology [ontology ...]]\n\npositional arguments:\n ontology Ontology file, directory or name pattern\n\noptional arguments:\n -h, --help show this help message and exit\n -e ENDPOINT, --endpoint ENDPOINT\n URI of SPARQL endpoint to use to gather data\n --schema Generate ontology import graph (default)\n --data Analyze instances for types and links\n --single-ontology-graphs\n If specified in combination with --endpoint when\n generating a schema graph, assume that every ontology\n is in its own named graph in the triple store.\n Otherwise rdfs:isDefinedBy will 
be used to locate\n entities defined by each ontology.\n --debug Emit verbose debug output\n -o OUTPUT, --output OUTPUT\n Output directory for generated graphics\n --show-shacl Attempts to discover which classes and properties have\n corresponding SHACL shapes and colors them green on\n the graph. This detection relies on the presence of\n sh:targetClass targeting, and can be confused by\n complex logical shapes or Advanced SHACL features such\n as SPARQL queries.\n --link-concentrator-threshold LINK_CONCENTRATOR_THRESHOLD\n When the number links originating from the same class\n that share a single predicate exceed this threshold\n (default 10), use more compact display. Setting the\n value to 0 disables this behavior.\n -v VERSION, --version VERSION\n Version to place in graphic\n -w [WEE [WEE ...]], --wee [WEE [WEE ...]]\n For ontologies matching the patterns specified, only\n render the name and import information. If no patterns\n are specified, applies to all ontologies.\n --label-language LABEL_LANGUAGE\n In case entities have labels in multiple languages,\n select either the specified language (default: en) or\n a non-lanugage label.\n --hide [HIDE [HIDE ...]]\n When visualizing data, hide classes and properties\n matching the regexpatterns specified with this option.\n --no-image Do not generate PNG image, only .dot output.\n -t TITLE, --title TITLE\n Title to use for graph. If not supplied, the repo URI\n will be used if graphing an endpoint, or 'Gist' if\n graphing local files.\n\nSampling Limits:\n --instance-limit INSTANCE_LIMIT\n Specify a limit on how many triples to consider that\n use any one predicate to find (default 500000). 
This\n option may result in an incomplete version of the\n diagram, missing certain links.\n --predicate-threshold PREDICATE_THRESHOLD\n Ignore predicates which occur fewer than\n PREDICATE_THRESHOLD times (default 10)\n\nFilters (only one can be used):\n --include [INCLUDE [INCLUDE ...]]\n If specified for --schema, only ontologies matching\n the specified URIs will be shown in full detail. If\n specified with --data, only triples in the named\n graphs mentioned will be considered (this also\n excludes any triples in the default graph).\n --include-pattern [INCLUDE_REGEX [INCLUDE_REGEX ...]]\n If specified for --schema, only ontologies matching\n the specified URI pattern will be shown in full\n detail. If specified with --data, only triples in the\n named graphs matching the pattern will be considered\n (this also excludes any triples in the default graph).\n For large graphs this option is significantly slower\n than using --include.\n --exclude [EXCLUDE [EXCLUDE ...]]\n If specified for --schema, ontologies matching the\n specified URIs will be omitted from the graph. If\n specified with --data, triples in the named graphs\n mentioned will be excluded (this also excludes any\n triples in the default graph).\n --exclude-pattern [EXCLUDE_REGEX [EXCLUDE_REGEX ...]]\n If specified for --schema, ontologies matching the\n specified URI pattern will be omitted from the graph.\n If specified with --data, triples in the named graphs\n matching the pattern will be ignored (this also\n excludes any triples in the default graph). 
For large\n graphs this option is significantly slower than using\n --exclude.\n```\n\n### Bundle\n\nThe `bundle` sub-command supports creating an ontology deployment containing both RDF and non-RDF artifacts for delivery or web hosting.\n\n```\n$ onto_tool bundle -h\nusage: onto_tool bundle [-h] [--debug] [-v VARIABLE VALUE] bundle\n\npositional arguments:\n bundle JSON or YAML bundle definition\n\noptional arguments:\n -h, --help show this help message and exit\n --debug Emit verbose debug output\n -v VARIABLE VALUE, --variable VARIABLE VALUE\n Set value of VARIABLE to VALUE\n```\n\nThe bundle definition is either YAML or JSON, and contains the following sections:\n\n#### Variable definition\n\n```yaml\nvariables:\n  name: \"gist\"\n  version: \"X.x.x\"\n  input: \".\"\n  rdf-toolkit: \"{input}/tools/rdf-toolkit.jar\"\n  output: \"{name}{version}_webDownload\"\n```\nVariables are initialized with the default values provided, but can be overridden via the `--variable` command-line option.\nValues can reference other values using the `{name}` template syntax.\n\n#### Tool definition\n\nAll tools require a `name` by which they are referenced in `transform` actions. 
Three tool types are supported:\n* Java tools (`type: \"Java\"`) require a path to the executable Jar file specified via the `jar` option,\n  and a list of `arguments` that will be applied to each file processed.\n  The `inputFile` and `outputFile` variables will be bound during execution, but other variables can be\n  used to construct the arguments.\n  ```yaml\n  tools:\n    - name: \"serializer\"\n      type: \"Java\"\n      jar: \"{rdf-toolkit}\"\n      arguments:\n        - \"-tfmt\"\n        - \"rdf-xml\"\n        - \"-sdt\"\n        - \"explicit\"\n        - \"-dtd\"\n        - \"-ibn\"\n        - \"-s\"\n        - \"{inputFile}\"\n        - \"-t\"\n        - \"{outputFile}\"\n  ```\n* Shell tools (`type: \"shell\"`) execute a command specified via\n  a list of `arguments` that will be applied to each file processed.\n  The `inputFile` and `outputFile` variables will be bound during execution, but other variables can be\n  used to construct the arguments.\n  ```yaml\n  tools:\n    - name: \"java_version\"\n      type: \"shell\"\n      arguments:\n        - \"java\"\n        - \"-version\"\n  ```\n* SPARQL tools (`type: \"sparql\"`) apply a SPARQL Update query to each input file and serialize the resulting graph into the\n  output file. RDF format is preserved unless overridden with the `format` option. If the query is specified\n  inline, template substitution will be applied to it, so bundle variables can be used, but double braces\n  (`{{` instead of `{`, `}}` instead of `}`) have to be used to escape actual braces.\n  ```yaml\n  tools:\n    - name: \"add-language-en\"\n      type: \"sparql\"\n      query: >\n        prefix skos: <http://www.w3.org/2004/02/skos/core#>\n        DELETE {{\n          ?subject skos:prefLabel ?nolang .\n        }}\n        INSERT {{\n          ?subject skos:prefLabel ?withlang\n        }}\n        where {{\n          ?subject skos:prefLabel ?nolang .\n          FILTER(lang(?nolang) = '')\n          BIND(STRLANG(?nolang, '{lang}') as ?withlang)\n        }}\n  ```\n\n#### Actions\n\nActions are executed in the order they are listed. 
Each action must have an `action` attribute,\nand any action can contain a `message` attribute, the contents of which will be\nemitted as an `INFO`-level log message prior to the execution of the action.\n\n##### Basic File Manipulation\n- `mkdir`, which requires a `directory` attribute to specify the path of the directory to be created\n  (only if it doesn't already exist).\n- `copy`, which copies files into the bundle, and supports the following arguments:\n  - `source`, `target`, `includes` and `excludes` - if neither `includes` nor `excludes` is present,\n    `source` and `target` are both assumed to be file paths to a single file. If either `includes`\n    or `excludes` is provided, `source` and `target` are\n    assumed to be directories, and each member of the `includes`/`excludes` lists is treated as a glob\n    pattern inside the `source` directory. If `includes` is not present, it's presumed to be `*`, and\n    `excludes` is applied after `includes`.\n  - `rename` - If provided, must contain `from` and `to` attributes. When specified, each file\n    is renamed as it is copied, where `from` is treated as a Python regular expression\n    applied to the base name of the source file, and `to` is the substitution string which\n    replaces it in the name of the target file. Backreferences are available for capturing groups, e.g.\n    ```yaml\n    rename:\n      from: \"(.*)\\\\.owl\"\n      to: \"\\\\g<1>{version}.owl\"\n    ```\n    will add a version number to the base name of each `.owl` file. Further documentation on\n    Python regular expression replace functionality can be found\n    [here](https://docs.python.org/3/howto/regex.html#search-and-replace).\n  - `replace` - If provided, must contain `from` and `to` attributes. When specified, each file\n    is processed after being copied, and each instance of the `from` pattern is replaced\n    with the `to` string in the file contents. 
Python regular expression syntax and backreferences are\n    supported as shown in the `rename` documentation.\n- `move`, which moves files according to the provided options, which are identical to the ones supported\n  by `copy`.\n\n##### RDF Transformation\n\n- `definedBy`, which inspects each input file to identify a single defined ontology, and then\n  adds an `rdfs:isDefinedBy` property to every `owl:Class`, `owl:ObjectProperty`, `owl:DatatypeProperty`\n  and `owl:AnnotationProperty` defined in the file, referencing the identified ontology. Existing\n  `rdfs:isDefinedBy` values are removed prior to the addition. Input and output file specification\n  options are identical to those used by the `copy` action.\n- `export`, which functions similarly to the command-line export functionality, gathering one or\n  more input ontologies and exporting them as a single file, with some optional transformations,\n  depending on the following specified options:\n  - `source`, `target`, `includes` and `excludes` - treated identically to the `copy` operation\n    described above, except `target` is always treated as a single file path.\n  - `merge` - if provided, it must have two mandatory fields, `iri` and `version`. In this case, all\n    ontologies declared in the input files are removed, and a single new ontology, specified by the\n    `iri`, is created, using `version` to build `owl:versionInfo` and `owl:versionIRI`. Any imports on\n    the removed ontologies which are not satisfied internally are transferred to the new ontology.\n  - `definedBy` - has two possible values, `strict` and `all`. If provided, an `rdfs:isDefinedBy` is\n    added to all non-blank node subjects in the exported RDF linking them to the ontology defined in the\n    combined graph. If more than one ontology is defined, the export will fail. 
If `strict` is specified,\n    only classes and properties will be annotated, whereas `all` does not filter by type.\n  - `retainDefinedBy` - by default, `definedBy` will override any existing `rdfs:isDefinedBy` annotations,\n    but if this option is provided, existing annotations will be left in place.\n  - `format` - One of `turtle`, `xml`, or `nt` (N-Triples), specifies the output format for the export.\n    The default output format is `turtle`.\n  - `context` - If provided, generates an N-Quads export with the `context` argument as the name of the\n    graph. When this option is present, the value of `format` is ignored.\n  - `compress` - when this is `true`, the output is `gzip`-ed.\n- `transform`, which applies the specified tool to a set of input files, and supports the following\n  arguments:\n  - `tool`, which references the `name` of a tool which must be defined in the `tools` section.\n  - `source`, `target`, `includes` and `excludes`, which function just like they do for the `copy` and `move`\n    actions, with each input and output path bound into the `inputFile` and `outputFile` variables\n    before the tool arguments are interpreted.\n  - `replace` and `rename`, which are applied after the tool invocation, and work as described above.\n- `sparql` reads RDF files provided via the `source` and `includes`/`excludes` options and executes a SPARQL\n  query on the resulting combined graph.\n  * If the `query` option is a valid file path, the query is read from that file,\n    otherwise the contents of the `query` option are interpreted as the query.\n  * `SELECT` query results are stored in the file specified via `target` as a CSV.\n  * RDF results from a `CONSTRUCT` query are\n    stored as either Turtle, RDF/XML or N-Triples, depending on the `format` option (`turtle`, `xml`, or `nt`).\n    Update queries will alter the input data in place, and the resulting\n    graph will be output in the specified format.\n  * `UPDATE` queries executed on local files will modify the in-memory graph and then 
serialize the\n    resulting graph to the `target`.\n  * The default functionality is to combine all RDF sources specified via `includes`\n    and execute queries on the resulting graph. However, if `eachFile: true` is added,\n    all queries will be applied to each source file separately, and will produce a\n    separate output file. In this case, `target` will be treated as a directory, and\n    the `rename` option should be used when needed to construct the output file names. For example, the following\n    action extracts the labels out of each RDF file into a separate CSV with matching names:\n    ```yaml\n    - action: 'sparql'\n      message: \"Multi-file processing with SELECT\"\n      eachFile: true\n      source: '{input}'\n      includes:\n        - '*_ontology.ttl'\n      target: \"{output}/each/select\"\n      rename:\n        from: \"(.*)\\\\.ttl\"\n        to: \"\\\\g<1>.csv\"\n      query: >\n        prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n        prefix skos: <http://www.w3.org/2004/02/skos/core#>\n        select ?label\n        WHERE {{\n          ?s rdfs:label ?label .\n        }} order by ?label\n    ```\n  * As an alternative to operating on local RDF specified via `source`, a query can\n    be executed on a triple store by specifying an `endpoint`, which must\n    contain a `query_uri`, and can optionally specify `user`/`password` which will\n    authenticate via HTTP basic authentication. Update queries will modify the\n    triple store directly, and a separate `update_uri` can be specified\n    for databases which require it.\n\n##### Utility Actions\n- `markdown` transforms a `.md` file referenced in `source` into an HTML output specified in `target`.\n- `graph` reads RDF files provided via the `source` and `includes`/`excludes` options and generates a graphical\n  representation of the ontology, as in the `graphic` sub-command described above. Both `.dot` and\n  `.png` outputs are written to the directory specified in the `target` option, and `title` and\n  `version` attributes configure the title on the generated graph. 
If `compact` is specified as\n  `True`, a concise graph including only ontology names and imports is generated.\n\n##### Validation\nThe `verify` action reads RDF files provided via the `source` and `includes`/`excludes` options and performs validation on the\nresulting combined graph. If the validation fails, the bundle process exits with a non-zero status and\ndoes not execute subsequent actions. The type of verification performed depends on the\nvalue of the `type` option:\n* If `type` is `select`, one or more SPARQL `SELECT` queries are executed against the graph, and the\n  first query to return a non-empty result will terminate the bundle. The results of the query will\n  be output to the log, and also written as CSV to a file path specified by the `target` option, if\n  provided. Queries can be specified in one of two ways (only one can be present):\n  * If the `query` option is a valid file path, the query is read from that file,\n    otherwise the contents of the `query` option are interpreted as the query, e.g.\n    ```yaml\n    query: >\n      prefix skos: <http://www.w3.org/2004/02/skos/core#>\n      select ?unlabeled where {{\n        ?unlabeled a ?type .\n        filter not exists {{ ?unlabeled skos:prefLabel ?label }}\n      }}\n    ```\n  * If `queries` is provided, a list of queries will be built from the `source` and `includes`/`excludes`\n    sub-options. The queries will be executed in the order specified. If `stopOnFail` is omitted or\n    is `true`, the first query that produces a failing result will cause `verify` to abort. 
If\n    `stopOnFail` is `false`, all queries will be executed regardless of failures, and the value\n    of `target` is treated as a directory where the results of _each_ failing query will be written.\n    ```yaml\n    - action: 'verify'\n      type: 'select'\n      source: '{input}'\n      includes:\n        - 'verify_data.ttl'\n      target: '{output}/verify_select_results'\n      stopOnFail: false\n      queries:\n        source: '{input}'\n        includes:\n          - 'verify_*_select_query.rq'\n    ```\n* If `type` is `ask`, one or more SPARQL `ASK` queries will be executed. Queries are\n  specified similarly to the `select` validation. Unless `stopOnFail` is set to `false`, the bundle\n  will terminate on the first query whose result does not match the required `expected` option.\n  For example:\n  ```yaml\n  actions:\n    - action: 'verify'\n      type: 'ask'\n      source: '{input}'\n      includes:\n        - 'verify_data.ttl'\n      queries:\n        source: '{input}'\n        includes:\n          - '*_ask_query.rq'\n      expected: false\n  ```\n* If `type` is `shacl`, a SHACL shape graph will be constructed from the file specified via the `shapes`\n  option (which must have a `source`, and optionally `includes`/`excludes`), with the bundle terminating only if\n  any `sh:Violation` results are present, unless the `failOn` option specifies otherwise.\n  The report is emitted to the log, and saved as Turtle to the path specified in the `target` option if it's provided.\n  For example:\n  ```yaml\n  - action: 'verify'\n    type: 'shacl'\n    inference: 'rdfs'\n    source: '{input}'\n    includes:\n      - 'verify_data.ttl'\n    target: '{output}/verify_shacl_errors.ttl'\n    failOn: \"warning\"\n    shapes:\n      source: '{input}/verify_shacl_shapes.ttl'\n  ```\n  If the `inference` option is provided, the reasoner will be run on the graph prior\n  to applying the SHACL rules. 
The valid values are:\n  * `rdfs`,\n  * `owlrl`,\n  * `both`, or\n  * `none` (default).\n* If `type` is `construct`, the queries are expected to `CONSTRUCT` a [SHACL ValidationReport](https://www.w3.org/TR/shacl/#validation-report).\n  The validation will be considered a failure if the resulting graph is non-empty. `target`,\n  `stopOnFail` and `query`/`queries` are handled the same as for `select` validation, and `failOn` is used to determine which\n  violations will terminate execution.\n* Validation can be performed against a SPARQL endpoint instead of local RDF\n  data by specifying `endpoint` instead of `source`/`includes`. `endpoint` must\n  contain a `query_uri`, and can optionally specify `user`/`password` which will\n  authenticate via HTTP basic authentication. For example:\n  ```yaml\n  - action: 'verify'\n    type: 'construct'\n    endpoint:\n      query_uri: 'https://my.endpoint.com/sparql'\n      user: 'test-user'\n      password: 'test-user'\n    target: '{output}/verify_construct_results'\n    stopOnFail: false\n    query: '{input}/verify_via_construct.rq'\n  ```\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "Ontology Maintenance and Release Tool",
"version": "1.8.0",
"project_urls": {
"Homepage": "https://github.com/semanticarts/ontology-toolkit"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "916baa143a3595eda00789c6d8ad9f6d5d29b834921b9a7774c19aad6b341a15",
"md5": "10c5a583ed6976451b1e231fea58ff71",
"sha256": "05086850ca0daea4e6ee415fd289616dab074b08a58fd7d58ed4ab58c9efc712"
},
"downloads": -1,
"filename": "onto_tool-1.8.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "10c5a583ed6976451b1e231fea58ff71",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 50998,
"upload_time": "2023-10-17T18:51:33",
"upload_time_iso_8601": "2023-10-17T18:51:33.083293Z",
"url": "https://files.pythonhosted.org/packages/91/6b/aa143a3595eda00789c6d8ad9f6d5d29b834921b9a7774c19aad6b341a15/onto_tool-1.8.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "fa87a999de517a96ef88c29db453740a380d4372485c1b75045a2ab9415bbaae",
"md5": "c471f5ebebcc3c25e19050f52e8bc937",
"sha256": "f0f3bb95843ba4cc97ff09f361541e323ebc2c4c40585277d257b544bd79b639"
},
"downloads": -1,
"filename": "onto_tool-1.8.0.tar.gz",
"has_sig": false,
"md5_digest": "c471f5ebebcc3c25e19050f52e8bc937",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 52455,
"upload_time": "2023-10-17T18:51:48",
"upload_time_iso_8601": "2023-10-17T18:51:48.161928Z",
"url": "https://files.pythonhosted.org/packages/fa/87/a999de517a96ef88c29db453740a380d4372485c1b75045a2ab9415bbaae/onto_tool-1.8.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-10-17 18:51:48",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "semanticarts",
"github_project": "ontology-toolkit",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "rdflib",
"specs": [
[
">=",
"6.1.1"
]
]
},
{
"name": "pyshacl",
"specs": [
[
"~=",
"0.17.0"
]
]
},
{
"name": "pydot",
"specs": [
[
">=",
"1.4.1"
]
]
},
{
"name": "jinja2",
"specs": [
[
">=",
"2.11.2"
]
]
},
{
"name": "markdown2",
"specs": [
[
">=",
"2.4.9"
]
]
},
{
"name": "pyyaml",
"specs": [
[
">=",
"5.3.1"
]
]
},
{
"name": "pip",
"specs": [
[
">=",
"19.0.3"
]
]
},
{
"name": "MarkupSafe",
"specs": [
[
">=",
"1.1.1"
]
]
},
{
"name": "six",
"specs": [
[
">=",
"1.14.0"
]
]
},
{
"name": "pyparsing",
"specs": [
[
"==",
"2.4.7"
]
]
},
{
"name": "isodate",
"specs": [
[
">=",
"0.6.0"
]
]
},
{
"name": "jsonschema",
"specs": [
[
">=",
"3.2.0"
]
]
},
{
"name": "attrs",
"specs": [
[
">=",
"19.3.0"
]
]
},
{
"name": "pyrsistent",
"specs": [
[
">=",
"0.16.0"
]
]
},
{
"name": "setuptools",
"specs": [
[
">=",
"40.8.0"
]
]
},
{
"name": "namedentities",
"specs": [
[
">=",
"1.5.2"
]
]
},
{
"name": "zipp",
"specs": [
[
">=",
"3.1.0"
]
]
},
{
"name": "pytest",
"specs": [
[
"~=",
"6.1.2"
]
]
},
{
"name": "SPARQLWrapper",
"specs": [
[
"~=",
"1.8.5"
]
]
}
],
"lcname": "onto-tool"
}