# Universal Ink Library
[![Python package](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/python-package.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/python-package.yml)
[![Pylint](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pylint.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pylint.yml)
[![PyTest](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pytest.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pytest.yml)
![License: Apache 2](https://img.shields.io/badge/License-Apache2-green.svg)
[![PyPI](https://img.shields.io/pypi/v/universal-ink-library.svg)](https://pypi.python.org/pypi/universal-ink-library)
[![PyPI](https://img.shields.io/pypi/pyversions/universal-ink-library.svg)](https://pypi.python.org/pypi/universal-ink-library)
[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://developer-docs.wacom.com/sdk-for-ink/docs/model)
![Contributors](https://img.shields.io/github/contributors/Wacom-Developer/universal-ink-library.svg)
![GitHub forks](https://img.shields.io/github/forks/Wacom-Developer/universal-ink-library.svg)
![GitHub stars](https://img.shields.io/github/stars/Wacom-Developer/universal-ink-library.svg)
Universal Ink Library is a pure Python package for working with Universal Ink Models ([UIM](https://developer.wacom.com/products/universal-ink-model)).
The UIM defines a language-neutral and platform-neutral data model for representing and manipulating digital ink data captured using an electronic pen or stylus, or using touch input.
The main aspects of the UIM are:
- Interoperability of ink-based data models by defining a standardized interface with other systems
- Biometric data storage mechanism
- Spline data storage mechanism
- Rendering configurations storage mechanism
- Ability to compose spline/raw-input based logical trees, which are contained within the ink model
- Portability, by enabling conversion to common industry standards
- Extensibility, by enabling the description of ink data related semantic meta-data
- Standardized serialization mechanism
This reference document defines a RIFF container and Protocol Buffers schema for serialization of ink models as well as
a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink
model and external entities.
The specified serialization schema is based on the following standards:
- **Resource Interchange File Format (RIFF)** - A generic file container format for storing data in tagged chunks
- **Protocol Buffers v3** - A language-neutral, platform-neutral extensible mechanism for serializing structured data
- **Resource Description Framework (RDF)** - A standard model for data interchange on the Web
- **OWL 2 Web Ontology Language (OWL2)** - An ontology language for the Semantic Web with formally defined meaning
## Data Model
The *Universal Ink Model* has five fundamental categories:
- **Input data**: A collection of data repositories, holding raw sensor input, input device/provider configurations, sensor channel configurations, etc. Each data repository keeps certain data-sets isolated and is responsible for specific type(s) of data
- **Ink data**: The visual appearance of the digital ink, presented as ink geometry with rendering configurations
- **Meta-data**: Related meta-data about the environment, input devices, etc.
- **Ink Trees / Views**: A collection of logical trees, representing structures of hierarchically organized paths or raw input data-frames
- **Semantic triple store**: An RDF compliant triple store, holding semantic information, such as text structure, handwriting recognition results, and semantic entities
The diagram below illustrates the different logical parts of the ink model.
![Logical Parts of Ink Model.](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/uim-v1.png)
This UML diagram illustrates the complete Ink Model in terms of logical models and class dependencies.
![UML Diagram](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/uim-uml-all-v9.png)
The *Universal Ink Model* provides the flexibility required for a variety of applications, since the display of pen data is only one aspect.
For example, the same data can be used for data mining or even signature comparison, while the ink display can be on a range of platforms potentially requiring different scaling and presentation.
## Input data
In reality, pen data is captured from a pen device as a set of positional points:
![Digital-ink-w](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/overview_ink_device_sensor_channels.png)
Depending on the type of hardware, in addition to the x/y positional coordinates, the points can contain further information such as pen tip force and angle.
Collectively, this information is referred to as sensor data and the *Universal Ink Model* provides a means of storing all the available data.
For example, with some types of hardware, pen hover coordinates can be captured while the pen is not in contact with the surface.
The information is saved in the *Universal Ink Model* and can be used when required.
## Ink data
Ink data is the result of the [ink geometry pipeline](https://developer-docs.wacom.com/sdk-for-ink/docs/pipeline) of the [WILL SDK for ink](https://developer.wacom.com/products/will-sdk-for-ink).
Pen strokes are identified as continuous sets of pen coordinates captured while the pen is in contact with the surface.
For example, writing the letter ‘w', as illustrated below.
The process converts each pen stroke into a mathematical representation, which can then be used to render the shape on a display.
Steps in the so-called Ink Geometry pipeline are illustrated below where each step is configured by an application to generate the desired output:
![Digital-ink-rendering](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/pen-data-w-rendering.png)
As a result, the data points are smoothed and shaped to produce the desired representation.
For example, simulating the appearance of a felt-tip ink pen.
Raster and vector rendering is supported with a selection of rendering brush types.
The results are saved as Ink data, containing ink geometry and rendering information.
## Meta-data
Meta-data is added as data about the pen data.
The *Universal Ink Model* allows for administrative information such as author name, location, pen data source, etc.
Further meta-data is computed by analysis of the pen data.
An example of digital ink is annotated below:
![Digital-ink-annotated](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/pen-data-annotated.png)
The labels identify pen strokes *s1, s2, s3*, etc.
In addition, groups of strokes are identified as *g1, g2, g3*, etc.
Pen strokes are passed to a handwriting recognition engine, and the results are stored as additional meta-data, generally referred to as semantic data.
The semantic data is stored with reference to the groups, categorized as single characters, individual words, lines of text, and so on.
## Semantic Data
The Ink Model Specification provides a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink model and external entities.
The Ink Model keeps an instance of a RDF or WODL-compliant triple store, called Knowledge Graph in the scope of this document.
This triple store holds a list of semantic triples to encode relationships between subject, predicate and object as defined in the RDF specification.
Using the knowledge graph nodes of the ink trees, contained within the ink model, could be annotated with additional metadata in order to describe different aspects of the ink model, for instance - text segmentation view, named entity recognition view, etc.
### Wacom Ontology Definition Language (WODL)
At a fundamental level, digital ink can be used producing content, such as text, math, diagrams, sketches etc.
The Wacom Ontology Description Language (WODL) provides a standardized, JSON-based way of annotating ink with a specialized schema definition.
The specification of WODL language is available [here](https://developer-docs.wacom.com/docs/specifications/wodl/).
Some of the most common schema definitions are:
- (Segmentation Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/segmentation/]
- (Math Structures Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/math-structures/]
- (Named Entity Recognition Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/ner/]
# Installation
Our Universal Ink Library can be installed using pip.
``
$ pip install universal-ink-library
``
# Quick Start
## File handling
### Loading UIM
The `UIMParser` is be used to load a serialized Universal Ink Model in version 3.0.0 or 3.1.0 and you receive the memory model `InkModel` which can be used for extracting the relevant data.
```python
from uim.codec.parser.uim import UIMParser
from uim.model.ink import InkModel
parser: UIMParser = UIMParser()
# ---------------------------------------------------------------------------------
# Parse a UIM file version 3.0.0
# ---------------------------------------------------------------------------------
ink_model_1: InkModel = UIMParser().parse('../ink/uim_3.0.0/1) Value of Ink 1.uim')
# ---------------------------------------------------------------------------------
# Parse a UIM file version 3.1.0
# ---------------------------------------------------------------------------------
ink_model_2: InkModel = UIMParser().parse('../ink/uim_3.1.0/1) Value of Ink 1.uim')
```
### Saving of UIM
Saving the `InkModel` as a Universal Ink Model file.
```python
from uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310
from uim.model.ink import InkModel
ink_model: InkModel = InkModel()
...
# Save the model, this will overwrite an existing file
with open('3_1_0.uim', 'wb') as uim:
# unicode(data) auto-decodes data to unicode if str
uim.write(UIMEncoder310().encode(ink_model))
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_file_handling.py)
## InkModel
### Iterate over semantics
If the `InkModel` is enriched with semantics from handwriting recognition and named entity recognition, or named entity linking.
The semantics an be access with a helper function `uim_extract_text_and_semantics_from` or by iterating the views, like shown in `uim_extract_text_and_semantics_from` function:
```python
from pathlib import Path
from uim.codec.parser.uim import UIMParser
from uim.model.helpers.text_extractor import uim_extract_text_and_semantics_from
from uim.model.ink import InkModel
from uim.model.semantics.schema import CommonViews
if __name__ == '__main__':
parser: UIMParser = UIMParser()
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0' /
'2) Digital Ink is processable 1 (3.1 delta).uim')
if ink_model.has_knowledge_graph() and ink_model.has_tree(CommonViews.HWR_VIEW.value):
# The sample
words, entities, text = uim_extract_text_and_semantics_from(ink_model, hwr_view=CommonViews.HWR_VIEW.value)
print('=' * 100)
print(' Recognised text: ')
print(text)
print('=' * 100)
print(' Words:')
print('=' * 100)
for word_idx, word in enumerate(words):
print(f' Word #{word_idx + 1}:')
print(f' Text: {word["text"]}')
print(f' Alternatives: {word["alternatives"]}')
print(f' Bounding box: x:={word["bounding_box"]["x"]}, y:={word["bounding_box"]["y"]}, '
f'width:={word["bounding_box"]["width"]}, height:={word["bounding_box"]["height"]}')
print('')
print('=' * 100)
print(' Entities:')
print('=' * 100)
entity_idx: int = 1
for entity_uri, entity_mappings in entities.items():
print(f' Entity #{entity_idx}: URI: {entity_uri}')
print("-" * 100)
print(f" Label: {entity_mappings[0]['label']}")
print(' Ink Stroke IDs:')
for word_idx, entity in enumerate(entity_mappings):
print(f" #{word_idx + 1}: Word match: {entity['path_id']}")
print('=' * 100)
entity_idx += 1
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_semantic_data.py)
A more generic approach to extract schema elements is the helper function `uim_schema_semantics_from`.
This function extracts schema elements from the ink model, like math structures, text, tables, etc..
An example for extracting math structures is shown below:
```python
{
'node_uri': UUID('16918b3f-b192-466e-83a3-54835ddfff11'),
'parent_uri': UUID('16918b3f-b192-466e-83a3-54835ddfff11'),
'path_id': [
UUID('16918b3f-b192-466e-83a3-54835ddfff11')
],
'bounding_box': {
'x': 175.71, 'y': 150.65,
'width': 15.91, 'height': 27.018
},
'type': 'will:math-structures/0.1/Symbol',
'attributes': [
('symbolType', 'Numerical'), ('representation', 'e')
]
}
```
With the extracted schema elements, you can build a tree structure, like shown in the following example:
```python
from pathlib import Path
from typing import List, Dict, Any, Tuple
from uim.codec.parser.uim import UIMParser
from uim.model.helpers.schema_content_extractor import uim_schema_semantics_from
from uim.model.ink import InkModel
from uim.model.semantics.schema import CommonViews
def build_tree(node_list: List[Dict[str, Any]]):
"""
Build a tree structure from the node list.
Parameters
----------
node_list: `List[Dict[str, Any]]`
List of nodes
"""
# Step 1: Create dictionaries for nodes and parent-child relationships
children_dict: Dict[str, Any] = {}
for node in node_list:
parent_uri: str = node['parent_uri']
if parent_uri is not None:
if parent_uri not in children_dict:
children_dict[parent_uri] = []
children_dict[parent_uri].append(node)
# Step 2: Define a recursive function to print the tree
def print_tree(node_display: Dict[str, Any], indent: int = 0):
info: str = ""
attributes: List[Tuple[str, Any]] = node_display.get('attributes', [])
if "path_id" in node_display:
info = f"(#strokes:={len(node_display['path_id'])})"
elif "bounding_box" in node_display:
info = (f"(x:={node_display['bounding_box']['x']}, y:={node_display['bounding_box']['y']}, "
f"width:={node_display['bounding_box']['width']}, "
f"height:={node_display['bounding_box']['height']})")
print('|' + '-' * indent + f" [type:={node_display['type']}] - {info}")
if len(attributes) > 0:
print('|' + ' ' * (indent + 4) + "| -[Attributes:]")
for key, value in attributes:
print('|' + ' ' * (indent + 8) + f"\t|-- {key}:={value}")
if node_display['node_uri'] in children_dict:
for child in children_dict[node_display['node_uri']]:
print_tree(child, indent + 4)
# Step 3: Find the root node (where parent_uri is None) and start printing the tree
for node in node_list:
if node['parent_uri'] is None:
print_tree(node)
if __name__ == '__main__':
parser: UIMParser = UIMParser()
# Parse UIM v3.0.0
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'schemas' / 'math-structures.uim')
math_structures: List[Dict[str, Any]] = uim_schema_semantics_from(ink_model,
semantic_view=CommonViews.HWR_VIEW.value)
# Print the tree structure
build_tree(math_structures)
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_math.py)
### Accessing input and ink data
In order to access ink input configuration data, sensor data, or stroke data from `InkModel`, you can use the following functions:
```python
from pathlib import Path
from typing import Dict
from uuid import UUID
from uim.codec.parser.uim import UIMParser
from uim.model.ink import InkModel
from uim.model.inkinput.inputdata import InkInputType, InputContext, SensorContext, InputDevice
from uim.model.inkinput.sensordata import SensorData
if __name__ == '__main__':
parser: UIMParser = UIMParser()
# This file contains ink from different providers: PEN, TOUCH, MOUSE
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0' /
'6) Different Input Providers.uim')
mapping_type: Dict[UUID, InkInputType] = {}
if ink_model.has_ink_structure():
print('InkInputProviders:')
print('-------------------')
# Iterate Ink input providers
for ink_input_provider in ink_model.input_configuration.ink_input_providers:
print(f' InkInputProvider. ID: {ink_input_provider.id} | type: {ink_input_provider.type}')
mapping_type[ink_input_provider.id] = ink_input_provider.type
print()
print('Strokes:')
print('--------')
# Iterate over strokes
for stroke in ink_model.strokes:
print(f'|- Stroke (id:={stroke.id} | points count: {stroke.points_count})')
if stroke.style and stroke.style.path_point_properties:
print(f'| |- Style (render mode:={stroke.style.render_mode_uri} | color:=('
f'red: {stroke.style.path_point_properties.red}, '
f'green: {stroke.style.path_point_properties.green}, '
f'blue: {stroke.style.path_point_properties.green}, '
f'alpha: {stroke.style.path_point_properties.alpha}))')
# Stroke is produced by sensor data being processed by the ink geometry pipeline
sd: SensorData = ink_model.sensor_data.sensor_data_by_id(stroke.sensor_data_id)
# Get InputContext for the sensor data
input_context: InputContext = ink_model.input_configuration.get_input_context(sd.input_context_id)
# Retrieve SensorContext
sensor_context: SensorContext = ink_model.input_configuration\
.get_sensor_context(input_context.sensor_context_id)
for scc in sensor_context.sensor_channels_contexts:
# Sensor channel context is referencing input device
input_device: InputDevice = ink_model.input_configuration.get_input_device(scc.input_device_id)
print(f'| |- Input device (id:={input_device.id} | type:=({mapping_type[scc.input_provider_id]})')
# Iterate over sensor channels
for c in scc.channels:
print(f'| | |- Sensor channel (id:={c.id} | name: {c.type.name} '
f'| values: {sd.get_data_by_id(c.id).values}')
print('|')
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_input_and_ink.py)
## Creating an Ink Model
Creating an `InkModel` from the scratch.
The [csv file](ink/sensor_data/ink.csv) contains sensor data for strokes.
The script loads the sensor data from the CSV file and creates strokes from the sensor data.
```csv
idx,SPLINE_X,SPLINE_Y,SENSOR_TIMESTAMP,SENSOR_PRESSURE,SENSOR_ALTITUDE,SENSOR_AZIMUTH
0,277.1012268066406,183.11183166503906,1722443386312.649,0.07,0.6,0.72
0,277.1012268066406,183.1713409423828,1722443386312.653,0.11000000000000001,0.6,0.72
...
```
```python
import csv
import uuid
from collections import defaultdict
from pathlib import Path
from typing import List, Dict
from uim.codec.parser.base import SupportedFormats
from uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310
from uim.model.base import UUIDIdentifier
from uim.model.helpers.serialize import json_encode
from uim.model.ink import InkModel, InkTree, ViewTree
from uim.model.inkdata.brush import VectorBrush, BrushPolygon, BrushPolygonUri
from uim.model.inkdata.strokes import Spline, Style, Stroke, LayoutMask
from uim.model.inkinput.inputdata import Environment, InkInputProvider, InkInputType, InputDevice, SensorChannel, \
InkSensorType, InkSensorMetricType, SensorChannelsContext, SensorContext, InputContext, unit2unit, Unit
from uim.model.inkinput.sensordata import SensorData, InkState
from uim.model.semantics import schema
from uim.model.semantics.node import StrokeGroupNode, StrokeNode, URIBuilder
from uim.utils.matrix import Matrix4x4
def create_sensor_data(data_collection: Dict[str, List[float]],
input_context_id: uuid.UUID, channels: List[SensorChannel]) -> SensorData:
"""
Create sensor data from a data collection.
Parameters
----------
data_collection: Dict[str, List[float]]
input_context_id
channels
Returns
-------
SensorData
Instance of SensorData
"""
sd: SensorData = SensorData(UUIDIdentifier.id_generator(), input_context_id=input_context_id, state=InkState.PLANE)
sd.add_data(channels[0], [unit2unit(Unit.DIP, Unit.M, v) for v in data_collection['SPLINE_X']])
sd.add_data(channels[1], [unit2unit(Unit.DIP, Unit.M, v) for v in data_collection['SPLINE_Y']])
sd.add_timestamp_data(channels[2], data_collection['SENSOR_TIMESTAMP'])
sd.add_data(channels[3], data_collection['SENSOR_PRESSURE'])
sd.add_data(channels[4], data_collection['SENSOR_AZIMUTH'])
sd.add_data(channels[5], data_collection['SENSOR_ALTITUDE'])
return sd
def load_sensor_data(csv_path: Path, input_context_id: uuid.UUID, channels: List[SensorChannel]) -> List[SensorData]:
"""
Load sensor data from a CSV file.
Parameters
----------
csv_path: Path
Path to the CSV file
input_context_id: uuid.UUID
Input context ID
channels: List[SensorChannel]
List of sensor channels
Returns
-------
List[SensorData]
List of sensor data
"""
sensor_data_values: List[SensorData] = []
data_collection: Dict[str, List[float]] = defaultdict(list)
with csv_path.open('r') as f:
reader = csv.reader(f)
header: List[str] = next(reader)
if header != ['idx', 'SPLINE_X', 'SPLINE_Y',
'SENSOR_TIMESTAMP', 'SENSOR_PRESSURE', 'SENSOR_ALTITUDE', 'SENSOR_AZIMUTH']:
raise ValueError("Invalid CSV file format")
last_idx: int = 0
for row in reader:
row_idx: int = int(row[0])
if row_idx != last_idx:
sensor_data_values.append(create_sensor_data(data_collection, input_context_id, channels))
data_collection.clear()
for idx, value in enumerate(row[1:], start=1):
data_collection[header[idx]].append(float(value))
last_idx = row_idx
if len(data_collection) > 0:
sensor_data_values.append(create_sensor_data(data_collection, input_context_id, channels))
return sensor_data_values
def create_strokes(sensor_data_items: List[SensorData], style_stroke: Style, x_id: uuid.UUID, y_id: uuid.UUID) \
-> List[Stroke]:
"""
Create strokes from sensor data.
Parameters
----------
sensor_data_items: List[SensorData]
List of sensor data
style_stroke: Style
Style of the stroke
x_id: uuid.UUID
Reference id of x sensor channel
y_id: uuid.UUID
Reference id of y sensor channel
Returns
-------
List[Stroke]
List of strokes
"""
stroke_items: List[Stroke] = []
for sensor_data_i in sensor_data_items:
path: List[float] = []
# The spline path contains x, y values
mask: int = LayoutMask.X.value | LayoutMask.Y.value
for x, y in zip(sensor_data_i.get_data_by_id(x_id).values, sensor_data_i.get_data_by_id(y_id).values):
path.append(unit2unit(Unit.M, Unit.DIP, x))
path.append(unit2unit(Unit.M, Unit.DIP, y))
spline: Spline = Spline(layout_mask=mask, data=path)
# Create a stroke from spline
s_i: Stroke = Stroke(sid=UUIDIdentifier.id_generator(), spline=spline, style=style_stroke)
stroke_items.append(s_i)
return stroke_items
if __name__ == '__main__':
# Creates an ink model from the scratch.
ink_model: InkModel = InkModel(version=SupportedFormats.UIM_VERSION_3_1_0.value)
# Setting a unit scale factor
ink_model.unit_scale_factor = 1.5
# Using a 4x4 matrix for scaling
ink_model.transform = Matrix4x4.create_scale(1.5)
# Properties are added as key-value pairs
ink_model.properties.append(("Author", "Markus"))
ink_model.properties.append(("Locale", "en_US"))
# Create an environment
env: Environment = Environment()
# This should describe the environment in which the ink was captured
env.properties.append(("wacom.ink.sdk.lang", "js"))
env.properties.append(("wacom.ink.sdk.version", "2.0.0"))
env.properties.append(("runtime.type", "WEB"))
env.properties.append(("user.agent.brands", "Chromium 126, Google Chrome 126"))
env.properties.append(("user.agent.platform", "macOS"))
env.properties.append(("user.agent.mobile", "false"))
env.properties.append(("app.id", "sample_create_model_vector"))
env.properties.append(("app.version", "1.0.0"))
ink_model.input_configuration.environments.append(env)
# Ink input provider can be pen, mouse or touch.
provider: InkInputProvider = InkInputProvider(input_type=InkInputType.PEN)
ink_model.input_configuration.ink_input_providers.append(provider)
# Input device is the sensor (pen tablet, screen, etc.)
input_device: InputDevice = InputDevice()
input_device.properties.append(("dev.manufacturer", "Wacom"))
input_device.properties.append(("dev.model", "Wacom One"))
input_device.properties.append(("dev.product.code", "DTC-133"))
input_device.properties.append(("dev.graphics.resolution", "1920x1080"))
ink_model.input_configuration.devices.append(input_device)
# Create a group of sensor channels
sensor_channels: list = [
SensorChannel(channel_type=InkSensorType.X, metric=InkSensorMetricType.LENGTH, resolution=1.0,
ink_input_provider_id=provider.id, input_device_id=input_device.id),
SensorChannel(channel_type=InkSensorType.Y, metric=InkSensorMetricType.LENGTH, resolution=1.0,
ink_input_provider_id=provider.id, input_device_id=input_device.id),
SensorChannel(channel_type=InkSensorType.TIMESTAMP, metric=InkSensorMetricType.TIME, resolution=1000.0,
precision=0,
ink_input_provider_id=provider.id, input_device_id=input_device.id),
SensorChannel(channel_type=InkSensorType.PRESSURE, metric=InkSensorMetricType.NORMALIZED, resolution=1.0,
channel_min=0., channel_max=1.0,
ink_input_provider_id=provider.id, input_device_id=input_device.id),
SensorChannel(channel_type=InkSensorType.ALTITUDE, metric=InkSensorMetricType.ANGLE, resolution=1.0,
channel_min=0., channel_max=1.5707963705062866,
ink_input_provider_id=provider.id, input_device_id=input_device.id),
SensorChannel(channel_type=InkSensorType.AZIMUTH, metric=InkSensorMetricType.ANGLE, resolution=1.0,
channel_min=-3.1415927410125732, channel_max=3.1415927410125732,
ink_input_provider_id=provider.id, input_device_id=input_device.id)
]
# Create a sensor channels context
scc_wacom_one: SensorChannelsContext = SensorChannelsContext(channels=sensor_channels,
ink_input_provider_id=provider.id,
input_device_id=input_device.id,
latency=0,
sampling_rate_hint=240)
# Add sensor channel contexts
sensor_context: SensorContext = SensorContext()
sensor_context.add_sensor_channels_context(scc_wacom_one)
ink_model.input_configuration.sensor_contexts.append(sensor_context)
# Create the input context using the Environment and the Sensor Context
input_context: InputContext = InputContext(environment_id=env.id, sensor_context_id=sensor_context.id)
ink_model.input_configuration.input_contexts.append(input_context)
# Create sensor data
# The CSV file contains sensor data for strokes
# idx,SPLINE_X,SPLINE_Y,SENSOR_TIMESTAMP,SENSOR_PRESSURE,SENSOR_ALTITUDE,SENSOR_AZIMUTH
sensor_data = load_sensor_data(Path(__file__).parent / '..' / 'ink' / 'sensor_data' / 'ink.csv', input_context.id,
sensor_channels)
# Add sensor data to the model
for sensor_data_i in sensor_data:
ink_model.sensor_data.add(sensor_data_i)
# We need to define a brush polygon
points: list = [(10, 10), (0, 10), (0, 0), (10, 0)]
brush_polygons: list = [BrushPolygon(min_scale=0., points=points)]
# Create the brush object using polygons
vector_brush_0: VectorBrush = VectorBrush(
"app://qa-test-app/vector-brush/MyTriangleBrush",
brush_polygons)
# Add it to the model
ink_model.brushes.add_vector_brush(vector_brush_0)
# Add a brush specified with shape Uris
poly_uris: list = [
BrushPolygonUri("will://brush/3.0/shape/Circle?precision=20&radius=1", 0.),
BrushPolygonUri("will://brush/3.0/shape/Ellipse?precision=20&radiusX=1&radiusY=0.5", 4.0)
]
# Define a second brush
vector_brush_1: VectorBrush = VectorBrush(
"app://qa-test-app/vector-brush/MyEllipticBrush",
poly_uris)
# Add it to the model
ink_model.brushes.add_vector_brush(vector_brush_1)
# Specify the layout of the stroke data, in this case the stroke will have variable X, Y and Size properties.
layout_mask: int = LayoutMask.X.value | LayoutMask.Y.value | LayoutMask.SIZE.value
# Create some style
style: Style = Style(brush_uri=vector_brush_1.name)
# Set the color of the strokes
style.path_point_properties.red = 0.1
style.path_point_properties.green = 0.2
style.path_point_properties.blue = 0.4
style.path_point_properties.alpha = 1.0
# Create the strokes
strokes = create_strokes(sensor_data, style, sensor_channels[0].id, sensor_channels[1].id)
# First you need a root group to contain the strokes
root: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())
# Assign the group as the root of the main ink tree
ink_model.ink_tree = InkTree()
ink_model.ink_tree.root = root
# Adding the strokes to the root group
for stroke in strokes:
root.add(StrokeNode(stroke))
# Adding view for handwriting recognition results
hwr_tree: ViewTree = ViewTree(schema.CommonViews.HWR_VIEW.value)
# Add view right after creation, to avoid warnings that tree is not yet attached
ink_model.add_view(hwr_tree)
# Create a root node for the HWR view
hwr_root: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())
hwr_tree.root = hwr_root
ink_model.knowledge_graph.append(schema.SemanticTriple(hwr_root.uri, schema.IS, schema.SegmentationSchema.ROOT))
ink_model.knowledge_graph.append(schema.SemanticTriple(hwr_root.uri, schema.SegmentationSchema.REPRESENTS_VIEW,
schema.CommonViews.HWR_VIEW.value))
# Here you can add the same strokes as in the main tree, but you can organize them in a different way
# (put them in different groups)
# You are not supposed to add strokes that are not already in the main tree.
text_region: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())
hwr_root.add(text_region)
ink_model.knowledge_graph.append(schema.SemanticTriple(text_region.uri, schema.IS,
schema.SegmentationSchema.TEXT_REGION))
# The text_line root denotes the text line
text_line: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())
text_region.add(text_line)
ink_model.knowledge_graph.append(schema.SemanticTriple(text_line.uri, schema.IS,
schema.SegmentationSchema.TEXT_LINE))
# The word node denotes a word
word: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())
text_line.add(word)
ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.IS, schema.SegmentationSchema.WORD))
ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.SegmentationSchema.HAS_CONTENT, "ink"))
ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.SegmentationSchema.HAS_LANGUAGE, "en_US"))
# Add the strokes to the word
for stroke_i in strokes:
word.add(StrokeNode(stroke_i))
# We need a URI builder
uri_builder: URIBuilder = URIBuilder()
# Create a named entity
named_entity_uri: str = uri_builder.build_named_entity_uri(UUIDIdentifier.id_generator())
ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri,
schema.NamedEntityRecognitionSchema.PART_OF_NAMED_ENTITY,
named_entity_uri))
# Add knowledge for the named entity
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri, "hasPart-0", word.uri))
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,
schema.NamedEntityRecognitionSchema.HAS_LABEL, "Ink"))
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,
schema.NamedEntityRecognitionSchema.HAS_LANGUAGE, "en_US"))
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,
schema.NamedEntityRecognitionSchema.HAS_CONFIDENCE, "0.95"))
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,
schema.NamedEntityRecognitionSchema.HAS_ARTICLE_URL,
'https://en.wikipedia.org/wiki/Ink'))
ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,
schema.NamedEntityRecognitionSchema.HAS_UNIQUE_ID, 'Q127418'))
# Save the model, this will overwrite an existing file
with open('3_1_0_vector.uim', 'wb') as uim:
# unicode(data) auto-decodes data to unicode if str
uim.write(UIMEncoder310().encode(ink_model))
# Convert the model to JSON
with open('ink.json', 'w') as f:
# json_encode is a helper function to convert the model to JSON
f.write(json_encode(ink_model))
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_create_model_vector.py)
## Converting Ink Model
### To JSON
The `InkModel` can be converted to JSON format using the `json_encode` helper function.
This is useful for debugging purposes or for storing the model in a human-readable format.
Deserialization is not supported.
```python
from pathlib import Path
from uim.codec.parser.uim import UIMParser
from uim.model.helpers.serialize import json_encode
from uim.model.ink import InkModel
if __name__ == '__main__':
parser: UIMParser = UIMParser()
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'special' / 'ink.uim')
# Convert the model to JSON
with open('ink.json', 'w') as f:
# json_encode is a helper function to convert the model to JSON
f.write(json_encode(ink_model))
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_to_json.py)
### Sensor data to CSV
The sensor data can be exported to a CSV file.
```python
from pathlib import Path
from typing import List
from uim.codec.parser.uim import UIMParser
from uim.model.helpers.serialize import serialize_sensor_data_csv
from uim.model.ink import InkModel
from uim.model.inkdata.strokes import InkStrokeAttributeType
if __name__ == '__main__':
parser: UIMParser = UIMParser()
# This file contains ink from different providers: PEN, TOUCH, MOUSE
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'special' / 'ink.uim')
# Decide which attributes to serialize
layout: List[InkStrokeAttributeType] = [
InkStrokeAttributeType.SPLINE_X, InkStrokeAttributeType.SPLINE_Y, InkStrokeAttributeType.SENSOR_TIMESTAMP,
InkStrokeAttributeType.SENSOR_PRESSURE, InkStrokeAttributeType.SENSOR_ALTITUDE,
InkStrokeAttributeType.SENSOR_AZIMUTH
]
# Serialize the model to CSV
serialize_sensor_data_csv(ink_model, Path('sensor_data.csv'), layout=layout)
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_to_csv.py)
## Extracting statistics
The `StatisticsAnalyzer` can be used to extract statistics from the `InkModel`.
The statistics are extracted from the ink data, sensor data, and input configuration.
```python
from pathlib import Path
from typing import Dict, Any
from uim.codec.parser.uim import UIMParser
from uim.model.ink import InkModel
from uim.utils.statistics import StatisticsAnalyzer
def print_model_stats(key: str, value: Any, indent: str = ""):
"""
Print the model statistics.
Parameters
----------
key: str
Key string
value: Any
Value
indent: str
Indentation
"""
if isinstance(value, float):
print(f'{indent}{key}: {value:.2f}')
elif isinstance(value, int):
print(f'{indent}{key}: {value:d}')
elif isinstance(value, str):
print(f'{indent}{key}: {value}')
elif isinstance(value, Dict):
print(f'{indent}{key}:')
for key_str_2, next_value in value.items():
print_model_stats(key_str_2, next_value, indent + " ")
if __name__ == '__main__':
parser: UIMParser = UIMParser()
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0'/
'2) Digital Ink is processable 1 (3.1 delta).uim')
model_analyser: StatisticsAnalyzer = StatisticsAnalyzer()
stats: Dict[str, Any] = model_analyser.analyze(ink_model)
for key_str, value_str in stats.items():
print_model_stats(key_str, value_str)
```
Find the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_analyse.py)
## Convert InkML to UIM
In the following examples, we will demonstrate how to convert an InkML file from well-known datasets to UIM.
### IAM On-Line Handwriting Database
The implementation supports the [IAM On-Line Handwriting Database](https://fki.tic.heia-fr.ch/databases/iam-on-line-handwriting-database) as a sample dataset for testing the conversion of InkML to UIM.
Its annotations can be converted to Wacom Ontology Definition Language (WODL) segmentation schema, by configuring the InkMLParser as follows:
```python
from pathlib import Path
from typing import Dict, Any, List
from uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310
from uim.model.helpers.schema_content_extractor import uim_schema_semantics_from
from uim.model.ink import InkModel
from uim.model.semantics.schema import SegmentationSchema, IS
from uim.utils.print import print_tree
from uim.codec.parser.inkml import InkMLParser
if __name__ == '__main__':
parser: InkMLParser = InkMLParser()
parser.set_typedef_pred(IS)
parser.register_type('type', 'Document', SegmentationSchema.ROOT)
parser.register_type('type', 'Formula', SegmentationSchema.MATH_BLOCK)
parser.register_type('type', 'Arrow', SegmentationSchema.CONNECTOR)
parser.register_type('type', 'Table', SegmentationSchema.TABLE)
parser.register_type('type', 'Structure', SegmentationSchema.BORDER)
parser.register_type('type', 'Diagram', SegmentationSchema.DIAGRAM)
parser.register_type('type', 'Drawing', SegmentationSchema.DRAWING)
parser.register_type('type', 'Correction', SegmentationSchema.CORRECTION)
parser.register_type('type', 'Symbol', '<T>')
parser.register_type('type', 'Marking', SegmentationSchema.MARKING)
parser.register_type('type', 'Marking_Bracket', SegmentationSchema.MARKING,
subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])
parser.register_type('type', 'Marking_Encircling', SegmentationSchema.MARKING,
subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'encircling')])
parser.register_type('type', 'Marking_Angle', SegmentationSchema.MARKING,
subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])
parser.register_type('type', 'Marking_Underline', SegmentationSchema.MARKING,
subtypes=[(SegmentationSchema.HAS_MARKING_TYPE,
"underlining")])
parser.register_type('type', 'Marking_Sideline', SegmentationSchema.MARKING,
subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])
parser.register_type('type', 'Marking_Connection', SegmentationSchema.CONNECTOR)
parser.register_type('type', 'Textblock', SegmentationSchema.TEXT_REGION)
parser.register_type('type', 'Textline', SegmentationSchema.TEXT_LINE)
parser.register_type('type', 'Word', SegmentationSchema.WORD)
parser.register_type('type', 'Garbage', SegmentationSchema.GARBAGE)
parser.register_type('type', 'List', SegmentationSchema.LIST)
parser.register_value('transcription', SegmentationSchema.HAS_CONTENT)
parser.cropping_ink = False
parser.cropping_offset = 10
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'inkml' / 'iamondb.inkml')
structures: List[Dict[str, Any]] = uim_schema_semantics_from(ink_model, "custom")
print_tree(structures)
with Path("iamondb.uim").open("wb") as file:
file.write(UIMEncoder310().encode(ink_model))
```
The implementation is provided as a sample and may require additional configuration and testing to work with other datasets.
With the `register_type` method, the parser can be configured to map the annotation types to the segmentation schema defined in the WODL.
The `register_value` method can be used to map the annotation values to the content of the segmentation schema.
Note, that this mapping may fully comply with the WODL schema, but it is a sample implementation and may require additional configuration or post-processing.
The sample document from the IAM On-Line Handwriting Database can't be uploaded to the repository due to the license restrictions.
### Kondate
The implementation supports the [Kondate](https://web.tuat.ac.jp/~nakagawa/database/en/kondate_about.html) dataset as a sample dataset for testing the conversion of InkML to UIM.
```python
import uuid
from pathlib import Path
from uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310
from uim.model.ink import InkModel
from uim.model.inkdata.brush import BrushPolygonUri, VectorBrush
from uim.model.semantics.schema import SegmentationSchema, CommonViews
from uim.codec.parser.inkml import InkMLParser
if __name__ == '__main__':
parser: InkMLParser = InkMLParser()
# Add a brush specified with shape Uris
bpu_1: BrushPolygonUri = BrushPolygonUri("will://brush/3.0/shape/Circle?precision=20&radius=1", min_scale=0.)
bpu_2: BrushPolygonUri = BrushPolygonUri("will://brush/3.0/shape/Circle?precision=20&radius=0.5", min_scale=4.)
poly_uris: list = [
bpu_1, bpu_2
]
vector_brush_1: VectorBrush = VectorBrush(
"app://qa-test-app/vector-brush/MyEllipticBrush",
poly_uris)
parser.register_brush(brush_uri='default', brush=vector_brush_1)
parser.use_brush = 'default'
device_id: str = uuid.uuid4().hex
parser.update_default_context(sample_rate=80, serial_number=device_id, manufacturer="Test Manufacturer",
model="Test Model")
parser.content_view = CommonViews.HWR_VIEW.value
parser.cropping_ink = True
parser.default_annotation_type = SegmentationSchema.UNLABELED
parser.default_xy_resolution = 10
parser.default_position_precision = 3
parser.default_value_resolution = 42
# Kondate database is not using namespace
parser.default_namespace = ''
ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'inkml' / 'kondate.inkml')
with Path("kondate.uim").open("wb") as file:
file.write(UIMEncoder310().encode(ink_model))
```
The sample document from the Kondate dataset can't be uploaded to the repository due to the license restrictions.
## IOT Paper Format
The format encodes the ink as InkML, but additionally it encodes a template image as base64.
```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<paper xmlns:inkml="http://www.w3.org/2003/InkML"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2003/InkML">
<resource>
<templateImage Content-Type="image/bmp">
<!-- Base64 encoded template -->
</templateImage>
</resource>
<inkml:ink>
<!-- Ink content encoded as InkML -->
</inkml:ink>
</paper>
```
This sample implementation provides a way to convert the IOT Paper Format to UIM and extract the template image.
```python
from pathlib import Path
from typing import List
from uim.codec.parser.iotpaper import IOTPaperParser
from uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310
from uim.model.helpers.serialize import json_encode, serialize_raw_sensor_data_csv
from uim.model.ink import InkModel
from uim.model.inkinput.inputdata import InkSensorType, Unit
if __name__ == '__main__':
paper_file: Path = Path(__file__).parent / '..' / '..' / 'ink' / 'iot' / 'HelloInk.paper'
parser: IOTPaperParser = IOTPaperParser()
parser.cropping_ink = False
parser.cropping_offset = 10
ink_model: InkModel = parser.parse(paper_file)
img: bytes = parser.parse_template(paper_file)
with Path("iot.uim").open("wb") as file:
file.write(UIMEncoder310().encode(ink_model))
with Path("template.bmp").open("wb") as file:
file.write(img)
layout: List[InkSensorType] = [
InkSensorType.TIMESTAMP, InkSensorType.X, InkSensorType.Y, InkSensorType.Z,
InkSensorType.PRESSURE, InkSensorType.ALTITUDE,
InkSensorType.AZIMUTH
]
# In the Universal Ink Model, the sensor data is in SI units:
# - timestamp: seconds
# - x, y, z: meters
# - pressure: N
serialize_raw_sensor_data_csv(ink_model, Path('sensor_data.csv'), layout)
# If you want to convert the data to different units, you can use the following code:
serialize_raw_sensor_data_csv(ink_model, Path('sensor_data_unit.csv'), layout,
{
InkSensorType.X: Unit.MM, # Convert meters to millimeters
InkSensorType.Y: Unit.MM, # Convert meters to millimeters
InkSensorType.Z: Unit.MM, # Convert meters to millimeters
InkSensorType.TIMESTAMP: Unit.MS # Convert seconds to milliseconds
})
# Convert the model to JSON
with open('ink.json', 'w') as f:
# json_encode is a helper function to convert the model to JSON
f.write(json_encode(ink_model))
```
### NOTICE
This implementation is a sample implementation and does not cover all possible cases of InkML files.
There is no guarantee that the implementation will work for all InkML files.
Additional testing and validation may be required to ensure the correctness of the implementation.
Finally, the implementation is provided as-is and without any warranty or support.
# Web Demos
The following web demos can be used to produce Universal Ink Model files:
- [WILL SDK for ink - Demo](https://ink-demo.wacom.com/) - producing UIM 3.1.0 files.
# Documentation
You can find more detailed technical documentation, [here](https://developer-docs.wacom.com/sdk-for-ink/docs/model).
API documentation is available [here](docs/uim/index.md).
# Usage
The library is used for machine learning experiments based on digital ink using the Universal Ink Model.
# License
[Apache License 2.0](LICENSE)
Raw data
{
"_id": null,
"home_page": "https://github.com/Wacom-Developer/universal-ink-library",
"name": "universal-ink-library",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "universal ink model;digital ink;wacom ink technologies",
"author": "Markus Weber",
"author_email": "markus.weber@wacom.com",
"download_url": "https://files.pythonhosted.org/packages/41/b7/05838b82022cd9daf10c4d7c130f7d1881384c0b58161cc828f7dd0cacaa/universal_ink_library-2.1.0.tar.gz",
"platform": null,
"description": "# Universal Ink Library\n[![Python package](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/python-package.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/python-package.yml)\n[![Pylint](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pylint.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pylint.yml)\n[![PyTest](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pytest.yml/badge.svg)](https://github.com/Wacom-Developer/universal-ink-library/actions/workflows/pytest.yml)\n\n![License: Apache 2](https://img.shields.io/badge/License-Apache2-green.svg)\n[![PyPI](https://img.shields.io/pypi/v/universal-ink-library.svg)](https://pypi.python.org/pypi/universal-ink-library)\n[![PyPI](https://img.shields.io/pypi/pyversions/universal-ink-library.svg)](https://pypi.python.org/pypi/universal-ink-library)\n[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://developer-docs.wacom.com/sdk-for-ink/docs/model) \n\n![Contributors](https://img.shields.io/github/contributors/Wacom-Developer/universal-ink-library.svg)\n![GitHub forks](https://img.shields.io/github/forks/Wacom-Developer/universal-ink-library.svg)\n![GitHub stars](https://img.shields.io/github/stars/Wacom-Developer/universal-ink-library.svg)\n\n\nUniversal Ink Library is a pure Python package for working with Universal Ink Models ([UIM](https://developer.wacom.com/products/universal-ink-model)).\nThe UIM defines a language-neutral and platform-neutral data model for representing and manipulating digital ink data captured using an electronic pen or stylus, or using touch input.\n\nThe main aspects of the UIM are:\n\n- Interoperability of ink-based data models by defining a standardized interface with other systems\n- Biometric data storage mechanism\n- Spline data storage mechanism\n- Rendering configurations storage mechanism\n- Ability to compose spline/raw-input based logical trees, which are contained within the ink model\n- Portability, by enabling conversion to common industry standards\n- Extensibility, by enabling the description of ink data related semantic meta-data\n- Standardized serialization mechanism\n\nThis reference document defines a RIFF container and Protocol Buffers schema for serialization of ink models as well as \na standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink \nmodel and external entities.\n\nThe specified serialization schema is based on the following standards:\n\n- **Resource Interchange File Format (RIFF)** - A generic file container format for storing data in tagged chunks\n- **Protocol Buffers v3** - A language-neutral, platform-neutral extensible mechanism for serializing structured data\n- **Resource Description Framework (RDF)** - A standard model for data interchange on the Web\n- **OWL 2 Web Ontology Language (OWL2)** - An ontology language for the Semantic Web with formally defined meaning\n\n## Data Model\nThe *Universal Ink Model* has five fundamental categories:\n\n- **Input data**: A collection of data repositories, holding raw sensor input, input device/provider configurations, sensor channel configurations, etc. Each data repository keeps certain data-sets isolated and is responsible for specific type(s) of data\n- **Ink data**: The visual appearance of the digital ink, presented as ink geometry with rendering configurations\n- **Meta-data**: Related meta-data about the environment, input devices, etc.\n- **Ink Trees / Views**: A collection of logical trees, representing structures of hierarchically organized paths or raw input data-frames\n- **Semantic triple store**: An RDF compliant triple store, holding semantic information, such as text structure, handwriting recognition results, and semantic entities\n\nThe diagram below illustrates the different logical parts of the ink model.\n![Logical Parts of Ink Model.](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/uim-v1.png)\n\nThis UML diagram illustrates the complete Ink Model in terms of logical models and class dependencies.\n![UML Diagram](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/uim-uml-all-v9.png)\n\nThe *Universal Ink Model* provides the flexibility required for a variety of applications, since the display of pen data is only one aspect.\nFor example, the same data can be used for data mining or even signature comparison, while the ink display can be on a range of platforms potentially requiring different scaling and presentation.\n\n## Input data\n\nIn reality, pen data is captured from a pen device as a set of positional points:\n\n![Digital-ink-w](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/overview_ink_device_sensor_channels.png)\n\nDepending on the type of hardware, in addition to the x/y positional coordinates, the points can contain further information such as pen tip force and angle.\nCollectively, this information is referred to as sensor data and the *Universal Ink Model* provides a means of storing all the available data.\nFor example, with some types of hardware, pen hover coordinates can be captured while the pen is not in contact with the surface.\nThe information is saved in the *Universal Ink Model* and can be used when required.\n\n## Ink data\n\nInk data is the result of the [ink geometry pipeline](https://developer-docs.wacom.com/sdk-for-ink/docs/pipeline) of the [WILL SDK for ink](https://developer.wacom.com/products/will-sdk-for-ink).\nPen strokes are identified as continuous sets of pen coordinates captured while the pen is in contact with the surface. \nFor example, writing the letter \u2018w', as illustrated below.\nThe process converts each pen stroke into a mathematical representation, which can then be used to render the shape on a display.\nSteps in the so-called Ink Geometry pipeline are illustrated below where each step is configured by an application to generate the desired output:\n\n![Digital-ink-rendering](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/pen-data-w-rendering.png)\n\nAs a result, the data points are smoothed and shaped to produce the desired representation. \nFor example, simulating the appearance of a felt-tip ink pen.\nRaster and vector rendering is supported with a selection of rendering brush types.\n\nThe results are saved as Ink data, containing ink geometry and rendering information.\n\n## Meta-data\n\nMeta-data is added as data about the pen data.\nThe *Universal Ink Model* allows for administrative information such as author name, location, pen data source, etc.\nFurther meta-data is computed by analysis of the pen data.\nAn example of digital ink is annotated below:\n\n![Digital-ink-annotated](https://github.com/Wacom-Developer/universal-ink-library/raw/main/assets/pen-data-annotated.png)\n\nThe labels identify pen strokes *s1, s2, s3*, etc.\nIn addition, groups of strokes are identified as *g1, g2, g3*, etc.\nPen strokes are passed to a handwriting recognition engine, and the results are stored as additional meta-data, generally referred to as semantic data.\nThe semantic data is stored with reference to the groups, categorized as single characters, individual words, lines of text, and so on.\n\n## Semantic Data\n\nThe Ink Model Specification provides a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink model and external entities. \nThe Ink Model keeps an instance of a RDF or WODL-compliant triple store, called Knowledge Graph in the scope of this document. \nThis triple store holds a list of semantic triples to encode relationships between subject, predicate and object as defined in the RDF specification.\n\nUsing the knowledge graph nodes of the ink trees, contained within the ink model, could be annotated with additional metadata in order to describe different aspects of the ink model, for instance - text segmentation view, named entity recognition view, etc.\n\n### Wacom Ontology Definition Language (WODL)\n\nAt a fundamental level, digital ink can be used producing content, such as text, math, diagrams, sketches etc.\n\nThe Wacom Ontology Description Language (WODL) provides a standardized, JSON-based way of annotating ink with a specialized schema definition.\n\nThe specification of WODL language is available [here](https://developer-docs.wacom.com/docs/specifications/wodl/).\nSome of the most common schema definitions are:\n- (Segmentation Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/segmentation/]\n- (Math Structures Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/math-structures/]\n- (Named Entity Recognition Schema)[https://developer-docs.wacom.com/docs/specifications/schemas/ner/]\n\n\n\n# Installation\n\nOur Universal Ink Library can be installed using pip.\n\n``\n $ pip install universal-ink-library\n``\n\n\n# Quick Start\n\n## File handling\n\n### Loading UIM\n\nThe `UIMParser` is be used to load a serialized Universal Ink Model in version 3.0.0 or 3.1.0 and you receive the memory model `InkModel` which can be used for extracting the relevant data.\n\n```python\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.ink import InkModel\n\nparser: UIMParser = UIMParser()\n# ---------------------------------------------------------------------------------\n# Parse a UIM file version 3.0.0\n# ---------------------------------------------------------------------------------\nink_model_1: InkModel = UIMParser().parse('../ink/uim_3.0.0/1) Value of Ink 1.uim')\n# ---------------------------------------------------------------------------------\n# Parse a UIM file version 3.1.0\n# ---------------------------------------------------------------------------------\nink_model_2: InkModel = UIMParser().parse('../ink/uim_3.1.0/1) Value of Ink 1.uim')\n\n```\n\n### Saving of UIM\n\nSaving the `InkModel` as a Universal Ink Model file.\n\n```python\nfrom uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310\nfrom uim.model.ink import InkModel\n\nink_model: InkModel = InkModel()\n... \n\n# Save the model, this will overwrite an existing file\nwith open('3_1_0.uim', 'wb') as uim:\n # unicode(data) auto-decodes data to unicode if str\n uim.write(UIMEncoder310().encode(ink_model))\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_file_handling.py)\n\n## InkModel\n\n### Iterate over semantics\n\nIf the `InkModel` is enriched with semantics from handwriting recognition and named entity recognition, or named entity linking.\nThe semantics an be access with a helper function `uim_extract_text_and_semantics_from` or by iterating the views, like shown in `uim_extract_text_and_semantics_from` function:\n\n```python\nfrom pathlib import Path\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.helpers.text_extractor import uim_extract_text_and_semantics_from\nfrom uim.model.ink import InkModel\nfrom uim.model.semantics.schema import CommonViews\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0' /\n '2) Digital Ink is processable 1 (3.1 delta).uim')\n if ink_model.has_knowledge_graph() and ink_model.has_tree(CommonViews.HWR_VIEW.value):\n # The sample\n words, entities, text = uim_extract_text_and_semantics_from(ink_model, hwr_view=CommonViews.HWR_VIEW.value)\n print('=' * 100)\n print(' Recognised text: ')\n print(text)\n print('=' * 100)\n print(' Words:')\n print('=' * 100)\n for word_idx, word in enumerate(words):\n print(f' Word #{word_idx + 1}:')\n print(f' Text: {word[\"text\"]}')\n print(f' Alternatives: {word[\"alternatives\"]}')\n print(f' Bounding box: x:={word[\"bounding_box\"][\"x\"]}, y:={word[\"bounding_box\"][\"y\"]}, '\n f'width:={word[\"bounding_box\"][\"width\"]}, height:={word[\"bounding_box\"][\"height\"]}')\n print('')\n print('=' * 100)\n print(' Entities:')\n print('=' * 100)\n entity_idx: int = 1\n for entity_uri, entity_mappings in entities.items():\n print(f' Entity #{entity_idx}: URI: {entity_uri}')\n print(\"-\" * 100)\n print(f\" Label: {entity_mappings[0]['label']}\")\n print(' Ink Stroke IDs:')\n for word_idx, entity in enumerate(entity_mappings):\n print(f\" #{word_idx + 1}: Word match: {entity['path_id']}\")\n print('=' * 100)\n entity_idx += 1\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_semantic_data.py)\n\nA more generic approach to extract schema elements is the helper function `uim_schema_semantics_from`. \nThis function extracts schema elements from the ink model, like math structures, text, tables, etc..\n\nAn example for extracting math structures is shown below:\n```python\n{\n 'node_uri': UUID('16918b3f-b192-466e-83a3-54835ddfff11'),\n 'parent_uri': UUID('16918b3f-b192-466e-83a3-54835ddfff11'),\n 'path_id': [\n UUID('16918b3f-b192-466e-83a3-54835ddfff11')\n ],\n 'bounding_box': {\n 'x': 175.71, 'y': 150.65, \n 'width': 15.91, 'height': 27.018\n },\n 'type': 'will:math-structures/0.1/Symbol',\n 'attributes': [\n ('symbolType', 'Numerical'), ('representation', 'e')\n ]\n}\n```\n\nWith the extracted schema elements, you can build a tree structure, like shown in the following example:\n\n```python\nfrom pathlib import Path\nfrom typing import List, Dict, Any, Tuple\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.helpers.schema_content_extractor import uim_schema_semantics_from\nfrom uim.model.ink import InkModel\nfrom uim.model.semantics.schema import CommonViews\n\n\ndef build_tree(node_list: List[Dict[str, Any]]):\n \"\"\"\n Build a tree structure from the node list.\n Parameters\n ----------\n node_list: `List[Dict[str, Any]]`\n List of nodes\n \"\"\"\n # Step 1: Create dictionaries for nodes and parent-child relationships\n children_dict: Dict[str, Any] = {}\n\n for node in node_list:\n parent_uri: str = node['parent_uri']\n if parent_uri is not None:\n if parent_uri not in children_dict:\n children_dict[parent_uri] = []\n children_dict[parent_uri].append(node)\n\n # Step 2: Define a recursive function to print the tree\n def print_tree(node_display: Dict[str, Any], indent: int = 0):\n info: str = \"\"\n attributes: List[Tuple[str, Any]] = node_display.get('attributes', [])\n if \"path_id\" in node_display:\n info = f\"(#strokes:={len(node_display['path_id'])})\"\n elif \"bounding_box\" in node_display:\n info = (f\"(x:={node_display['bounding_box']['x']}, y:={node_display['bounding_box']['y']}, \"\n f\"width:={node_display['bounding_box']['width']}, \"\n f\"height:={node_display['bounding_box']['height']})\")\n print('|' + '-' * indent + f\" [type:={node_display['type']}] - {info}\")\n if len(attributes) > 0:\n print('|' + ' ' * (indent + 4) + \"| -[Attributes:]\")\n for key, value in attributes:\n print('|' + ' ' * (indent + 8) + f\"\\t|-- {key}:={value}\")\n if node_display['node_uri'] in children_dict:\n for child in children_dict[node_display['node_uri']]:\n print_tree(child, indent + 4)\n\n # Step 3: Find the root node (where parent_uri is None) and start printing the tree\n for node in node_list:\n if node['parent_uri'] is None:\n print_tree(node)\n\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n # Parse UIM v3.0.0\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'schemas' / 'math-structures.uim')\n math_structures: List[Dict[str, Any]] = uim_schema_semantics_from(ink_model,\n semantic_view=CommonViews.HWR_VIEW.value)\n # Print the tree structure\n build_tree(math_structures)\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_math.py)\n\n\n### Accessing input and ink data\nIn order to access ink input configuration data, sensor data, or stroke data from `InkModel`, you can use the following functions:\n\n```python\nfrom pathlib import Path\nfrom typing import Dict\nfrom uuid import UUID\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.ink import InkModel\nfrom uim.model.inkinput.inputdata import InkInputType, InputContext, SensorContext, InputDevice\nfrom uim.model.inkinput.sensordata import SensorData\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n # This file contains ink from different providers: PEN, TOUCH, MOUSE\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0' /\n '6) Different Input Providers.uim')\n mapping_type: Dict[UUID, InkInputType] = {}\n if ink_model.has_ink_structure():\n print('InkInputProviders:')\n print('-------------------')\n # Iterate Ink input providers\n for ink_input_provider in ink_model.input_configuration.ink_input_providers:\n print(f' InkInputProvider. ID: {ink_input_provider.id} | type: {ink_input_provider.type}')\n mapping_type[ink_input_provider.id] = ink_input_provider.type\n print()\n print('Strokes:')\n print('--------')\n # Iterate over strokes\n for stroke in ink_model.strokes:\n print(f'|- Stroke (id:={stroke.id} | points count: {stroke.points_count})')\n if stroke.style and stroke.style.path_point_properties:\n print(f'| |- Style (render mode:={stroke.style.render_mode_uri} | color:=('\n f'red: {stroke.style.path_point_properties.red}, '\n f'green: {stroke.style.path_point_properties.green}, '\n f'blue: {stroke.style.path_point_properties.green}, '\n f'alpha: {stroke.style.path_point_properties.alpha}))')\n # Stroke is produced by sensor data being processed by the ink geometry pipeline\n sd: SensorData = ink_model.sensor_data.sensor_data_by_id(stroke.sensor_data_id)\n # Get InputContext for the sensor data\n input_context: InputContext = ink_model.input_configuration.get_input_context(sd.input_context_id)\n # Retrieve SensorContext\n sensor_context: SensorContext = ink_model.input_configuration\\\n .get_sensor_context(input_context.sensor_context_id)\n for scc in sensor_context.sensor_channels_contexts:\n # Sensor channel context is referencing input device\n input_device: InputDevice = ink_model.input_configuration.get_input_device(scc.input_device_id)\n print(f'| |- Input device (id:={input_device.id} | type:=({mapping_type[scc.input_provider_id]})')\n # Iterate over sensor channels\n for c in scc.channels:\n print(f'| | |- Sensor channel (id:={c.id} | name: {c.type.name} '\n f'| values: {sd.get_data_by_id(c.id).values}')\n print('|')\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_input_and_ink.py)\n\n## Creating an Ink Model \nCreating an `InkModel` from the scratch.\nThe [csv file](ink/sensor_data/ink.csv) contains sensor data for strokes. \nThe script loads the sensor data from the CSV file and creates strokes from the sensor data.\n\n```csv\nidx,SPLINE_X,SPLINE_Y,SENSOR_TIMESTAMP,SENSOR_PRESSURE,SENSOR_ALTITUDE,SENSOR_AZIMUTH\n0,277.1012268066406,183.11183166503906,1722443386312.649,0.07,0.6,0.72\n0,277.1012268066406,183.1713409423828,1722443386312.653,0.11000000000000001,0.6,0.72\n...\n```\n\n```python\nimport csv\nimport uuid\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import List, Dict\n\nfrom uim.codec.parser.base import SupportedFormats\nfrom uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310\nfrom uim.model.base import UUIDIdentifier\nfrom uim.model.helpers.serialize import json_encode\nfrom uim.model.ink import InkModel, InkTree, ViewTree\nfrom uim.model.inkdata.brush import VectorBrush, BrushPolygon, BrushPolygonUri\nfrom uim.model.inkdata.strokes import Spline, Style, Stroke, LayoutMask\nfrom uim.model.inkinput.inputdata import Environment, InkInputProvider, InkInputType, InputDevice, SensorChannel, \\\n InkSensorType, InkSensorMetricType, SensorChannelsContext, SensorContext, InputContext, unit2unit, Unit\nfrom uim.model.inkinput.sensordata import SensorData, InkState\nfrom uim.model.semantics import schema\nfrom uim.model.semantics.node import StrokeGroupNode, StrokeNode, URIBuilder\nfrom uim.utils.matrix import Matrix4x4\n\n\ndef create_sensor_data(data_collection: Dict[str, List[float]],\n input_context_id: uuid.UUID, channels: List[SensorChannel]) -> SensorData:\n \"\"\"\n Create sensor data from a data collection.\n Parameters\n ----------\n data_collection: Dict[str, List[float]]\n\n input_context_id\n channels\n\n Returns\n -------\n SensorData\n Instance of SensorData\n \"\"\"\n sd: SensorData = SensorData(UUIDIdentifier.id_generator(), input_context_id=input_context_id, state=InkState.PLANE)\n sd.add_data(channels[0], [unit2unit(Unit.DIP, Unit.M, v) for v in data_collection['SPLINE_X']])\n sd.add_data(channels[1], [unit2unit(Unit.DIP, Unit.M, v) for v in data_collection['SPLINE_Y']])\n sd.add_timestamp_data(channels[2], data_collection['SENSOR_TIMESTAMP'])\n sd.add_data(channels[3], data_collection['SENSOR_PRESSURE'])\n sd.add_data(channels[4], data_collection['SENSOR_AZIMUTH'])\n sd.add_data(channels[5], data_collection['SENSOR_ALTITUDE'])\n return sd\n\n\ndef load_sensor_data(csv_path: Path, input_context_id: uuid.UUID, channels: List[SensorChannel]) -> List[SensorData]:\n \"\"\"\n Load sensor data from a CSV file.\n\n Parameters\n ----------\n csv_path: Path\n Path to the CSV file\n input_context_id: uuid.UUID\n Input context ID\n channels: List[SensorChannel]\n List of sensor channels\n\n Returns\n -------\n List[SensorData]\n List of sensor data\n \"\"\"\n sensor_data_values: List[SensorData] = []\n data_collection: Dict[str, List[float]] = defaultdict(list)\n\n with csv_path.open('r') as f:\n reader = csv.reader(f)\n header: List[str] = next(reader)\n if header != ['idx', 'SPLINE_X', 'SPLINE_Y',\n 'SENSOR_TIMESTAMP', 'SENSOR_PRESSURE', 'SENSOR_ALTITUDE', 'SENSOR_AZIMUTH']:\n raise ValueError(\"Invalid CSV file format\")\n last_idx: int = 0\n for row in reader:\n row_idx: int = int(row[0])\n if row_idx != last_idx:\n sensor_data_values.append(create_sensor_data(data_collection, input_context_id, channels))\n data_collection.clear()\n for idx, value in enumerate(row[1:], start=1):\n data_collection[header[idx]].append(float(value))\n last_idx = row_idx\n if len(data_collection) > 0:\n sensor_data_values.append(create_sensor_data(data_collection, input_context_id, channels))\n return sensor_data_values\n\n\ndef create_strokes(sensor_data_items: List[SensorData], style_stroke: Style, x_id: uuid.UUID, y_id: uuid.UUID) \\\n -> List[Stroke]:\n \"\"\"\n Create strokes from sensor data.\n\n Parameters\n ----------\n sensor_data_items: List[SensorData]\n List of sensor data\n style_stroke: Style\n Style of the stroke\n x_id: uuid.UUID\n Reference id of x sensor channel\n y_id: uuid.UUID\n Reference id of y sensor channel\n\n Returns\n -------\n List[Stroke]\n List of strokes\n \"\"\"\n stroke_items: List[Stroke] = []\n for sensor_data_i in sensor_data_items:\n path: List[float] = []\n # The spline path contains x, y values\n mask: int = LayoutMask.X.value | LayoutMask.Y.value\n for x, y in zip(sensor_data_i.get_data_by_id(x_id).values, sensor_data_i.get_data_by_id(y_id).values):\n path.append(unit2unit(Unit.M, Unit.DIP, x))\n path.append(unit2unit(Unit.M, Unit.DIP, y))\n\n spline: Spline = Spline(layout_mask=mask, data=path)\n # Create a stroke from spline\n s_i: Stroke = Stroke(sid=UUIDIdentifier.id_generator(), spline=spline, style=style_stroke)\n stroke_items.append(s_i)\n return stroke_items\n\n\nif __name__ == '__main__':\n # Creates an ink model from the scratch.\n ink_model: InkModel = InkModel(version=SupportedFormats.UIM_VERSION_3_1_0.value)\n # Setting a unit scale factor\n ink_model.unit_scale_factor = 1.5\n # Using a 4x4 matrix for scaling\n ink_model.transform = Matrix4x4.create_scale(1.5)\n\n # Properties are added as key-value pairs\n ink_model.properties.append((\"Author\", \"Markus\"))\n ink_model.properties.append((\"Locale\", \"en_US\"))\n\n # Create an environment\n env: Environment = Environment()\n # This should describe the environment in which the ink was captured\n env.properties.append((\"wacom.ink.sdk.lang\", \"js\"))\n env.properties.append((\"wacom.ink.sdk.version\", \"2.0.0\"))\n env.properties.append((\"runtime.type\", \"WEB\"))\n env.properties.append((\"user.agent.brands\", \"Chromium 126, Google Chrome 126\"))\n env.properties.append((\"user.agent.platform\", \"macOS\"))\n env.properties.append((\"user.agent.mobile\", \"false\"))\n env.properties.append((\"app.id\", \"sample_create_model_vector\"))\n env.properties.append((\"app.version\", \"1.0.0\"))\n ink_model.input_configuration.environments.append(env)\n\n # Ink input provider can be pen, mouse or touch.\n provider: InkInputProvider = InkInputProvider(input_type=InkInputType.PEN)\n ink_model.input_configuration.ink_input_providers.append(provider)\n\n # Input device is the sensor (pen tablet, screen, etc.)\n input_device: InputDevice = InputDevice()\n input_device.properties.append((\"dev.manufacturer\", \"Wacom\"))\n input_device.properties.append((\"dev.model\", \"Wacom One\"))\n input_device.properties.append((\"dev.product.code\", \"DTC-133\"))\n input_device.properties.append((\"dev.graphics.resolution\", \"1920x1080\"))\n ink_model.input_configuration.devices.append(input_device)\n # Create a group of sensor channels\n sensor_channels: list = [\n SensorChannel(channel_type=InkSensorType.X, metric=InkSensorMetricType.LENGTH, resolution=1.0,\n ink_input_provider_id=provider.id, input_device_id=input_device.id),\n SensorChannel(channel_type=InkSensorType.Y, metric=InkSensorMetricType.LENGTH, resolution=1.0,\n ink_input_provider_id=provider.id, input_device_id=input_device.id),\n SensorChannel(channel_type=InkSensorType.TIMESTAMP, metric=InkSensorMetricType.TIME, resolution=1000.0,\n precision=0,\n ink_input_provider_id=provider.id, input_device_id=input_device.id),\n SensorChannel(channel_type=InkSensorType.PRESSURE, metric=InkSensorMetricType.NORMALIZED, resolution=1.0,\n channel_min=0., channel_max=1.0,\n ink_input_provider_id=provider.id, input_device_id=input_device.id),\n SensorChannel(channel_type=InkSensorType.ALTITUDE, metric=InkSensorMetricType.ANGLE, resolution=1.0,\n channel_min=0., channel_max=1.5707963705062866,\n ink_input_provider_id=provider.id, input_device_id=input_device.id),\n SensorChannel(channel_type=InkSensorType.AZIMUTH, metric=InkSensorMetricType.ANGLE, resolution=1.0,\n channel_min=-3.1415927410125732, channel_max=3.1415927410125732,\n ink_input_provider_id=provider.id, input_device_id=input_device.id)\n ]\n # Create a sensor channels context\n scc_wacom_one: SensorChannelsContext = SensorChannelsContext(channels=sensor_channels,\n ink_input_provider_id=provider.id,\n input_device_id=input_device.id,\n latency=0,\n sampling_rate_hint=240)\n\n # Add sensor channel contexts\n sensor_context: SensorContext = SensorContext()\n sensor_context.add_sensor_channels_context(scc_wacom_one)\n ink_model.input_configuration.sensor_contexts.append(sensor_context)\n\n # Create the input context using the Environment and the Sensor Context\n input_context: InputContext = InputContext(environment_id=env.id, sensor_context_id=sensor_context.id)\n ink_model.input_configuration.input_contexts.append(input_context)\n\n # Create sensor data\n # The CSV file contains sensor data for strokes\n # idx,SPLINE_X,SPLINE_Y,SENSOR_TIMESTAMP,SENSOR_PRESSURE,SENSOR_ALTITUDE,SENSOR_AZIMUTH\n sensor_data = load_sensor_data(Path(__file__).parent / '..' / 'ink' / 'sensor_data' / 'ink.csv', input_context.id,\n sensor_channels)\n # Add sensor data to the model\n for sensor_data_i in sensor_data:\n ink_model.sensor_data.add(sensor_data_i)\n\n # We need to define a brush polygon\n points: list = [(10, 10), (0, 10), (0, 0), (10, 0)]\n brush_polygons: list = [BrushPolygon(min_scale=0., points=points)]\n\n # Create the brush object using polygons\n vector_brush_0: VectorBrush = VectorBrush(\n \"app://qa-test-app/vector-brush/MyTriangleBrush\",\n brush_polygons)\n\n # Add it to the model\n ink_model.brushes.add_vector_brush(vector_brush_0)\n\n # Add a brush specified with shape Uris\n poly_uris: list = [\n BrushPolygonUri(\"will://brush/3.0/shape/Circle?precision=20&radius=1\", 0.),\n BrushPolygonUri(\"will://brush/3.0/shape/Ellipse?precision=20&radiusX=1&radiusY=0.5\", 4.0)\n ]\n # Define a second brush\n vector_brush_1: VectorBrush = VectorBrush(\n \"app://qa-test-app/vector-brush/MyEllipticBrush\",\n poly_uris)\n # Add it to the model\n ink_model.brushes.add_vector_brush(vector_brush_1)\n\n # Specify the layout of the stroke data, in this case the stroke will have variable X, Y and Size properties.\n layout_mask: int = LayoutMask.X.value | LayoutMask.Y.value | LayoutMask.SIZE.value\n\n # Create some style\n style: Style = Style(brush_uri=vector_brush_1.name)\n # Set the color of the strokes\n style.path_point_properties.red = 0.1\n style.path_point_properties.green = 0.2\n style.path_point_properties.blue = 0.4\n style.path_point_properties.alpha = 1.0\n\n # Create the strokes\n strokes = create_strokes(sensor_data, style, sensor_channels[0].id, sensor_channels[1].id)\n # First you need a root group to contain the strokes\n root: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())\n\n # Assign the group as the root of the main ink tree\n ink_model.ink_tree = InkTree()\n ink_model.ink_tree.root = root\n\n # Adding the strokes to the root group\n for stroke in strokes:\n root.add(StrokeNode(stroke))\n\n # Adding view for handwriting recognition results\n hwr_tree: ViewTree = ViewTree(schema.CommonViews.HWR_VIEW.value)\n # Add view right after creation, to avoid warnings that tree is not yet attached\n ink_model.add_view(hwr_tree)\n # Create a root node for the HWR view\n hwr_root: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())\n hwr_tree.root = hwr_root\n ink_model.knowledge_graph.append(schema.SemanticTriple(hwr_root.uri, schema.IS, schema.SegmentationSchema.ROOT))\n ink_model.knowledge_graph.append(schema.SemanticTriple(hwr_root.uri, schema.SegmentationSchema.REPRESENTS_VIEW,\n schema.CommonViews.HWR_VIEW.value))\n\n # Here you can add the same strokes as in the main tree, but you can organize them in a different way\n # (put them in different groups)\n # You are not supposed to add strokes that are not already in the main tree.\n text_region: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())\n hwr_root.add(text_region)\n ink_model.knowledge_graph.append(schema.SemanticTriple(text_region.uri, schema.IS,\n schema.SegmentationSchema.TEXT_REGION))\n\n # The text_line root denotes the text line\n text_line: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())\n text_region.add(text_line)\n ink_model.knowledge_graph.append(schema.SemanticTriple(text_line.uri, schema.IS,\n schema.SegmentationSchema.TEXT_LINE))\n\n # The word node denotes a word\n word: StrokeGroupNode = StrokeGroupNode(UUIDIdentifier.id_generator())\n text_line.add(word)\n ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.IS, schema.SegmentationSchema.WORD))\n ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.SegmentationSchema.HAS_CONTENT, \"ink\"))\n ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri, schema.SegmentationSchema.HAS_LANGUAGE, \"en_US\"))\n\n # Add the strokes to the word\n for stroke_i in strokes:\n word.add(StrokeNode(stroke_i))\n\n # We need a URI builder\n uri_builder: URIBuilder = URIBuilder()\n\n # Create a named entity\n named_entity_uri: str = uri_builder.build_named_entity_uri(UUIDIdentifier.id_generator())\n ink_model.knowledge_graph.append(schema.SemanticTriple(word.uri,\n schema.NamedEntityRecognitionSchema.PART_OF_NAMED_ENTITY,\n named_entity_uri))\n\n # Add knowledge for the named entity\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri, \"hasPart-0\", word.uri))\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,\n schema.NamedEntityRecognitionSchema.HAS_LABEL, \"Ink\"))\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,\n schema.NamedEntityRecognitionSchema.HAS_LANGUAGE, \"en_US\"))\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,\n schema.NamedEntityRecognitionSchema.HAS_CONFIDENCE, \"0.95\"))\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,\n schema.NamedEntityRecognitionSchema.HAS_ARTICLE_URL,\n 'https://en.wikipedia.org/wiki/Ink'))\n ink_model.knowledge_graph.append(schema.SemanticTriple(named_entity_uri,\n schema.NamedEntityRecognitionSchema.HAS_UNIQUE_ID, 'Q127418'))\n # Save the model, this will overwrite an existing file\n with open('3_1_0_vector.uim', 'wb') as uim:\n # unicode(data) auto-decodes data to unicode if str\n uim.write(UIMEncoder310().encode(ink_model))\n # Convert the model to JSON\n with open('ink.json', 'w') as f:\n # json_encode is a helper function to convert the model to JSON\n f.write(json_encode(ink_model))\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_create_model_vector.py)\n\n## Converting Ink Model \n\n### To JSON\nThe `InkModel` can be converted to JSON format using the `json_encode` helper function.\nThis is useful for debugging purposes or for storing the model in a human-readable format.\nDeserialization is not supported.\n\n```python\nfrom pathlib import Path\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.helpers.serialize import json_encode\nfrom uim.model.ink import InkModel\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'special' / 'ink.uim')\n # Convert the model to JSON\n with open('ink.json', 'w') as f:\n # json_encode is a helper function to convert the model to JSON\n f.write(json_encode(ink_model))\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_to_json.py)\n\n### Sensor data to CSV\n\nThe sensor data can be exported to a CSV file.\n\n```python\nfrom pathlib import Path\nfrom typing import List\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.helpers.serialize import serialize_sensor_data_csv\nfrom uim.model.ink import InkModel\nfrom uim.model.inkdata.strokes import InkStrokeAttributeType\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n # This file contains ink from different providers: PEN, TOUCH, MOUSE\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'special' / 'ink.uim')\n # Decide which attributes to serialize\n layout: List[InkStrokeAttributeType] = [\n InkStrokeAttributeType.SPLINE_X, InkStrokeAttributeType.SPLINE_Y, InkStrokeAttributeType.SENSOR_TIMESTAMP,\n InkStrokeAttributeType.SENSOR_PRESSURE, InkStrokeAttributeType.SENSOR_ALTITUDE,\n InkStrokeAttributeType.SENSOR_AZIMUTH\n ]\n # Serialize the model to CSV\n serialize_sensor_data_csv(ink_model, Path('sensor_data.csv'), layout=layout)\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_extract_to_csv.py)\n\n## Extracting statistics\n\nThe `StatisticsAnalyzer` can be used to extract statistics from the `InkModel`.\nThe statistics are extracted from the ink data, sensor data, and input configuration.\n\n```python\nfrom pathlib import Path\nfrom typing import Dict, Any\n\nfrom uim.codec.parser.uim import UIMParser\nfrom uim.model.ink import InkModel\nfrom uim.utils.statistics import StatisticsAnalyzer\n\n\ndef print_model_stats(key: str, value: Any, indent: str = \"\"):\n \"\"\"\n Print the model statistics.\n Parameters\n ----------\n key: str\n Key string\n value: Any\n Value\n indent: str\n Indentation\n \"\"\"\n if isinstance(value, float):\n print(f'{indent}{key}: {value:.2f}')\n elif isinstance(value, int):\n print(f'{indent}{key}: {value:d}')\n elif isinstance(value, str):\n print(f'{indent}{key}: {value}')\n elif isinstance(value, Dict):\n print(f'{indent}{key}:')\n for key_str_2, next_value in value.items():\n print_model_stats(key_str_2, next_value, indent + \" \")\n\n\nif __name__ == '__main__':\n parser: UIMParser = UIMParser()\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'uim_3.1.0'/\n '2) Digital Ink is processable 1 (3.1 delta).uim')\n model_analyser: StatisticsAnalyzer = StatisticsAnalyzer()\n stats: Dict[str, Any] = model_analyser.analyze(ink_model)\n for key_str, value_str in stats.items():\n print_model_stats(key_str, value_str)\n```\n\nFind the sample, [here](https://github.com/Wacom-Developer/universal-ink-library/blob/main/samples/sample_analyse.py)\n\n## Convert InkML to UIM\n\nIn the following examples, we will demonstrate how to convert an InkML file from well-known datasets to UIM.\n\n### IAM On-Line Handwriting Database\n\nThe implementation supports the [IAM On-Line Handwriting Database](https://fki.tic.heia-fr.ch/databases/iam-on-line-handwriting-database) as a sample dataset for testing the conversion of InkML to UIM.\nIts annotations can be converted to Wacom Ontology Definition Language (WODL) segmentation schema, by configuring the InkMLParser as follows:\n\n```python\nfrom pathlib import Path\nfrom typing import Dict, Any, List\n\nfrom uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310\nfrom uim.model.helpers.schema_content_extractor import uim_schema_semantics_from\nfrom uim.model.ink import InkModel\nfrom uim.model.semantics.schema import SegmentationSchema, IS\nfrom uim.utils.print import print_tree\n\nfrom uim.codec.parser.inkml import InkMLParser\n\nif __name__ == '__main__':\n parser: InkMLParser = InkMLParser()\n parser.set_typedef_pred(IS)\n parser.register_type('type', 'Document', SegmentationSchema.ROOT)\n parser.register_type('type', 'Formula', SegmentationSchema.MATH_BLOCK)\n parser.register_type('type', 'Arrow', SegmentationSchema.CONNECTOR)\n parser.register_type('type', 'Table', SegmentationSchema.TABLE)\n parser.register_type('type', 'Structure', SegmentationSchema.BORDER)\n parser.register_type('type', 'Diagram', SegmentationSchema.DIAGRAM)\n parser.register_type('type', 'Drawing', SegmentationSchema.DRAWING)\n parser.register_type('type', 'Correction', SegmentationSchema.CORRECTION)\n parser.register_type('type', 'Symbol', '<T>')\n parser.register_type('type', 'Marking', SegmentationSchema.MARKING)\n parser.register_type('type', 'Marking_Bracket', SegmentationSchema.MARKING,\n subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])\n parser.register_type('type', 'Marking_Encircling', SegmentationSchema.MARKING,\n subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'encircling')])\n parser.register_type('type', 'Marking_Angle', SegmentationSchema.MARKING,\n subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])\n parser.register_type('type', 'Marking_Underline', SegmentationSchema.MARKING,\n subtypes=[(SegmentationSchema.HAS_MARKING_TYPE,\n \"underlining\")])\n parser.register_type('type', 'Marking_Sideline', SegmentationSchema.MARKING,\n subtypes=[(SegmentationSchema.HAS_MARKING_TYPE, 'other')])\n parser.register_type('type', 'Marking_Connection', SegmentationSchema.CONNECTOR)\n\n parser.register_type('type', 'Textblock', SegmentationSchema.TEXT_REGION)\n parser.register_type('type', 'Textline', SegmentationSchema.TEXT_LINE)\n parser.register_type('type', 'Word', SegmentationSchema.WORD)\n\n parser.register_type('type', 'Garbage', SegmentationSchema.GARBAGE)\n parser.register_type('type', 'List', SegmentationSchema.LIST)\n parser.register_value('transcription', SegmentationSchema.HAS_CONTENT)\n\n parser.cropping_ink = False\n parser.cropping_offset = 10\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'inkml' / 'iamondb.inkml')\n\n structures: List[Dict[str, Any]] = uim_schema_semantics_from(ink_model, \"custom\")\n print_tree(structures)\n with Path(\"iamondb.uim\").open(\"wb\") as file:\n file.write(UIMEncoder310().encode(ink_model))\n```\n\nThe implementation is provided as a sample and may require additional configuration and testing to work with other datasets.\nWith the `register_type` method, the parser can be configured to map the annotation types to the segmentation schema defined in the WODL.\nThe `register_value` method can be used to map the annotation values to the content of the segmentation schema.\nNote, that this mapping may fully comply with the WODL schema, but it is a sample implementation and may require additional configuration or post-processing.\n\nThe sample document from the IAM On-Line Handwriting Database can't be uploaded to the repository due to the license restrictions.\n\n### Kondate\n\nThe implementation supports the [Kondate](https://web.tuat.ac.jp/~nakagawa/database/en/kondate_about.html) dataset as a sample dataset for testing the conversion of InkML to UIM.\n\n```python\nimport uuid\nfrom pathlib import Path\n\nfrom uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310\nfrom uim.model.ink import InkModel\nfrom uim.model.inkdata.brush import BrushPolygonUri, VectorBrush\nfrom uim.model.semantics.schema import SegmentationSchema, CommonViews\n\nfrom uim.codec.parser.inkml import InkMLParser\n\nif __name__ == '__main__':\n parser: InkMLParser = InkMLParser()\n # Add a brush specified with shape Uris\n bpu_1: BrushPolygonUri = BrushPolygonUri(\"will://brush/3.0/shape/Circle?precision=20&radius=1\", min_scale=0.)\n bpu_2: BrushPolygonUri = BrushPolygonUri(\"will://brush/3.0/shape/Circle?precision=20&radius=0.5\", min_scale=4.)\n poly_uris: list = [\n bpu_1, bpu_2\n ]\n vector_brush_1: VectorBrush = VectorBrush(\n \"app://qa-test-app/vector-brush/MyEllipticBrush\",\n poly_uris)\n parser.register_brush(brush_uri='default', brush=vector_brush_1)\n parser.use_brush = 'default'\n device_id: str = uuid.uuid4().hex\n parser.update_default_context(sample_rate=80, serial_number=device_id, manufacturer=\"Test Manufacturer\",\n model=\"Test Model\")\n parser.content_view = CommonViews.HWR_VIEW.value\n parser.cropping_ink = True\n parser.default_annotation_type = SegmentationSchema.UNLABELED\n parser.default_xy_resolution = 10\n parser.default_position_precision = 3\n parser.default_value_resolution = 42\n # Kondate database is not using namespace\n parser.default_namespace = ''\n ink_model: InkModel = parser.parse(Path(__file__).parent / '..' / 'ink' / 'inkml' / 'kondate.inkml')\n with Path(\"kondate.uim\").open(\"wb\") as file:\n file.write(UIMEncoder310().encode(ink_model))\n```\n\nThe sample document from the Kondate dataset can't be uploaded to the repository due to the license restrictions.\n\n## IOT Paper Format \n\nThe format encodes the ink as InkML, but additionally it encodes a template image as base64.\n\n```xml\n<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<paper xmlns:inkml=\"http://www.w3.org/2003/InkML\"\n xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n xsi:schemaLocation=\"http://www.w3.org/2003/InkML\">\n <resource>\n <templateImage Content-Type=\"image/bmp\">\n <!-- Base64 encoded template -->\n </templateImage>\n </resource>\n <inkml:ink>\n <!-- Ink content encoded as InkML -->\n </inkml:ink>\n</paper>\n```\n\nThis sample implementation provides a way to convert the IOT Paper Format to UIM and extract the template image.\n\n```python\nfrom pathlib import Path\nfrom typing import List\n\nfrom uim.codec.parser.iotpaper import IOTPaperParser\nfrom uim.codec.writer.encoder.encoder_3_1_0 import UIMEncoder310\nfrom uim.model.helpers.serialize import json_encode, serialize_raw_sensor_data_csv\nfrom uim.model.ink import InkModel\nfrom uim.model.inkinput.inputdata import InkSensorType, Unit\n\nif __name__ == '__main__':\n paper_file: Path = Path(__file__).parent / '..' / '..' / 'ink' / 'iot' / 'HelloInk.paper'\n parser: IOTPaperParser = IOTPaperParser()\n\n parser.cropping_ink = False\n parser.cropping_offset = 10\n ink_model: InkModel = parser.parse(paper_file)\n img: bytes = parser.parse_template(paper_file)\n with Path(\"iot.uim\").open(\"wb\") as file:\n file.write(UIMEncoder310().encode(ink_model))\n with Path(\"template.bmp\").open(\"wb\") as file:\n file.write(img)\n layout: List[InkSensorType] = [\n InkSensorType.TIMESTAMP, InkSensorType.X, InkSensorType.Y, InkSensorType.Z,\n InkSensorType.PRESSURE, InkSensorType.ALTITUDE,\n InkSensorType.AZIMUTH\n ]\n # In the Universal Ink Model, the sensor data is in SI units:\n # - timestamp: seconds\n # - x, y, z: meters\n # - pressure: N\n serialize_raw_sensor_data_csv(ink_model, Path('sensor_data.csv'), layout)\n # If you want to convert the data to different units, you can use the following code:\n serialize_raw_sensor_data_csv(ink_model, Path('sensor_data_unit.csv'), layout,\n {\n InkSensorType.X: Unit.MM, # Convert meters to millimeters\n InkSensorType.Y: Unit.MM, # Convert meters to millimeters\n InkSensorType.Z: Unit.MM, # Convert meters to millimeters\n InkSensorType.TIMESTAMP: Unit.MS # Convert seconds to milliseconds\n })\n # Convert the model to JSON\n with open('ink.json', 'w') as f:\n # json_encode is a helper function to convert the model to JSON\n f.write(json_encode(ink_model))\n```\n\n### NOTICE\n\nThis implementation is a sample implementation and does not cover all possible cases of InkML files.\nThere is no guarantee that the implementation will work for all InkML files.\nAdditional testing and validation may be required to ensure the correctness of the implementation.\nFinally, the implementation is provided as-is and without any warranty or support.\n\n\n# Web Demos\nThe following web demos can be used to produce Universal Ink Model files: \n\n- [WILL SDK for ink - Demo](https://ink-demo.wacom.com/) - producing UIM 3.1.0 files.\n\n\n# Documentation\nYou can find more detailed technical documentation, [here](https://developer-docs.wacom.com/sdk-for-ink/docs/model).\nAPI documentation is available [here](docs/uim/index.md).\n\n# Usage\n\nThe library is used for machine learning experiments based on digital ink using the Universal Ink Model. \n\n\n# License\n[Apache License 2.0](LICENSE)\n\n\n\n",
"bugtrack_url": null,
"license": "Apache 2.0 License",
"summary": "Library to parse and write Universal Ink Model data files.",
"version": "2.1.0",
"project_urls": {
"Homepage": "https://github.com/Wacom-Developer/universal-ink-library"
},
"split_keywords": [
"universal",
"ink",
"model;digital",
"ink;wacom",
"ink",
"technologies"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "78437ff981d35f5d2617164a81e52ceade1938e405623e221379564ea7bc5771",
"md5": "33a986b33332870a344fdf3bffaafdd8",
"sha256": "d0a4f3fbb62a1a3abce22112fd31f2512c6594e35cb406c2c09bd350e813c292"
},
"downloads": -1,
"filename": "universal_ink_library-2.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "33a986b33332870a344fdf3bffaafdd8",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 200844,
"upload_time": "2024-09-25T11:11:10",
"upload_time_iso_8601": "2024-09-25T11:11:10.214702Z",
"url": "https://files.pythonhosted.org/packages/78/43/7ff981d35f5d2617164a81e52ceade1938e405623e221379564ea7bc5771/universal_ink_library-2.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "41b705838b82022cd9daf10c4d7c130f7d1881384c0b58161cc828f7dd0cacaa",
"md5": "36bbb4774a28a9c76ff5c37b0ba6b0e9",
"sha256": "01c6d82c0eb34f5fe6c16a755291bd728e153054125fbf3bb44d6ccc53747204"
},
"downloads": -1,
"filename": "universal_ink_library-2.1.0.tar.gz",
"has_sig": false,
"md5_digest": "36bbb4774a28a9c76ff5c37b0ba6b0e9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5033401,
"upload_time": "2024-09-25T11:11:12",
"upload_time_iso_8601": "2024-09-25T11:11:12.224534Z",
"url": "https://files.pythonhosted.org/packages/41/b7/05838b82022cd9daf10c4d7c130f7d1881384c0b58161cc828f7dd0cacaa/universal_ink_library-2.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-25 11:11:12",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Wacom-Developer",
"github_project": "universal-ink-library",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [],
"tox": true,
"lcname": "universal-ink-library"
}