toon-py


| Field | Value |
|-------|-------|
| Name | toon-py |
| Version | 1.0.1 |
| Summary | Token-Oriented Object Notation: A compact format for passing structured data to LLMs with 30-60% fewer tokens than JSON |
| upload_time | 2025-10-26 21:45:32 |
| requires_python | >=3.10 |
| license | MIT |
| keywords | ai, json, llm, prompt-engineering, serialization, tokens, toon |
| requirements | No requirements were recorded. |

# toon-py

**Token-Oriented Object Notation (TOON) for Python**

A compact, human-readable format for passing structured data to LLMs with **30-60% fewer tokens** than JSON.

Python port of [@byjohann/toon](https://github.com/johannschopplich/toon).

## Why TOON?

LLM tokens cost money. TOON reduces token usage by:
- Removing redundant punctuation (braces, brackets, most quotes)
- Using indentation for structure
- Tabularizing arrays of objects
- Writing arrays of primitives inline, with no spaces after the delimiters

## Installation

### As a CLI tool

For standalone CLI usage:

```bash
# Using uv (recommended - installs in isolated environment)
uv tool install toon-py

# Using pip (installs CLI in current Python environment)
pip install toon-py
```

### As a Python library

To use in your Python project:

```bash
# Using uv (adds to project dependencies)
uv add toon-py

# Using pip (installs library + CLI in current environment)
pip install toon-py
```

## Quick Start

### Python API

```python
from toon_py import encode

data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True
    }
}

print(encode(data))
```

**Output:**

```
user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true
```

### CLI

```bash
# From file
toon data.json

# From stdin
cat data.json | toon

# From string
toon '{"tags": ["foo", "bar"]}'

# With options
toon data.json --delimiter tab --length-marker -o output.toon
```

## Token Savings

| Example | JSON Tokens | TOON Tokens | Saved | Reduction |
|---------|-------------|-------------|-------|-----------|
| Simple user | 31 | 18 | 13 | **41.9%** |
| User with tags | 48 | 28 | 20 | **41.7%** |
| Product catalog | 117 | 49 | 68 | **58.1%** |
| API response | 123 | 53 | 70 | **56.9%** |
| Analytics data | 209 | 94 | 115 | **55.0%** |
| Large dataset (50 records) | 2159 | 762 | 1397 | **64.7%** |
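
The Saved and Reduction columns follow directly from the two token counts. A minimal sketch of that arithmetic, using a hypothetical helper named `token_reduction` (it is not part of toon-py):

```python
def token_reduction(json_tokens: int, toon_tokens: int) -> tuple[int, float]:
    """Return (tokens saved, percent reduction rounded to one decimal)."""
    if json_tokens <= 0:
        raise ValueError("json_tokens must be positive")
    saved = json_tokens - toon_tokens
    return saved, round(100 * saved / json_tokens, 1)


# Token counts taken from the first and last rows of the table above.
print(token_reduction(31, 18))     # (13, 41.9)
print(token_reduction(2159, 762))  # (1397, 64.7)
```

The token counts themselves depend on the tokenizer used, which the table does not specify.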

## Features

### Objects

```python
encode({"id": 1, "name": "Ada"})
```

```
id: 1
name: Ada
```

### Primitive Arrays (Inline)

```python
encode({"tags": ["admin", "ops", "dev"]})
```

```
tags[3]: admin,ops,dev
```

### Arrays of Objects (Tabular)

```python
encode({
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
})
```

```
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
```
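
To make the tabular layout concrete, here is a rough sketch of how such output could be assembled by hand. The `render_table` function is a hypothetical illustration, not toon-py's implementation: it assumes a non-empty list of flat dicts with the same keys, supports only the comma delimiter, and does no quoting or escaping.

```python
def render_table(key: str, rows: list[dict], indent: int = 2) -> str:
    """Render a uniform list of flat dicts as a TOON-style table (illustration only)."""
    fields = list(rows[0].keys())  # column order taken from the first row
    # Header line: key, row count in brackets, field names in braces.
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    pad = " " * indent
    # One line per row, values in header order, joined without spaces.
    lines = [pad + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


items = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]
print(render_table("items", items))
```

For this input the sketch prints the same three lines as the example above. Because each field name appears once in the header instead of once per row, the savings grow with the number of rows.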

### Encoding Options

```python
from toon_py import encode, EncodeOptions

data = {"items": [{"id": 1, "name": "Widget"}]}

# Tab delimiter
options = EncodeOptions(delimiter="\t")
print(encode(data, options))

# Pipe delimiter
options = EncodeOptions(delimiter="|")
print(encode(data, options))

# Length marker
options = EncodeOptions(length_marker="#")
print(encode(data, options))
# Output: items[#1]{id,name}: ...

# Custom indent
options = EncodeOptions(indent=4)
print(encode(data, options))
```

## CLI Options

```
toon [INPUT] [OPTIONS]

Arguments:
  INPUT                 JSON file, JSON string, or stdin

Options:
  -i, --indent INT      Spaces per indent level (default: 2)
  -d, --delimiter TEXT  Delimiter: comma, tab, or pipe (default: comma)
  -l, --length-marker   Add '#' prefix to array lengths
  -o, --output PATH     Output file (default: stdout)
  --help                Show help message
```

## Format Rules

### Quoting

Keys and values are quoted only when necessary:

```python
# Unquoted
{"name": "hello world"}  # -> name: hello world

# Quoted (contains comma)
{"note": "hello, world"}  # -> note: "hello, world"

# Quoted (looks like number)
{"code": "123"}  # -> code: "123"

# Quoted (key with space)
{"full name": "Ada"}  # -> "full name": Ada
```
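
The four cases above can be approximated with a small heuristic. This is a simplified sketch that covers only the cases shown here; `quote_value` and `quote_key` are hypothetical names, and the library's actual rules may handle more situations (escaping, other delimiters, reserved words).

```python
import re

# Assumption: "looks like a number" means an optional sign, digits,
# an optional fraction, and an optional exponent.
_NUMBER_LIKE = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")


def quote_value(value: str, delimiter: str = ",") -> str:
    """Quote a string value if it contains the delimiter or looks numeric."""
    needs_quotes = delimiter in value or _NUMBER_LIKE.fullmatch(value) is not None
    return f'"{value}"' if needs_quotes else value


def quote_key(key: str) -> str:
    """Quote a key if it contains a space."""
    return f'"{key}"' if " " in key else key


print(quote_value("hello world"))   # hello world
print(quote_value("hello, world"))  # "hello, world"
print(quote_value("123"))           # "123"
print(quote_key("full name"))       # "full name"
```

Quoting number-like strings keeps the string `"123"` distinguishable from the integer `123` in the encoded output.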

### Tabular Format

Arrays of objects use tabular format when:
- All elements are objects
- All objects have identical keys
- All values are primitives (no nested arrays/objects)

```python
encode({
    "users": [
        {"id": 1, "name": "Alice", "active": True},
        {"id": 2, "name": "Bob", "active": False}
    ]
})
```

```
users[2]{id,name,active}:
  1,Alice,true
  2,Bob,false
```
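
The three conditions can be expressed as a short predicate. The sketch below is illustrative only: `is_tabular` is a hypothetical helper, and it assumes that "identical keys" means the same set of keys regardless of order and that an empty list does not count as tabular.

```python
def is_tabular(items: list) -> bool:
    """Check whether a list meets the three tabular-format conditions."""
    # Condition 1: all elements are objects (dicts); empty lists are excluded.
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    first_keys = set(items[0])
    for item in items:
        # Condition 2: every object has the same keys as the first one.
        if set(item) != first_keys:
            return False
        # Condition 3: no nested arrays or objects among the values.
        if any(isinstance(v, (dict, list, tuple)) for v in item.values()):
            return False
    return True


print(is_tabular([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]))  # True
print(is_tabular([{"id": 1}, {"id": 2, "name": "Bob"}]))                   # False
print(is_tabular([{"id": 1, "tags": ["a", "b"]}]))                         # False
```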

### Empty Containers

```python
encode({})              # -> (empty output)
encode({"items": []})   # -> items[0]:
encode({"config": {}})  # -> config:
```

## Type Conversions

| Python Value | TOON Output |
|-------------|-------------|
| `None` | `null` |
| `True`/`False` | `true`/`false` |
| `123` | `123` |
| `-0.0` | `0` |
| `float('nan')` | `null` |
| `float('inf')` | `null` |
| `datetime(...)` | `"2025-01-01T00:00:00Z"` |
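
The conversions in this table can be sketched as a single function. This is a hypothetical illustration of the mapping (`normalize_scalar` is not a toon-py function), and it makes a few assumptions beyond the table: `-inf` and `0.0` are treated like `inf` and `-0.0`, and datetimes are assumed to be timezone-aware.

```python
import math
from datetime import datetime, timezone


def normalize_scalar(value) -> str:
    """Map a Python scalar to its TOON text form, following the table above."""
    if value is None:
        return "null"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return "null"  # assumption: -inf is handled like inf
        if value == 0:
            return "0"  # -0.0 per the table; 0.0 treated the same by assumption
        return repr(value)
    if isinstance(value, int):
        return str(value)
    if isinstance(value, datetime):
        # Assumes a timezone-aware datetime; converted to UTC, emitted quoted.
        utc = value.astimezone(timezone.utc)
        return '"' + utc.strftime("%Y-%m-%dT%H:%M:%SZ") + '"'
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(normalize_scalar(None))          # null
print(normalize_scalar(-0.0))          # 0
print(normalize_scalar(float("nan")))  # null
print(normalize_scalar(datetime(2025, 1, 1, tzinfo=timezone.utc)))  # "2025-01-01T00:00:00Z"
```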

## Use in LLM Prompts

Wrap TOON data in code blocks:

````markdown
Here's the data in TOON format:

```
user:
  id: 123
  tags[2]: reading,gaming
  active: true
```

Please analyze this data...
````
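
Assembling such a prompt in code might look like the sketch below. `build_prompt` is a hypothetical helper; it takes TOON text that has already been produced (for example by `encode`) and wraps it in a fenced block between an introduction and an instruction.

```python
def build_prompt(toon_text: str, instruction: str) -> str:
    """Wrap TOON text in a code fence and append an instruction."""
    fence = "`" * 3  # built at runtime so this example nests cleanly in Markdown
    return "\n".join([
        "Here's the data in TOON format:",
        "",
        fence,
        toon_text.strip("\n"),
        fence,
        "",
        instruction,
    ])


toon_text = "user:\n  id: 123\n  tags[2]: reading,gaming\n  active: true"
print(build_prompt(toon_text, "Please analyze this data..."))
```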

## Development

```bash
# Clone and setup
git clone https://github.com/shammianand/toon-py.git
cd toon-py
uv sync --all-extras

# Run tests
uv run pytest

# Format and lint code
uv run black src/
uv run ruff check src/
```

## License

MIT License - see [LICENSE](LICENSE)

## Credits

Python port of [@byjohann/toon](https://github.com/johannschopplich/toon) by [Johann Schopplich](https://github.com/johannschopplich)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "toon-py",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "ai, json, llm, prompt-engineering, serialization, tokens, toon",
    "author": null,
    "author_email": "Shammi Anand <shammianand101@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/8e/a8/251b07b805c23a3b05adbee82c938c1aa147fa2432684d91df6ebb545c8e/toon_py-1.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Token-Oriented Object Notation: A compact format for passing structured data to LLMs with 30-60% fewer tokens than JSON",
    "version": "1.0.1",
    "project_urls": null,
    "split_keywords": [
        "ai",
        " json",
        " llm",
        " prompt-engineering",
        " serialization",
        " tokens",
        " toon"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ccc1c3b9398d585b3251051b3456a07604357f6514ed38593b1d65dd81fcb814",
                "md5": "27a9af499d17d284f8295d277f610bb9",
                "sha256": "4b26ab4b12ee52f8fcbb71fa9d34feaa4d0035656c7171b690dba967922c35ee"
            },
            "downloads": -1,
            "filename": "toon_py-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "27a9af499d17d284f8295d277f610bb9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 8745,
            "upload_time": "2025-10-26T21:45:31",
            "upload_time_iso_8601": "2025-10-26T21:45:31.181225Z",
            "url": "https://files.pythonhosted.org/packages/cc/c1/c3b9398d585b3251051b3456a07604357f6514ed38593b1d65dd81fcb814/toon_py-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8ea8251b07b805c23a3b05adbee82c938c1aa147fa2432684d91df6ebb545c8e",
                "md5": "0851fe4165d8275fd71e9f76fd80e5f7",
                "sha256": "12895a2d05fad18dfc33185f6b8912238008ec602365286ce17b6f1231baad0a"
            },
            "downloads": -1,
            "filename": "toon_py-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0851fe4165d8275fd71e9f76fd80e5f7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 23183,
            "upload_time": "2025-10-26T21:45:32",
            "upload_time_iso_8601": "2025-10-26T21:45:32.317335Z",
            "url": "https://files.pythonhosted.org/packages/8e/a8/251b07b805c23a3b05adbee82c938c1aa147fa2432684d91df6ebb545c8e/toon_py-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-26 21:45:32",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "toon-py"
}
        