pydanttention

Name	pydanttention
Version	0.1.2
home_page
Summary	Transformer model attention in Pydantic.
upload_time	2023-09-17 16:24:22
maintainer
docs_url	None
author
requires_python	>=3.10
license	MIT
keywords	pydantic
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# pydanttention

![PyPI](https://img.shields.io/pypi/v/pydanttention?logo=python&logoColor=%23cccccc)
[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/lmmx/pydanttention/master.svg)](https://results.pre-commit.ci/latest/github/lmmx/pydanttention/master)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/pydanttention.svg)](https://pypi.org/project/pydanttention)

<!-- [![build status](https://github.com/lmmx/pydanttention/actions/workflows/master.yml/badge.svg)](https://github.com/lmmx/pydanttention/actions/workflows/master.yml) -->

Transformer model attention in Pydantic.

Adapted from the source by Theia Vogel (MIT licensed, included here as `vogel_manual_transformer.py`):

- [I made a transformer by hand (no training!)](https://vgel.me/posts/handmade-transformer/) (2023)

That post in turn uses model ops from [picoGPT](https://github.com/jaymody/picoGPT/blob/main/gpt2.py) (MIT licensed).

## Motivation

Rewriting AI model source code as Pydantic data models is an interesting exercise. I'd note the following benefits:

- All operations can be subclassed from a common `Operation` base model (see `.models.ops.base`),
  i.e. they all expect their first argument to be a NumPy array `x`. This naturally lets you
  factor your code around a category of 'operations'.

- Every function becomes a class: a Pydantic data model with type-annotated fields for input state
  in place of funcdef args/kwargs. Classes are conventionally named in `PascalCase`, whereas functions
  (like all other Python variables) are conventionally named in `snake_case`, so case alone shows
  where significant operations are called. References to the data model stand out too (as `self.{field}`),
  which keeps these 2 types of data access distinct from the intermediate variables. This gives a better
  sense at a glance of data flow through your program.

- State can be configured at runtime but also given defaults at import time through use of fields in
  the data model. The original source code hardcoded values in the config as module globals
  (similarly to using class variables), so it was not possible to configure component parts at runtime.
  This was appropriate for an expository demo, but made the code difficult to approach as a reader
  wishing to modify and experiment (likewise, code is easier to test if it is easier to configure at runtime).

- Clear and consolidated declarations of input data (i.e. not scattered across many sites of declaration)
  without losing the ability to decompose it into structured components. The original code used primitive types
  (lists of dictionaries) for the attention blocks; here these became model field defaults in a self-contained
  module (see `.models.config`). Since Pydantic allows you to load ("validate") typed data models from these
  primitive types, we could supply the original dictionary primitive to `AttentionBlock.model_validate` and
  it'd still work (but doing so is actually more verbose than just constructing the model class directly).
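
The points above can be sketched roughly as follows. This is an illustration only, not the package's actual code: it assumes Pydantic v2 (where `ConfigDict` and `model_validate` are available) and NumPy. Only the `Operation` name and its `x` argument come from the description above; the `compute` method, the `Softmax` class and its `axis` field, and `ToyAttentionBlock` with its `n_heads` and `scale` fields are hypothetical names chosen for the example.

```python
import numpy as np
from pydantic import BaseModel, ConfigDict


class Operation(BaseModel):
    """Hypothetical base model: every operation takes an array `x` first."""

    # NumPy arrays are not validated natively by Pydantic, so opt in to
    # arbitrary types (this only performs an isinstance check on `x`).
    model_config = ConfigDict(arbitrary_types_allowed=True)

    x: np.ndarray


class Softmax(Operation):
    """Softmax as a model: `axis` has a default but can be set at runtime."""

    axis: int = -1

    def compute(self) -> np.ndarray:
        # Subtract the max before exponentiating for numerical stability.
        shifted = self.x - np.max(self.x, axis=self.axis, keepdims=True)
        exp_x = np.exp(shifted)
        return exp_x / np.sum(exp_x, axis=self.axis, keepdims=True)


class ToyAttentionBlock(BaseModel):
    """Hypothetical stand-in for a block config; the fields are invented."""

    n_heads: int = 1
    scale: float = 1.0


# PascalCase marks the operation call; the default axis is used here.
probs = Softmax(x=np.array([[1.0, 2.0, 3.0]])).compute()

# The same operation, reconfigured at runtime through a field.
probs_by_column = Softmax(x=np.eye(2), axis=0).compute()

# Loading from a primitive dict and constructing directly are equivalent.
block_from_dict = ToyAttentionBlock.model_validate({"n_heads": 2, "scale": 0.5})
block_direct = ToyAttentionBlock(n_heads=2, scale=0.5)
```

In this sketch the dictionary route and the direct constructor produce equal models, which is why constructing the class directly tends to be the shorter option.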

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "pydanttention",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "pydantic",
    "author": "",
    "author_email": "Louis Maddox <louismmx@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/52/c4/8b0d5efa6089770f22ef4613ec26d6ddfaca1a77571e0279e0bf30c334ae/pydanttention-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Transformer model attention in Pydantic.",
    "version": "0.1.2",
    "project_urls": {
        "Homepage": "https://github.com/lmmx/pydanttention",
        "Repository": "https://github.com/lmmx/pydanttention.git"
    },
    "split_keywords": [
        "pydantic"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7ffa62986ff64578eca0bb9957cff487efc348baebefcd3bb1f867e3d2a1670f",
                "md5": "91b2b350cd777d7c41b7528c6be55c2d",
                "sha256": "8668d6ca0ba4cdb7ea71744b81ff15b1c054054f5f862f7a3a30557226c9714d"
            },
            "downloads": -1,
            "filename": "pydanttention-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "91b2b350cd777d7c41b7528c6be55c2d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 12557,
            "upload_time": "2023-09-17T16:24:20",
            "upload_time_iso_8601": "2023-09-17T16:24:20.598498Z",
            "url": "https://files.pythonhosted.org/packages/7f/fa/62986ff64578eca0bb9957cff487efc348baebefcd3bb1f867e3d2a1670f/pydanttention-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "52c48b0d5efa6089770f22ef4613ec26d6ddfaca1a77571e0279e0bf30c334ae",
                "md5": "d177b64b69f529a5200872fd284b6a46",
                "sha256": "1275c31aaa5a912c2d489a8077f8832787013cc27751e841db90864107cb56a7"
            },
            "downloads": -1,
            "filename": "pydanttention-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "d177b64b69f529a5200872fd284b6a46",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 10213,
            "upload_time": "2023-09-17T16:24:22",
            "upload_time_iso_8601": "2023-09-17T16:24:22.096546Z",
            "url": "https://files.pythonhosted.org/packages/52/c4/8b0d5efa6089770f22ef4613ec26d6ddfaca1a77571e0279e0bf30c334ae/pydanttention-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-17 16:24:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "lmmx",
    "github_project": "pydanttention",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pydanttention"
}