spear-python


| Field | Value |
| --- | --- |
| Name | spear-python |
| Version | 0.1.0 |
| Summary | SPEAR: Structured Primitives for Efficient Architecture Research |
| Upload time | 2025-10-13 16:49:57 |
| Author | Radical Numerics Inc. |
| Requires Python | >=3.10 |
| Keywords | cuda, kernels, linear-algebra, machine-learning, pytorch |

<p align="center">
  <img width=500 alt="Spear Logo" src="https://raw.githubusercontent.com/RadicalNumerics/assets/refs/heads/main/svg/spear-logo.svg" />
</p>


SPEAR (Structured Primitives for Efficient Architecture Research) is a collection of kernels for AI model architectures, developed by Radical Numerics.


## Installation

You may use PyPI to install SPEAR:

```bash
pip install spear-python
```

Note that installation compiles the kernels for your specific GPU architecture, which takes a few minutes.


Alternatively, clone the repository and install the package locally in development mode:

```bash
git clone https://github.com/radicalnumerics/spear.git && cd spear # clone the repository
uv venv && source .venv/bin/activate # virtual env with uv (recommended)
uv pip install -e '.[dev]' # install in development mode
```


### Caching

The build uses `ccache` by default. To take advantage of it and speed up recompilation (see the tip in the [vLLM docs](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#set-up-using-python-only-build-without-compilation:~:text=%2De%20.-,Tip,-Building%20from%20source)), run:

```bash
CCACHE_NOHASHDIR="true" uv pip install --no-build-isolation -e '.[dev]'
```


## Quick Start

```python
import torch
from spear.nn.phalanx import Phalanx

device = "cuda:0" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

dim = 512  # Must be divisible by 16 (head_dim is fixed at 16)
length = 128
batch_size = 1024
layer = Phalanx(dim=dim, length=length, dtype=dtype).to(device)

x = torch.randn(batch_size, length, dim, dtype=dtype, device=device)
y = layer(x)
print(f"Input: {x.shape} -> Output: {y.shape}")
```
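
The divisibility constraint noted in the comment above can be checked before a layer is built. The following is a minimal sketch in plain Python and is not part of SPEAR's API: `HEAD_DIM` and `num_heads` are hypothetical names, and treating `dim // 16` as the number of heads is an assumption based only on the comment that `head_dim` is fixed at 16.

```python
HEAD_DIM = 16  # fixed head size, as stated in the Quick Start comment


def num_heads(dim: int, head_dim: int = HEAD_DIM) -> int:
    """Return dim // head_dim (assumed head count), or raise if dim is invalid."""
    if dim <= 0 or dim % head_dim != 0:
        raise ValueError(f"dim={dim} must be a positive multiple of {head_dim}")
    return dim // head_dim


print(num_heads(512))  # 32
```

Calling `num_heads(500)` raises a `ValueError`, which surfaces a bad `dim` before any kernel is launched.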

## Development

We include pre-commit hooks for linting and formatting (Python, C++, CUDA). To install:

```bash
uv run pre-commit install
```

To run the hooks manually on all files (they also run automatically on every commit, so this is optional):

```bash
uv run pre-commit run --all-files
```

To run the tests:

```bash
uv run pytest
```

## Structure

```
csrc/        # kernels: CUDA/C++ or other DSLs
spear/
├─ ops/      # low-level wrappers per op family
│  └─ <op>/
└─ nn/       # layers built from ops (parametrized)
   └─ <layer>/
```


## Target Architectures

Currently supported hardware includes compute capabilities 9.0 (Hopper) and 10.0 (Blackwell).

| Kernel Name | (NVIDIA) sm9.0 | (NVIDIA) sm10.0 | (NVIDIA) sm10.3 |
| ----------------- | :-----: | :-----: | :-----: |
| `swr.btp.fwd.bf16.bdl.hd16-bl16.sm90` | ✔︎ | ~ | ⛔ |
| `swr.btp.bwd.bf16.bdl.hd16-bl16.sm90` | ✔︎ | ~ | ⛔ |

* ✔︎: optimized
* ~: working but not fully optimized
* ⛔: not available
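
To see where a given GPU falls in the table above, you can compare its compute capability against the listed columns. This is a hedged sketch rather than a SPEAR utility: `SUPPORT`, `support_status`, and `current_gpu_status` are hypothetical names, and the mapping is transcribed by hand from the table. The only library call used is PyTorch's `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple.

```python
# Support levels transcribed from the table above; keys are (major, minor).
SUPPORT = {
    (9, 0): "optimized",
    (10, 0): "working but not fully optimized",
    (10, 3): "not available",
}


def support_status(capability):
    """Return the table's support level for a compute capability pair."""
    return SUPPORT.get(tuple(capability), "not listed")


def current_gpu_status(device_index=0):
    """Look up the local GPU's status; return None without torch or CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return support_status(torch.cuda.get_device_capability(device_index))


if __name__ == "__main__":
    print(current_gpu_status())
```

Capabilities that do not appear in the table are reported as "not listed" rather than guessed.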


---

<p align="center">
  <img width=350 alt="Radical Numerics Logo" src="https://raw.githubusercontent.com/RadicalNumerics/assets/refs/heads/main/svg/rn-logo-desktop-vector-animated.svg" />
</p>


            

        