b10-tcache


Name: b10-tcache
Version: 0.2.2
Home page: https://docs.baseten.co/development/model/b10-tcache
Summary: Distributed PyTorch compilation cache for Baseten - Environment-aware, lock-free compilation cache management
Upload time: 2025-08-14 19:26:03
Maintainer: Fred Liu
Author: Shounak Ray
Requires Python: <4.0,>=3.9
License: MIT
Keywords: pytorch, torch.compile, cache, machine-learning, inference
            https://www.notion.so/ml-infra/mega-base-cache-24291d247273805b8e20fe26677b7b0f

# B10 TCache

PyTorch compilation cache for Baseten deployments.

## Usage

```python
import torch

import b10_tcache

# Inside your model's load() function
def load():
    # Load the cache before calling torch.compile()
    cache_loaded = b10_tcache.load_compile_cache()

    # ... (create `model` here)

    # Your model compilation
    model = torch.compile(model)
    # Warm up the model with dummy prompts and with the arguments your
    # requests typically use (e.g., resolutions)
    dummy_input = "What is the capital of France?"
    model(dummy_input)

    # ...

    # Save the cache after compilation, unless one was already loaded
    if not cache_loaded:
        b10_tcache.save_compile_cache()
```

## Configuration

Configure via environment variables:

```bash
# Cache directories
export TORCH_CACHE_DIR="/tmp/torchinductor_root"      # Default
export B10FS_CACHE_DIR="/cache/model/compile_cache"   # Default  
export LOCAL_WORK_DIR="/app"                          # Default

# Cache limits
export MAX_CACHE_SIZE_MB="1024"                       # 1GB default
```
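As an illustration of how these variables and their defaults fit together, here is a minimal sketch that resolves them in Python. The `CacheConfig` class and `read_cache_config` function are hypothetical names for this example and are not part of the `b10_tcache` API; only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical container for the documented settings."""
    torch_cache_dir: str
    b10fs_cache_dir: str
    local_work_dir: str
    max_cache_size_mb: int


def read_cache_config(env=None) -> CacheConfig:
    """Resolve settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return CacheConfig(
        torch_cache_dir=env.get("TORCH_CACHE_DIR", "/tmp/torchinductor_root"),
        b10fs_cache_dir=env.get("B10FS_CACHE_DIR", "/cache/model/compile_cache"),
        local_work_dir=env.get("LOCAL_WORK_DIR", "/app"),
        # Environment values are strings, so the size limit is parsed as an integer.
        max_cache_size_mb=int(env.get("MAX_CACHE_SIZE_MB", "1024")),
    )
```

Calling `read_cache_config()` with no argument reads `os.environ`; passing a plain dict makes the function easy to exercise in tests.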

## How It Works

### Environment-Specific Caching

The library automatically creates unique cache keys based on your environment:

```
torch-2.1.0_cuda-12.1_cc-8.6_triton-2.1.0 → cache_a1b2c3d4e5f6.latest.tar.gz
torch-2.0.1_cuda-11.8_cc-7.5_triton-2.0.1 → cache_x9y8z7w6v5u4.latest.tar.gz
torch-2.1.0_cpu_triton-none                → cache_m1n2o3p4q5r6.latest.tar.gz
```

**Components used:**
- **PyTorch version** (e.g., `torch-2.1.0`)
- **CUDA version** (e.g., `cuda-12.1` or `cpu`)
- **GPU compute capability** (e.g., `cc-8.6` for A10G)
- **Triton version** (e.g., `triton-2.1.0` or `triton-none`)
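The mapping from components to a cache filename can be sketched as follows. This is an assumption-laden illustration, not the library's implementation: the function names `environment_key` and `cache_filename` are hypothetical, and the use of a truncated SHA-256 digest is a guess at how a short, stable identifier could be derived.

```python
import hashlib
from typing import Optional


def environment_key(
    torch_version: str,
    cuda_version: Optional[str],
    compute_capability: Optional[str],
    triton_version: Optional[str],
) -> str:
    """Join the four components in the same layout as the examples above."""
    parts = [f"torch-{torch_version}"]
    if cuda_version is None:
        # CPU-only environments have no CUDA version or compute capability.
        parts.append("cpu")
    else:
        parts.append(f"cuda-{cuda_version}")
        if compute_capability is not None:
            parts.append(f"cc-{compute_capability}")
    parts.append(f"triton-{triton_version or 'none'}")
    return "_".join(parts)


def cache_filename(key: str) -> str:
    """Derive a short, deterministic filename from the key (hashing scheme assumed)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"cache_{digest}.latest.tar.gz"


# environment_key("2.1.0", "12.1", "8.6", "2.1.0")
# returns "torch-2.1.0_cuda-12.1_cc-8.6_triton-2.1.0"
```

In a real environment the inputs could be read from `torch.__version__`, `torch.version.cuda`, and `torch.cuda.get_device_capability()`. Because the key changes whenever any component changes, a cache built on one GPU or PyTorch version is never loaded on another.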

### Cache Workflow

1. **Load Phase** (startup): Generate environment key, check for matching cache in B10FS, extract to local directory
2. **Save Phase** (after compilation): Create archive, atomic copy to B10FS with environment-specific filename
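The two phases above can be sketched with the standard library. The function names `save_phase` and `load_phase` are hypothetical, and this simplified version omits the environment key and the atomic copy step; it only shows the archive-and-extract round trip under the assumption that the cache is a `.tar.gz` of the local compile directory.

```python
import tarfile
from pathlib import Path


def save_phase(local_cache_dir: Path, archive_path: Path) -> None:
    """Pack every file under local_cache_dir into a gzip-compressed tar archive."""
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(local_cache_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the cache root so extraction is relocatable.
                tar.add(path, arcname=path.relative_to(local_cache_dir).as_posix())


def load_phase(archive_path: Path, local_cache_dir: Path) -> bool:
    """Extract the archive if it exists; return False on a cache miss."""
    if not archive_path.is_file():
        return False
    local_cache_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where the running Python supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(local_cache_dir, **kwargs)
    return True
```

Returning a boolean from the load step mirrors the usage pattern shown earlier, where the save is skipped when a cache was already loaded.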

### Lock-Free Race Prevention

Cache saves use a journal pattern built on atomic filesystem operations, so parallel saves are safe without locks.
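The library's journal format is not described here, so the following is only a sketch of the general write-then-rename technique that lock-free saves commonly rely on. The function name `publish_atomically` and the `.incomplete` suffix are assumptions for this example. Each writer copies to its own uniquely named temporary file in the destination directory, then swaps it into place with `os.replace`, which is atomic when source and destination are on the same filesystem.

```python
import os
import shutil
import uuid
from pathlib import Path


def publish_atomically(src: Path, dest: Path) -> None:
    """Copy src to dest so that readers never observe a partially written file."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # A unique temporary name per writer means concurrent writers never share a file.
    tmp = dest.with_name(f"{dest.name}.{uuid.uuid4().hex}.incomplete")
    try:
        shutil.copyfile(src, tmp)
        # The rename either fully succeeds or leaves the previous dest untouched.
        os.replace(tmp, dest)
    finally:
        # Clean up the temporary file if the copy or rename failed.
        tmp.unlink(missing_ok=True)
```

If several replicas finish compiling at the same time, the last rename wins and every intermediate state is still a complete archive, which is why no lock is required.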

## API Reference

### Functions

- `load_compile_cache() -> bool`: Load cache from B10FS for current environment
- `save_compile_cache() -> bool`: Save cache to B10FS with environment-specific filename
- `clear_local_cache() -> bool`: Clear local cache directory
- `get_cache_info() -> Dict[str, Any]`: Get cache status information for current environment
- `list_available_caches() -> Dict[str, Any]`: List all cache files with environment details

### Exceptions

- `CacheError`: Base exception for cache operations
- `CacheValidationError`: Path validation or compatibility check failed
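One way to use these exceptions is to treat any cache failure as a cache miss, so a broken cache never blocks model startup. The helper below is a hypothetical sketch: it assumes `CacheError` is exposed as an attribute of the `b10_tcache` module and that `load_compile_cache()` may raise it, neither of which is confirmed by the reference above. The module is passed in as a parameter to keep the example self-contained.

```python
import logging

logger = logging.getLogger(__name__)


def load_cache_best_effort(tcache) -> bool:
    """Try to load the compile cache; fall back to a cold compile on cache errors.

    `tcache` is expected to be the imported b10_tcache module (assumption:
    it exposes `CacheError` and `load_compile_cache` at the top level).
    """
    try:
        return bool(tcache.load_compile_cache())
    except tcache.CacheError as exc:
        # CacheValidationError is assumed to derive from CacheError,
        # since CacheError is documented as the base exception.
        logger.warning("Compile cache unavailable, compiling from scratch: %s", exc)
        return False
```

Inside `load()`, this would be called as `cache_loaded = load_cache_best_effort(b10_tcache)`.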

## Performance Impact

### Debugging

Enable debug logging:

```python
import logging
logging.getLogger('b10_tcache').setLevel(logging.DEBUG)
```
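Setting the level alone produces no output unless a handler accepts debug records, because Python's fallback handler only emits warnings and above. A minimal sketch that also attaches a handler follows; the helper name `enable_debug_logging` is hypothetical, and the logger name `b10_tcache` is taken from the snippet above.

```python
import logging
import sys


def enable_debug_logging(logger_name: str = "b10_tcache", stream=None) -> logging.Logger:
    """Set the named logger to DEBUG and attach a stream handler that shows its records."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(sys.stderr if stream is None else stream)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Call `enable_debug_logging()` once at startup, before the cache is loaded. If the application already configures logging with `logging.basicConfig(level=logging.DEBUG)`, the extra handler is unnecessary.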

            

Raw data

            {
    "_id": null,
    "home_page": "https://docs.baseten.co/development/model/b10-tcache",
    "name": "b10-tcache",
    "maintainer": "Fred Liu",
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": "fred.liu.noreply@baseten.co",
    "keywords": "pytorch, torch.compile, cache, machine-learning, inference",
    "author": "Shounak Ray",
    "author_email": "shounak.noreply@baseten.co",
    "download_url": "https://files.pythonhosted.org/packages/ec/45/0c2dbbd4dd539541e1b6787e35322780eb816c626dd7e7b8460a57ff54b4/b10_tcache-0.2.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Distributed PyTorch compilation cache for Baseten - Environment-aware, lock-free compilation cache management",
    "version": "0.2.2",
    "project_urls": {
        "Documentation": "https://docs.baseten.co/development/model/b10-tcache",
        "Homepage": "https://docs.baseten.co/development/model/b10-tcache",
        "Repository": "https://pypi.org/project/b10-tcache/"
    },
    "split_keywords": [
        "pytorch",
        " torch.compile",
        " cache",
        " machine-learning",
        " inference"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "01714823495c35d4274c826a5a65a332bb1fece78241f4edf80b6b504b375249",
                "md5": "7cfc71fe3f1f08f5961d199cd4538150",
                "sha256": "95df4147967355f0edbf87a35751f96b346e4e44852164c2e18e76ce2d5a4d76"
            },
            "downloads": -1,
            "filename": "b10_tcache-0.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7cfc71fe3f1f08f5961d199cd4538150",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 19123,
            "upload_time": "2025-08-14T19:26:01",
            "upload_time_iso_8601": "2025-08-14T19:26:01.978770Z",
            "url": "https://files.pythonhosted.org/packages/01/71/4823495c35d4274c826a5a65a332bb1fece78241f4edf80b6b504b375249/b10_tcache-0.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ec450c2dbbd4dd539541e1b6787e35322780eb816c626dd7e7b8460a57ff54b4",
                "md5": "e3122d7ebfd0501dba6c4ae998511221",
                "sha256": "5db50d406e5fb5cc5b22c97f1314b8bec00aa7991746046ca4439a4e17b003ae"
            },
            "downloads": -1,
            "filename": "b10_tcache-0.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "e3122d7ebfd0501dba6c4ae998511221",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 16166,
            "upload_time": "2025-08-14T19:26:03",
            "upload_time_iso_8601": "2025-08-14T19:26:03.134614Z",
            "url": "https://files.pythonhosted.org/packages/ec/45/0c2dbbd4dd539541e1b6787e35322780eb816c626dd7e7b8460a57ff54b4/b10_tcache-0.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-14 19:26:03",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "b10-tcache"
}
        