tabpfn-common-utils


Name: tabpfn-common-utils
Version: 0.1.5
home_page: None
Summary: Utilities shared between TabPFN codebases
upload_time: 2025-09-05 09:49:26
maintainer: None
docs_url: None
author: None
requires_python: >=3.9
license: None
keywords: machine-learning, tabpfn, utilities
VCS: GitHub (priorlabs/tabpfn_common_utils)
bugtrack_url: None
requirements: No requirements were recorded.
Travis-CI: No Travis.
coveralls test coverage: No coveralls.

# TabPFN Common Utilities

Utilities shared between the codebases of [TabPFN](https://github.com/priorlabs/tabpfn), the foundation model for tabular data: telemetry, cost estimation, and data-processing helpers.

## Features

### 🔒 Privacy-First Telemetry System
- **Anonymous & Aggregated Data Collection**: Implements safe, GDPR-compliant telemetry that respects user privacy
- **Configurable Analytics**: Optional telemetry that can be disabled via environment variables
- **Usage Pattern Insights**: Tracks TabPFN usage patterns to improve the model and user experience
- **Zero Personal Data**: No personal information or sensitive data is collected or transmitted

### 💰 Cost Estimation
- **Resource Planning**: Accurate estimation of computational costs and duration for TabPFN predictions
- **Cloud Pricing**: Essential for resource planning in cloud-based TabPFN services
- **Task-Specific Calculations**: Different cost models for classification vs regression tasks

### 📊 Data Processing Utilities
- **Regression Results**: Comprehensive handling of prediction outputs with mean, median, mode, and quantiles
- **Data Serialization**: Convert between pandas DataFrames, NumPy arrays, and CSV formats
- **Dataset Management**: Load and preprocess standard ML datasets with proper train/test splits
- **Preprocessing Configuration**: Extensive options for data transformation strategies

## Installation

```bash
pip install tabpfn-common-utils
```

Or with uv:
```bash
uv add tabpfn-common-utils
```
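To confirm which release ended up in the environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; `installed_version` is a helper name introduced here for illustration and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None when the package is missing.
print(installed_version("tabpfn-common-utils"))
```

Returning `None` instead of raising keeps the check usable in setup scripts that only want to branch on whether the package is present.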

## Quick Start

### Telemetry (Privacy-Compliant)

```python
from tabpfn_common_utils.telemetry import ProductTelemetry

# Initialize telemetry service (anonymous, GDPR-compliant)
telemetry = ProductTelemetry()

# Track usage events (no personal data collected)
telemetry.capture(...)

# Telemetry can be disabled by setting an environment variable.
# Run this in the shell, not in Python:
#   export TABPFN_DISABLE_TELEMETRY=1
```
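The opt-out variable can also be set from Python when changing the shell environment is inconvenient, for example in a notebook. The sketch below assumes the flag is read when telemetry code first runs, so it sets the variable before anything from `tabpfn_common_utils` is imported; `telemetry_disabled` is a hypothetical helper that only mirrors the check and is not part of the package.

```python
import os

# Assumption: the flag has to be present before telemetry code runs,
# so set it before importing anything from tabpfn_common_utils.
os.environ["TABPFN_DISABLE_TELEMETRY"] = "1"


def telemetry_disabled() -> bool:
    """Hypothetical helper: report whether the opt-out flag is set to "1"."""
    return os.environ.get("TABPFN_DISABLE_TELEMETRY") == "1"


print(telemetry_disabled())
```

Setting the variable through `os.environ` affects only the current process and any child processes it starts.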

### Regression Results

```python
from tabpfn_common_utils.regression_pred_result import RegressionPredictResult

# Handle regression prediction results
result = RegressionPredictResult({
    "mean": [1.2, 2.3, 3.4],
    "median": [1.1, 2.2, 3.3],
    "mode": [1.0, 2.0, 3.0],
    "quantile_0.25": [0.9, 1.9, 2.9],
    "quantile_0.75": [1.5, 2.5, 3.5]
})

# Convert to basic representation for serialization
basic_repr = RegressionPredictResult.to_basic_representation(result)
```
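The purpose of a basic representation is that it survives serialization. The sketch below illustrates that idea with a plain dictionary and the standard `json` module. It does not use `RegressionPredictResult`; the helper `to_json_payload` and the assumption that the basic form is a dictionary of float lists are illustrative only.

```python
import json

# Plain-Python stand-in for a regression result: one list per statistic,
# one entry per test row. Key names are copied from the example above.
result = {
    "mean": [1.2, 2.3, 3.4],
    "median": [1.1, 2.2, 3.3],
    "mode": [1.0, 2.0, 3.0],
    "quantile_0.25": [0.9, 1.9, 2.9],
    "quantile_0.75": [1.5, 2.5, 3.5],
}


def to_json_payload(prediction: dict) -> str:
    """Hypothetical helper: coerce every value to a list of floats and dump to JSON."""
    basic = {key: [float(v) for v in values] for key, values in prediction.items()}
    return json.dumps(basic, sort_keys=True)


payload = to_json_payload(result)
restored = json.loads(payload)  # the round trip yields an equal dictionary
```

Coercing to built-in floats matters when the values arrive as NumPy scalars, which `json.dumps` cannot encode directly.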

### Data Utilities

```python
from tabpfn_common_utils.utils import get_example_dataset, serialize_to_csv_formatted_bytes
import pandas as pd

# Load example dataset
X_train, X_test, y_train, y_test = get_example_dataset("iris")

# Serialize data to CSV bytes
csv_bytes = serialize_to_csv_formatted_bytes(X_train)
```
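For readers who want to see what CSV-formatted bytes look like without installing the package, here is a standard-library sketch. `rows_to_csv_bytes` and `csv_bytes_to_rows` are hypothetical names, and the column names and values are made up for illustration; the package's own `serialize_to_csv_formatted_bytes` may format its output differently.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv_bytes(header: Sequence[str], rows: Iterable[Sequence[object]]) -> bytes:
    """Hypothetical helper: write a header and rows as UTF-8 encoded CSV."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def csv_bytes_to_rows(data: bytes) -> list:
    """Inverse of the helper above; every cell comes back as a string."""
    text = io.StringIO(data.decode("utf-8"), newline="")
    return list(csv.reader(text))


# Illustrative column names and values, not taken from a real dataset load.
csv_bytes = rows_to_csv_bytes(
    ["sepal_length", "sepal_width"],
    [[5.1, 3.5], [4.9, 3.0]],
)
```

CSV carries no type information, so a consumer has to convert the strings back to numbers after parsing.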

## Privacy & Compliance

This package implements **privacy-first telemetry** that:

- ✅ **GDPR Compliant**: No personal data collection
- ✅ **Anonymous Only**: No user identification or tracking
- ✅ **Aggregated Data**: Only statistical insights are collected
- ✅ **User Control**: Can be completely disabled
- ✅ **Transparent**: Open source code for full transparency

Telemetry data helps improve TabPFN but never compromises user privacy.

## Development

### Setup

```bash
# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate

# Run tests
uv run pytest

# Type checking
uv run pyright

# Linting (applies automatic fixes)
uv run ruff check --fix
```

### Adding Dependencies

```bash
# Add runtime dependency
uv add <package_name>

# Add development dependency
uv add --group dev <package_name>
```

## Contributing

Contributions are welcome! Please ensure all code passes type checking and formatting requirements.

## Links

- [TabPFN Main Repository](https://github.com/priorlabs/tabpfn)
- [Documentation](https://github.com/priorlabs/tabpfn_common_utils)
- [Issues](https://github.com/priorlabs/tabpfn_common_utils/issues)
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "tabpfn-common-utils",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "Prior Labs <hello@priorlabs.ai>",
    "keywords": "machine-learning, tabpfn, utilities",
    "author": null,
    "author_email": "Prior Labs <hello@priorlabs.ai>",
    "download_url": "https://files.pythonhosted.org/packages/44/88/38697306023717cbca0a25cdfdff63c88cefa23519dbc5a343265b7fd95c/tabpfn_common_utils-0.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Utilities shared between TabPFN codebases",
    "version": "0.1.5",
    "project_urls": {
        "Homepage": "https://github.com/priorlabs/tabpfn_common_utils",
        "Issues": "https://github.com/priorlabs/tabpfn_common_utils/issues"
    },
    "split_keywords": [
        "machine-learning",
        " tabpfn",
        " utilities"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5f591b29905f0b814e791d55486abed71cd399b9aceddc1ce504ee90610e7c22",
                "md5": "abc2a6b2e5c3be3ed1b840a0da3567a9",
                "sha256": "a56b5eea2388e7f8fe9bece365e54e73cfaeceb9b4c9edbdde57b4947d41f847"
            },
            "downloads": -1,
            "filename": "tabpfn_common_utils-0.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "abc2a6b2e5c3be3ed1b840a0da3567a9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 10419,
            "upload_time": "2025-09-05T09:49:21",
            "upload_time_iso_8601": "2025-09-05T09:49:21.680043Z",
            "url": "https://files.pythonhosted.org/packages/5f/59/1b29905f0b814e791d55486abed71cd399b9aceddc1ce504ee90610e7c22/tabpfn_common_utils-0.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "448838697306023717cbca0a25cdfdff63c88cefa23519dbc5a343265b7fd95c",
                "md5": "18af18539d44fefc22e950c8d6fb99cd",
                "sha256": "8b762753455812cd2eb938f46f0fdbdc7b2f69feaa3b4b4bf28888edea38ec29"
            },
            "downloads": -1,
            "filename": "tabpfn_common_utils-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "18af18539d44fefc22e950c8d6fb99cd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 1908199,
            "upload_time": "2025-09-05T09:49:26",
            "upload_time_iso_8601": "2025-09-05T09:49:26.378448Z",
            "url": "https://files.pythonhosted.org/packages/44/88/38697306023717cbca0a25cdfdff63c88cefa23519dbc5a343265b7fd95c/tabpfn_common_utils-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-05 09:49:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "priorlabs",
    "github_project": "tabpfn_common_utils",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "tabpfn-common-utils"
}
        