dataset-with-logits


- Name: dataset-with-logits
- Version: 0.2.9
- Summary: PyTorch datasets with pre-computed model logits for efficient research
- Upload time: 2025-08-08 01:24:21
- Requires Python: >=3.8
- License: MIT
- Keywords: pytorch, imagenet, cifar, dataset, computer-vision, machine-learning, knowledge-distillation, deep-learning, logits, pretrained-models
- Requirements: none recorded

# Dataset with Logits

A PyTorch package for loading computer vision datasets paired with pre-computed model logits. Perfect for knowledge distillation, model analysis, and efficient research workflows.

## 🚀 Quick Start

```bash
pip install dataset-with-logits
```

```python
import torchvision.transforms as transforms
from dataset_with_logits import ImageNet

# Define transforms
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Create dataset (auto-downloads predictions)
dataset = ImageNet(
    root='/path/to/imagenet/val',
    model='resnet18',
    transform=transform,
    auto_download=True
)

# Use with DataLoader
from torch.utils.data import DataLoader
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for images, labels, logits in loader:
    # images: [batch_size, 3, 224, 224] 
    # labels: [batch_size] - ground truth
    # logits: [batch_size, 1000] - model predictions
    break
```

## 📊 Available Models

### ImageNet-1K
- `resnet18` - ResNet-18 (11.7M parameters)
- `resnet50` - ResNet-50 (25.6M parameters)  
- `resnet152` - ResNet-152 (60.2M parameters)
- `vit_l_16` - Vision Transformer Large (304M parameters)
- `mobilenet_v3_small` - MobileNet V3 Small (2.5M parameters)
- `mobilenet_v3_large` - MobileNet V3 Large (5.5M parameters)

More models and datasets coming soon!

## 🎯 Use Cases

### Knowledge Distillation
```python
import torch.nn.functional as F

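def knowledge_distillation_loss(student_logits, teacher_logits, labels, temperature=3.0):
    # Soften both distributions with the same temperature.
    student_soft = F.log_softmax(student_logits / temperature, dim=1)
    teacher_soft = F.softmax(teacher_logits / temperature, dim=1)
    # KL(teacher || student), averaged over the batch.
    # `labels` is accepted but unused: this is the soft-target term only.
    return F.kl_div(student_soft, teacher_soft, reduction='batchmean')

# In your training loop (`dataloader`, `student_model`, and `optimizer` are defined elsewhere)
for images, labels, teacher_logits in dataloader:
    student_logits = student_model(images)
    loss = knowledge_distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()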
```
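
To see what the temperature does in the loss above, here is a small dependency-free sketch that computes the softened distributions and the KL divergence by hand. The logit values are made up for illustration, and the helper names `softmax` and `kl_divergence` are not part of this package.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of a list of floats at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits for a 3-class problem (illustrative values only).
teacher_logits = [4.0, 1.0, 0.0]
student_logits = [2.0, 1.5, 0.5]

for t in (1.0, 3.0):
    teacher = softmax(teacher_logits, t)
    student = softmax(student_logits, t)
    print(f"T={t}: teacher={[round(p, 3) for p in teacher]} "
          f"KL={kl_divergence(teacher, student):.4f}")
```

A higher temperature flattens the teacher distribution, so the student also receives signal from the non-top classes.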

### Model Analysis
```python
from dataset_with_logits import ImageNet

# Compare different models on the same validation images
imagenet_path = '/path/to/imagenet/val'
models = ['resnet18', 'resnet152', 'vit_l_16']
datasets = {}

for model in models:
    datasets[model] = ImageNet(root=imagenet_path, model=model)

# Analyze prediction differences, calibration, etc.
```
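
One simple comparison is per-model accuracy plus how often two models agree on the top-1 class. The sketch below assumes you have already collected the labels and each model's logits into plain Python lists; `top1` and `compare_predictions` are hypothetical helpers, not part of this package, and the toy data is made up.

```python
def top1(logits):
    """Index of the largest logit (the predicted class)."""
    return max(range(len(logits)), key=logits.__getitem__)

def compare_predictions(labels, logits_a, logits_b):
    """Return each model's accuracy and the fraction of matching top-1 predictions."""
    n = len(labels)
    preds_a = [top1(row) for row in logits_a]
    preds_b = [top1(row) for row in logits_b]
    return {
        "acc_a": sum(p == y for p, y in zip(preds_a, labels)) / n,
        "acc_b": sum(p == y for p, y in zip(preds_b, labels)) / n,
        "agreement": sum(a == b for a, b in zip(preds_a, preds_b)) / n,
    }

# Toy data: 4 samples, 3 classes (illustrative values only).
labels = [0, 1, 2, 1]
logits_a = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.9, 0.2, 0.4], [0.1, 2.2, 0.3]]
logits_b = [[1.2, 0.3, 0.1], [0.4, 0.2, 1.1], [0.1, 0.3, 2.0], [0.2, 1.9, 0.5]]

print(compare_predictions(labels, logits_a, logits_b))
```

Two models can have the same accuracy and still disagree on many samples, which is why agreement is reported separately.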

## 🔧 Advanced Usage

### List Available Models
```python
from dataset_with_logits import list_available_models

models = list_available_models()
print(models)
# {'imagenet1k': {'resnet18': 'ResNet-18 (11.7M parameters)', ...}}
```

### Custom Cache Directory
```python
from dataset_with_logits import ImageNet

dataset = ImageNet(
    root='/path/to/imagenet',
    model='resnet18',
    cache_dir='/custom/cache/dir',
    auto_download=True
)
```

### Version Control
```python
from dataset_with_logits import ImageNet

dataset = ImageNet(
    root='/path/to/imagenet',
    model='resnet18',
    version='v0.1.0',  # Specific version
    auto_download=True
)
```

## 📁 File Format

Prediction files are in CSV format with the following columns:
- `id`: Image filename (no extension)
- `label`: Ground truth class index  
- `logits`: Semicolon-separated model outputs

Example:
```csv
id,label,logits
ILSVRC2012_val_00000001,65,-2.3;1.7;0.2;...;0.8
ILSVRC2012_val_00000002,970,0.1;-1.2;3.4;...;-0.5
```
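
If you want to read a prediction file without going through the dataset class, the layout above can be parsed with the standard library alone. This is a minimal sketch: `parse_prediction_rows` is a hypothetical helper, and the sample rows are shortened to four logits each for readability, whereas real ImageNet-1K files carry 1000 values per row.

```python
import csv
import io

def parse_prediction_rows(text):
    """Yield (id, label, logits) tuples from CSV text in the documented layout."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        # The logits column packs all model outputs into one semicolon-separated field.
        logits = [float(v) for v in row["logits"].split(";")]
        yield row["id"], int(row["label"]), logits

# Shortened sample in the same shape as the example above.
sample = """id,label,logits
ILSVRC2012_val_00000001,65,-2.3;1.7;0.2;0.8
ILSVRC2012_val_00000002,970,0.1;-1.2;3.4;-0.5
"""

rows = list(parse_prediction_rows(sample))
for image_id, label, logits in rows:
    print(image_id, label, len(logits))
```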

## 🌐 Data Source

Prediction files are automatically downloaded from **Hugging Face Hub** (primary) with GitHub fallback. Files are cached locally after first download.

**Hosting Infrastructure:**
- 🤗 **Primary**: [Hugging Face Datasets](https://huggingface.co/datasets/ViGeng/prediction-datasets) - Fast, reliable, academic-friendly
- 🐙 **Fallback**: GitHub Releases - For redundancy
- 📦 **Multi-backend**: Automatic fallback ensures high availability
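
The cache-then-fallback behaviour described above can be pictured roughly as follows. This is only a sketch of the general pattern, not the package's actual implementation: `fetch_with_fallback`, the stand-in fetchers, and the file name `resnet18.csv` are all hypothetical, and no real network calls are made.

```python
import tempfile
from pathlib import Path

def fetch_with_fallback(filename, cache_dir, sources):
    """Return a cached file path, trying each source in order on a cache miss.

    `sources` is a list of (name, fetch) pairs, where fetch(filename)
    returns the file content as bytes or raises an exception.
    """
    cache_path = Path(cache_dir) / filename
    if cache_path.exists():
        return cache_path  # cache hit: no source is contacted
    errors = []
    for name, fetch in sources:
        try:
            data = fetch(filename)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue  # try the next source
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_bytes(data)
        return cache_path
    raise RuntimeError("all sources failed: " + "; ".join(errors))

# Stand-in fetchers that simulate an unavailable primary and a working fallback.
def failing_primary(filename):
    raise ConnectionError("primary unavailable")

def working_fallback(filename):
    return b"id,label,logits\n"

with tempfile.TemporaryDirectory() as tmp:
    sources = [("primary", failing_primary), ("fallback", working_fallback)]
    path = fetch_with_fallback("resnet18.csv", tmp, sources)
    print(path.name, path.read_bytes())
```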

## 🔍 Examples

See the `examples/` directory for:
- Basic usage
- Knowledge distillation
- Model comparison
- Advanced workflows

## 📦 Installation

### From PyPI (Recommended)
```bash
pip install dataset-with-logits
```

### From Source
```bash
git clone https://github.com/ViGeng/predictions-on-datasets.git
cd predictions-on-datasets/dataset_with_logits
pip install -e .
```

## 🤝 Contributing

Contributions are welcome! See the main repository for contribution guidelines.

## 📄 License

MIT License - see LICENSE file for details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "dataset-with-logits",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "pytorch, imagenet, cifar, dataset, computer-vision, machine-learning, knowledge-distillation, deep-learning, logits, pretrained-models",
    "author": null,
    "author_email": "ViGeng <your.email@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/a7/37/598d02a3a33796e362c6c68b54d72be1345de5344043ebd42a60ee204a49/dataset_with_logits-0.2.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "PyTorch datasets with pre-computed model logits for efficient research",
    "version": "0.2.9",
    "project_urls": {
        "Bug Tracker": "https://github.com/ViGeng/predictions-on-datasets/issues",
        "Documentation": "https://github.com/ViGeng/predictions-on-datasets#readme",
        "Homepage": "https://github.com/ViGeng/predictions-on-datasets",
        "Repository": "https://github.com/ViGeng/predictions-on-datasets"
    },
    "split_keywords": [
        "pytorch",
        " imagenet",
        " cifar",
        " dataset",
        " computer-vision",
        " machine-learning",
        " knowledge-distillation",
        " deep-learning",
        " logits",
        " pretrained-models"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2de12a3b21e6c0f0e753aca34b65dcfb6d85df518abe6843011d8f4041d947b1",
                "md5": "583823b932e96f525b37bb989f3748a0",
                "sha256": "f27724d2e22bc7e9f90fc5007b36335155eaae0bf246d527d32b6132d01a4942"
            },
            "downloads": -1,
            "filename": "dataset_with_logits-0.2.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "583823b932e96f525b37bb989f3748a0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 13805,
            "upload_time": "2025-08-08T01:24:20",
            "upload_time_iso_8601": "2025-08-08T01:24:20.519948Z",
            "url": "https://files.pythonhosted.org/packages/2d/e1/2a3b21e6c0f0e753aca34b65dcfb6d85df518abe6843011d8f4041d947b1/dataset_with_logits-0.2.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a737598d02a3a33796e362c6c68b54d72be1345de5344043ebd42a60ee204a49",
                "md5": "74b72bf52c17d4fde61f8c671dffe04a",
                "sha256": "84e0d10735133d14c19e9f2f7c4a0acc8306765e8275163c4a8741b90c6e69ab"
            },
            "downloads": -1,
            "filename": "dataset_with_logits-0.2.9.tar.gz",
            "has_sig": false,
            "md5_digest": "74b72bf52c17d4fde61f8c671dffe04a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 15579,
            "upload_time": "2025-08-08T01:24:21",
            "upload_time_iso_8601": "2025-08-08T01:24:21.803082Z",
            "url": "https://files.pythonhosted.org/packages/a7/37/598d02a3a33796e362c6c68b54d72be1345de5344043ebd42a60ee204a49/dataset_with_logits-0.2.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-08 01:24:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ViGeng",
    "github_project": "predictions-on-datasets",
    "github_not_found": true,
    "lcname": "dataset-with-logits"
}
        