# adult-dataset
A PyTorch dataset wrapper for the
[Adult (Census Income)](https://archive.ics.uci.edu/dataset/2/adult) dataset.
Adult is a popular dataset in machine learning fairness research.
This package provides the `adult.Adult` class:
a `torch.utils.data.Dataset` that loads and, optionally, downloads the
Adult dataset.
It can be used like the `MNIST` dataset in
[torchvision](https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html?highlight=mnist#torchvision.datasets.MNIST).
Beyond `adult.Adult`, this package also provides `adult.AdultRaw`,
which works just like `adult.Adult`, but
neither standardizes the features in the dataset nor applies one-hot encoding.
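To make the difference between the two classes concrete, the following sketch shows what standardization and one-hot encoding do to a toy numeric column and a toy categorical column. It is an illustration only, not the package's implementation: the helper names `standardize` and `one_hot` are hypothetical, the sample values are made up, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a numeric column to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


def one_hot(values, categories):
    """Encode each categorical value as a 0/1 vector with a single 1."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


# Toy columns for illustration, not read from the real dataset files
ages = [25, 35, 45]
workclass = ["Private", "State-gov", "Private"]

print(standardize(ages))
print(one_hot(workclass, ["Private", "State-gov"]))
```

A dataset that skips both steps, as `adult.AdultRaw` is described to do, would leave `ages` and `workclass` in their original form.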
## Installation
```shell
pip install adult-dataset
```
## Basic Usage
```python
from adult import Adult
# load (if necessary, download) the Adult training dataset
train_set = Adult(root="datasets", download=True)
# load the test set
test_set = Adult(root="datasets", train=False, download=True)
inputs, target = train_set[0] # retrieve the first sample of the training set
# iterate over the training set
for inputs, target in train_set:
    ...  # Do something with a single sample
# use a PyTorch data loader
from torch.utils.data import DataLoader
loader = DataLoader(test_set, batch_size=32, shuffle=True)
for epoch in range(100):
    for inputs, targets in loader:
        ...  # Do something with a batch of samples
```
## Advanced Usage
Turn off status messages while downloading the dataset:
```python
Adult(root=..., output_fn=None)
```
Send the status messages emitted while downloading the dataset to the
`logging` module instead of writing them to `sys.stdout`:
```python
import logging
Adult(root=..., output_fn=logging.info)
```
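The two snippets above suggest that `output_fn` is any callable that accepts a status message, or `None` to stay silent. The sketch below illustrates that pattern without touching the real class: `fake_download` is a hypothetical stand-in, the single-string call signature and the `print` default are assumptions, and the logger name is made up. It also shows one practical detail of the `logging` route: the root logger defaults to the `WARNING` level, so `INFO` messages only appear once that level is enabled.

```python
import logging


def fake_download(output_fn=print):
    """Hypothetical stand-in for code that reports progress through output_fn."""
    for message in ("Downloading...", "Done."):
        if output_fn is not None:  # None silences all status messages
            output_fn(message)


# Collect status messages in a list instead of printing them
messages = []
fake_download(output_fn=messages.append)

# Route status messages through a named logger at INFO level.
# Loggers inherit WARNING from the root logger unless a level is set.
logger = logging.getLogger("adult_download")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
fake_download(output_fn=logger.info)
```

With the real class, the same callables would presumably be passed as `Adult(root=..., output_fn=messages.append)` or `Adult(root=..., output_fn=logger.info)`.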
## Raw data
{
"_id": null,
"home_page": null,
"name": "adult-dataset",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "PyTorch,dataset,Adult,Census Income",
"author": null,
"author_email": "David Boetius <david.boetius@uni-konstanz.de>",
"download_url": "https://files.pythonhosted.org/packages/da/48/b8bd172b5d03cf8b4f0197881db5a931ecbc9ab9396061efc68b50b5cc39/adult_dataset-2.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "PyTorch dataset wrapper for the",
"version": "2.1.0",
"project_urls": {
"Bug Tracker": "https://github.com/cherrywoods/adult-dataset/issues",
"Homepage": "https://github.com/cherrywoods/adult-dataset",
"Repository": "https://github.com/cherrywoods/adult-dataset.git"
},
"split_keywords": [
"pytorch",
"dataset",
"adult",
"census income"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "64a2e664226ea4e3a6735008a759d618024109da108e4cf755fb310efef25cb9",
"md5": "afb9d1f4f5e0f9630b56aa43dc5f7591",
"sha256": "27e9b4b3ce2d81a8298cc26713f69780ad0fc16ee1c68d6efe743d7e13bc200e"
},
"downloads": -1,
"filename": "adult_dataset-2.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "afb9d1f4f5e0f9630b56aa43dc5f7591",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 8438,
"upload_time": "2024-01-30T17:23:09",
"upload_time_iso_8601": "2024-01-30T17:23:09.470996Z",
"url": "https://files.pythonhosted.org/packages/64/a2/e664226ea4e3a6735008a759d618024109da108e4cf755fb310efef25cb9/adult_dataset-2.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "da48b8bd172b5d03cf8b4f0197881db5a931ecbc9ab9396061efc68b50b5cc39",
"md5": "326753df1b083532e36be90055d38eb3",
"sha256": "d41490a57eb4ca1d0271c34a2dde39a1747fb484ea5a3baa1aa5d91d8f025179"
},
"downloads": -1,
"filename": "adult_dataset-2.1.0.tar.gz",
"has_sig": false,
"md5_digest": "326753df1b083532e36be90055d38eb3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 10552,
"upload_time": "2024-01-30T17:23:47",
"upload_time_iso_8601": "2024-01-30T17:23:47.381183Z",
"url": "https://files.pythonhosted.org/packages/da/48/b8bd172b5d03cf8b4f0197881db5a931ecbc9ab9396061efc68b50b5cc39/adult_dataset-2.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-30 17:23:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cherrywoods",
"github_project": "adult-dataset",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "adult-dataset"
}