# TorchData (see note below on current status)
[**What is TorchData?**](#what-is-torchdata) | [**Stateful DataLoader**](#stateful-dataloader) |
[**Install guide**](#installation) | [**Contributing**](#contributing) | [**License**](#license)
**:warning: June 2024 Status Update: Removing DataPipes and DataLoader V2**
**We are re-focusing the torchdata repo to be an iterative enhancement of torch.utils.data.DataLoader. We do not plan on
continuing development or maintaining the [`DataPipes`] and [`DataLoaderV2`] solutions, and they will be removed from
the torchdata repo. We'll also be revisiting the `DataPipes` references in pytorch/pytorch. In release
`torchdata==0.8.0` (July 2024) they will be marked as deprecated, and in 0.9.0 (Oct 2024) they will be deleted. Existing
users are advised to pin to `torchdata==0.8.0` or an older version until they are able to migrate away. Subsequent
releases will not include DataPipes or DataLoaderV2. The old version of this README is
[available here](https://github.com/pytorch/data/blob/v0.7.1/README.md). Please reach out if you have suggestions or comments
(please use [#1196](https://github.com/pytorch/data/issues/1196) for feedback).**
## What is TorchData?
The TorchData project is an iterative enhancement to the PyTorch torch.utils.data.DataLoader and
torch.utils.data.Dataset/IterableDataset to make them scalable, performant dataloading solutions. We will be iterating
on the enhancements under [the torchdata repo](torchdata).
Our first change adds checkpointing to torch.utils.data.DataLoader. It is available in
[stateful_dataloader, a drop-in replacement for torch.utils.data.DataLoader](torchdata/stateful_dataloader), which defines
`load_state_dict` and `state_dict` methods that enable mid-epoch checkpointing, along with an API for users to track custom
iteration progress and other custom state from the dataloader workers, such as token buffers and/or RNG states.
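To make the idea of tracking custom iteration progress concrete, here is a minimal pure-Python sketch of an iterable that exposes `state_dict` and `load_state_dict`. The class name `CountingStream` and its `position` field are hypothetical and exist only for illustration. The sketch shows the shape of the protocol, not how torchdata implements it; see the stateful_dataloader page for the actual behavior.

```python
class CountingStream:
    """Hypothetical iterable that remembers how many items it has yielded."""

    def __init__(self, items):
        self.items = list(items)
        self.position = 0  # custom iteration progress we want to checkpoint

    def __iter__(self):
        # Resume from the stored position instead of always starting at 0.
        while self.position < len(self.items):
            item = self.items[self.position]
            self.position += 1
            yield item

    def state_dict(self):
        # Return a plain dict so the state is easy to serialize.
        return {"position": self.position}

    def load_state_dict(self, state):
        # Restore progress captured by an earlier state_dict() call.
        self.position = state["position"]


# Consume part of the stream, snapshot it, then resume in a fresh object.
stream = CountingStream(range(6))
iterator = iter(stream)
consumed = [next(iterator) for _ in range(3)]
snapshot = stream.state_dict()

resumed = CountingStream(range(6))
resumed.load_state_dict(snapshot)
remaining = list(resumed)
```

Keeping the state as a plain dictionary means it can be stored next to a model checkpoint with whatever serialization the training job already uses.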
## Stateful DataLoader
`torchdata.stateful_dataloader.StatefulDataLoader` is a drop-in replacement for torch.utils.data.DataLoader which
provides state_dict and load_state_dict functionality. See
[the Stateful DataLoader main page](torchdata/stateful_dataloader) for more information and examples. Also check out the
examples
[in this Colab notebook](https://colab.research.google.com/drive/1tonoovEd7Tsi8EW8ZHXf0v3yHJGwZP8M?usp=sharing).
## Installation
### Version Compatibility
The following table lists the `torchdata` version that corresponds to each `torch` release, along with the supported Python versions.
| `torch` | `torchdata` | `python` |
| -------------------- | ------------------ | ----------------- |
| `master` / `nightly` | `main` / `nightly` | `>=3.9`, `<=3.12` |
| `2.4.0` | `0.8.0` | `>=3.8`, `<=3.12` |
| `2.0.0` | `0.6.0` | `>=3.8`, `<=3.11` |
| `1.13.1` | `0.5.1` | `>=3.7`, `<=3.10` |
| `1.12.1` | `0.4.1` | `>=3.7`, `<=3.10` |
| `1.12.0` | `0.4.0` | `>=3.7`, `<=3.10` |
| `1.11.0` | `0.3.0` | `>=3.7`, `<=3.10` |
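If you want to check the pairing programmatically, the released rows of the table can be written as a small lookup. The mapping below is copied from the table; the function name `matching_torchdata` is hypothetical, and stripping a local build tag such as `+cu121` is an assumption about how the `torch` version string may look.

```python
from typing import Optional

# Released torch -> torchdata pairs, copied from the table above.
TORCH_TO_TORCHDATA = {
    "2.4.0": "0.8.0",
    "2.0.0": "0.6.0",
    "1.13.1": "0.5.1",
    "1.12.1": "0.4.1",
    "1.12.0": "0.4.0",
    "1.11.0": "0.3.0",
}


def matching_torchdata(torch_version: str) -> Optional[str]:
    """Return the torchdata version listed for a torch release, if any."""
    # Drop a local build tag (for example "+cu121") before the lookup.
    base_version = torch_version.split("+", 1)[0]
    return TORCH_TO_TORCHDATA.get(base_version)


paired = matching_torchdata("2.4.0+cu121")
unknown = matching_torchdata("9.9.9")
```

A result of `None` only means the release is not listed in the table, not that no compatible `torchdata` exists.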
### Local pip or conda
First, set up an environment. We will be installing a PyTorch binary as well as torchdata. If you're using conda, create
a conda environment:
```bash
conda create --name torchdata
conda activate torchdata
```
If you wish to use `venv` instead:
```bash
python -m venv torchdata-env
source torchdata-env/bin/activate
```
Install torchdata:
Using pip:
```bash
pip install torchdata
```
Using conda:
```bash
conda install -c pytorch torchdata
```
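To confirm which version ended up in the environment, one option is to query the installed package metadata from Python. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed torchdata version, or None if it is not installed.
print(installed_version("torchdata"))
```

Reading the metadata avoids importing the package itself, so the check works even when the import would fail for unrelated reasons.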
### From source
```bash
pip install .
```
If building TorchData from source fails, install the nightly version of PyTorch by following the guide linked on the
[contributing page](CONTRIBUTING.md#install-pytorch-nightly).
### From nightly
A nightly version of TorchData is also provided and is updated daily from the main branch.
Using pip:
```bash
pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
Using conda:
```bash
conda install torchdata -c pytorch-nightly
```
## Contributing
We welcome PRs! See the [CONTRIBUTING](CONTRIBUTING.md) file.
## Beta Usage and Feedback
We'd love to hear from and work with early adopters to shape our designs. Please reach out by raising an issue if you're
interested in using this tooling for your project.
## License
TorchData is BSD licensed, as found in the [LICENSE](LICENSE) file.