# mirdata
Common loaders for Music Information Retrieval (MIR) datasets. Find the API documentation [here](https://mirdata.readthedocs.io/).
![CI status](https://github.com/mir-dataset-loaders/mirdata/actions/workflows/ci.yml/badge.svg)
![Formatting status](https://github.com/mir-dataset-loaders/mirdata/actions/workflows/formatting.yml/badge.svg)
![Linting status](https://github.com/mir-dataset-loaders/mirdata/actions/workflows/lint-python.yml/badge.svg)
[![codecov](https://codecov.io/gh/mir-dataset-loaders/mirdata/branch/master/graph/badge.svg)](https://codecov.io/gh/mir-dataset-loaders/mirdata)
[![Documentation Status](https://readthedocs.org/projects/mirdata/badge/?version=latest)](https://mirdata.readthedocs.io/en/latest/?badge=latest)
![GitHub](https://img.shields.io/github/license/mir-dataset-loaders/mirdata.svg)
This library provides tools for working with common MIR datasets, including tools for:
* downloading datasets to a common location and format
* validating that the files for a dataset are all present
* loading annotation files to a common format, consistent with the format required by [mir_eval](https://github.com/craffel/mir_eval)
* parsing track level metadata for detailed evaluations
### Installation
To install, simply run:
```bash
pip install mirdata
```
### Quick example
```python
import mirdata
orchset = mirdata.initialize('orchset')
orchset.download() # download the dataset
orchset.validate() # validate that all the expected files are there
example_track = orchset.choice_track() # choose a random example track
print(example_track) # see the available data
```
See the [documentation](https://mirdata.readthedocs.io/) for more examples and the API reference.
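
As a follow-up to the validation step above, here is a minimal sketch of how the result of `validate()` could be turned into a pass/fail check. It assumes that `validate()` accepts a `verbose` flag and returns two (possibly nested) dictionaries, one for missing files and one for files with invalid checksums. The helper names `count_entries` and `check_dataset` are hypothetical and not part of mirdata.

```python
def count_entries(report):
    """Count leaf entries in a (possibly nested) validation report dict."""
    total = 0
    for value in report.values():
        if isinstance(value, dict):
            total += count_entries(value)
        else:
            total += 1
    return total


def check_dataset(dataset):
    """Return True when validation reports nothing missing or corrupt.

    Assumption: `dataset.validate()` returns a pair of dicts
    (missing files, files with invalid checksums).
    """
    missing_files, invalid_checksums = dataset.validate(verbose=False)
    n_missing = count_entries(missing_files)
    n_invalid = count_entries(invalid_checksums)
    print(f"{n_missing} missing, {n_invalid} with invalid checksums")
    return n_missing == 0 and n_invalid == 0


# Usage (requires mirdata and a downloaded dataset):
#   import mirdata
#   check_dataset(mirdata.initialize('orchset'))
```

A check like this may be useful at the top of an experiment script, so that a run stops early if the local copy of a dataset is incomplete.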
### Currently supported datasets
Supported datasets include [AcousticBrainz](https://zenodo.org/record/2553414#.X8jTgulKhhE), [DALI](https://github.com/gabolsgabs/DALI), [Guitarset](http://github.com/marl/guitarset/), [MAESTRO](https://magenta.tensorflow.org/datasets/maestro), [TinySOL](https://www.orch-idea.org/), among many others.
For the **complete list** of supported datasets, see the [documentation](https://mirdata.readthedocs.io/en/stable/source/quick_reference.html).
### Citing
There are two ways of citing mirdata:
If you are using the library for your work, please cite the version you used as indexed at Zenodo:
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4355859.svg)](https://doi.org/10.5281/zenodo.4355859)
If you refer to mirdata's design principles or motivation, please cite the following [paper](https://zenodo.org/record/3527750#.X-Inp5NKhUI):
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3527750.svg)](https://doi.org/10.5281/zenodo.3527750)
```
"mirdata: Software for Reproducible Usage of Datasets"
Rachel M. Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, and Thor Kell
in International Society for Music Information Retrieval (ISMIR) Conference, 2019
```
```
@inproceedings{bittner_fuentes_2019,
  title={mirdata: Software for Reproducible Usage of Datasets},
  author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
  booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
  year={2019}
}
```
When working with datasets, please cite the version of `mirdata` that you are using (given by the `DOI` above) **AND** include the reference for the dataset itself, which is available from the respective dataset loader through its `cite()` method.
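
To illustrate the `cite()` method mentioned above, the sketch below collects a dataset's reference as a string, for example to write it into an experiment log. It assumes that `cite()` prints the reference to standard output instead of returning it. The helper name `citation_text` is hypothetical and not part of mirdata.

```python
import contextlib
import io


def citation_text(dataset):
    """Capture whatever `dataset.cite()` prints and return it as a string.

    Assumption: `cite()` writes the reference to stdout.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        dataset.cite()
    return buffer.getvalue().strip()


# Usage (requires mirdata):
#   import mirdata
#   print(citation_text(mirdata.initialize('orchset')))
```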
### Contributing a new dataset loader
We welcome contributions to this library, especially new datasets. Please see [contributing](https://mirdata.readthedocs.io/en/latest/source/contributing.html) for guidelines.