slakh-dataset

Name: slakh-dataset
Version: 0.1.19
Summary: Unofficial PyTorch dataset for Slakh
Author: Henrik Grønbech
License: MIT
Requires-Python: >=3.7,<4.0
Uploaded: 2021-05-09 18:04:37
Requirements: none recorded
            # Slakh PyTorch Dataset

Unofficial PyTorch dataset for [Slakh](http://www.slakh.com/).

This project is a work in progress, expect breaking changes!

## Roadmap

### Automatic music transcription (AMT) use case with audio and labels

- [x] Specify dataset split (`original`, `splits_v2`, `redux`)
- [x] Add new splits (`redux_no_pitch_bend`, ...) (Should also be filed upstream) (implemented by `skip_pitch_bend_tracks`)
- [x] Load audio `mix.flac` (all the instruments combined)
- [x] Load individual audio mixes (need to combine audio in a streaming fashion)
- [x] Specify `train`, `validation` or `test` group
- [x] Choose sequence length
- [x] Reproducible load sequences (useful for the validation group to get consistent results)
- [ ] Add more instruments (`electric-bass`, `piano`, `guitar`, ...)
- [x] Choose between having audio in memory or stream from disk (solved by `max_files_in_memory`)
- [x] Add to pip
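The "reproducible load sequences" item above can be sketched as a deterministic offset picker: seed an RNG from the track id and excerpt index, so the validation group always crops the same audio windows while training still samples freshly. This is a hypothetical helper, not the package's actual implementation:

```python
import random


def crop_offset(track_id: str, index: int, num_frames: int,
                sequence_length: int, validation: bool = False) -> int:
    """Pick the start frame of an audio excerpt of `sequence_length` frames.

    For validation, derive the offset deterministically from (track_id, index)
    so every epoch sees the same excerpts; for training, sample a fresh offset.
    """
    max_start = num_frames - sequence_length
    if validation:
        # A seeded RNG makes the same (track, index) always yield the same offset.
        rng = random.Random(f"{track_id}-{index}")
        return rng.randrange(max_start + 1)
    return random.randrange(max_start + 1)
```

Calling `crop_offset("Track00001", 0, ..., validation=True)` twice returns the same offset, which keeps validation metrics comparable between epochs.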

### Audio source separation use case with different audio mixes
- [ ] List to come


## Usage

1. Download the Slakh dataset (see the official [website](http://www.slakh.com/)). It's about 100 GB compressed, so expect the download to take a while.

2. Install the Python package with pip:
```bash
pip install slakh-dataset
```

3. Convert the audio to 16 kHz (see https://github.com/ethman/slakh-utils).
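   The slakh-utils repository linked above provides the conversion tooling; as a rough alternative for a single file, `ffmpeg` (if installed) can resample a FLAC to 16 kHz. The paths below are hypothetical:

   ```bash
   # -ar sets the output sample rate; adjust paths to your Slakh layout.
   ffmpeg -i Track00001/mix.flac -ar 16000 Track00001/mix_16k.flac
   ```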

4. Use the dataset (AMT use case):

```python
from torch.utils.data import DataLoader
from slakh_dataset import SlakhAmtDataset


dataset = SlakhAmtDataset(
    path='path/to/slakh-16khz-folder',
    split='redux',  # or 'splits_v2', 'redux-no-pitch-bend'
    audio='mix.flac',  # or 'individual'
    instrument='electric-bass',  # or use `midi_programs`
    # midi_programs=[33, 34, 35, 36, 37],
    groups=['train'],
    skip_pitch_bend_tracks=True,
    sequence_length=327680,
    max_files_in_memory=200,
)

batch_size = 8
loader = DataLoader(dataset, batch_size, shuffle=True, drop_last=True)

# train model on dataset...
```
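The `max_files_in_memory` argument above caps how many decoded audio files are kept in memory, streaming the rest from disk. A minimal sketch of such a cache (a hypothetical `AudioCache`, not the package's actual implementation):

```python
from collections import OrderedDict


class AudioCache:
    """Keep at most `max_files` decoded audio files in memory (LRU eviction)."""

    def __init__(self, max_files, load_fn):
        self.max_files = max_files
        self.load_fn = load_fn        # e.g. a function that decodes a FLAC file
        self._cache = OrderedDict()

    def get(self, path):
        if path in self._cache:
            self._cache.move_to_end(path)    # mark as most recently used
            return self._cache[path]
        audio = self.load_fn(path)           # read from disk on a cache miss
        self._cache[path] = audio
        if len(self._cache) > self.max_files:
            self._cache.popitem(last=False)  # evict the least recently used file
        return audio
```

With `max_files_in_memory=200`, at most 200 tracks stay resident; everything else is decoded on demand, trading throughput for a bounded memory footprint.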

## Acknowledgement

- This code is based on the dataset in [Onset and Frames](https://github.com/jongwook/onsets-and-frames) by Jong Wook Kim, which is MIT licensed.

- Slakh http://www.slakh.com/



            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "slakh-dataset",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Henrik Gr\u00f8nbech",
    "author_email": "henrikgronbech@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/ea/68/07785e3a4178cd53aa0fece70ddbf7951224bae55c0c50f362ba00dc7da6/slakh-dataset-0.1.19.tar.gz",
    "platform": "",
    "description": "# Slakh PyTorch Dataset\n\nUnofficial PyTorch dataset for [Slakh](http://www.slakh.com/).\n\nThis project is a work in progress, expect breaking changes!\n\n## Roadmap\n\n### Automatic music transcription (AMT) usecase with audio and labels\n\n- [x] Specify dataset split (`original`, `splits_v2`, `redux`)\n- [x] Add new splits (`redux_no_pitch_bend`, ...) (Should also be filed upstream) (implemented by `skip_pitch_bend_tracks`)\n- [x] Load audio `mix.flac` (all the instruments comined)\n- [x] Load individual audio mixes (need to combine audio in a streaming fashion)\n- [x] Specify `train`, `validation` or `test` group\n- [x] Choose sequence length\n- [x] Reproducable load sequences (usefull for validation group to get consistent results)\n- [ ] Add more instruments (`eletric-bass`, `piano`, `guitar`, ...)\n- [x] Choose between having audio in memory or stream from disk (solved by `max_files_in_memory`)\n- [x] Add to pip\n\n### Audio source separation usecase with different audio mixes\n- [ ] List to come\n\n\n## Usage\n\n1. Download the Slakh dataset (see the official [website](http://www.slakh.com/)). It's about 100GB compressed so expect using some time on this point.\n\n2. Install the Python package with pip:\n```bash\npip install slakh-dataset\n```\n\n3. Convert the audio to 16 kHz (see https://github.com/ethman/slakh-utils)\n\n4. 
You can use the dataset (AMT usecase):\n\n```python\nfrom torch.utils.data import DataLoader\nfrom slakh_dataset import SlakhAmtDataset\n\n\ndataset = SlakhAmtDataset(\n    path='path/to/slakh-16khz-folder'\n    split='redux', # 'splits_v2','redux-no-pitch-bend'\n    audio='mix.flac', # 'individual'\n    instrument='electric-bass', # or `midi_programs`\n    # midi_programs=[33, 34, 35, 36, 37],\n    groups=['train'],\n    skip_pitch_bend_tracks=True,\n    sequence_length=327680,\n    max_files_in_memory=200,\n)\n\nbatch_size = 8\nloader = DataLoader(dataset, batch_size, shuffle=True, drop_last=True)\n\n# train model on dataset...\n```\n\n## Acknowledgement\n\n- This code is based on the dataset in [Onset and Frames](https://github.com/jongwook/onsets-and-frames) by Jong Wook Kim which is MIT Lisenced.\n\n- Slakh http://www.slakh.com/\n\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Unofficial PyTorch dataset for Slakh",
    "version": "0.1.19",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "a004eab8ee504ed2da06d04b2d1fb8f6",
                "sha256": "e08b2cd789ad5a2ae199eb2a8c29025ac3fa33b70158211a756601cf5b768cfc"
            },
            "downloads": -1,
            "filename": "slakh_dataset-0.1.19-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a004eab8ee504ed2da06d04b2d1fb8f6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7,<4.0",
            "size": 49068,
            "upload_time": "2021-05-09T18:04:39",
            "upload_time_iso_8601": "2021-05-09T18:04:39.541336Z",
            "url": "https://files.pythonhosted.org/packages/9c/db/47fd531f61555bc1f951b94656863eee45f06480480551295bc58a784e5c/slakh_dataset-0.1.19-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "c70bfa995c63c2b1c22ddda36a34aaaf",
                "sha256": "fc661110eeb0831fa74724646bc45daa1f4518439c5f0dca33dd97e57224d582"
            },
            "downloads": -1,
            "filename": "slakh-dataset-0.1.19.tar.gz",
            "has_sig": false,
            "md5_digest": "c70bfa995c63c2b1c22ddda36a34aaaf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7,<4.0",
            "size": 48262,
            "upload_time": "2021-05-09T18:04:37",
            "upload_time_iso_8601": "2021-05-09T18:04:37.731871Z",
            "url": "https://files.pythonhosted.org/packages/ea/68/07785e3a4178cd53aa0fece70ddbf7951224bae55c0c50f362ba00dc7da6/slakh-dataset-0.1.19.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2021-05-09 18:04:37",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "slakh-dataset"
}
        