Name | symurbench |
Version | 1.0.0 |
home_page | None |
Summary | SyMuRBench: Benchmark for symbolic music representations |
upload_time | 2025-08-15 16:52:26 |
maintainer | None |
docs_url | None |
author | Peter Strepetov, Dmitrii Kovalev |
requires_python | >=3.10.0 |
license | MIT License
Copyright (c) 2025 Petr Strepetov and Dmitrii Kovalev
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE. |
keywords | artificial intelligence, midi, mir, music |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
<p align="center">
<img width="300" src="docs/assets/logo.jpg"/>
</p>
<h1 align="center"><i>SyMuRBench</i></h1>
<p align="center"><i>Benchmark for Symbolic Music Representations</i></p>
[](https://pypi.python.org/pypi/symurbench/)
[](https://github.com/Mintas/SyMuRBench/blob/main/LICENSE)
## 1. Overview
SyMuRBench is a versatile benchmark designed to compare vector representations of symbolic music. We provide standardized test splits from well-known datasets and strongly encourage authors to **exclude files from these splits** when training models to ensure fair evaluation. Additionally, we introduce a novel **score-performance retrieval task** to evaluate the alignment between symbolic scores and their performed versions.
## 2. Tasks Description
| Task Name | Source Dataset | Task Type | # of Classes | # of Files | Default Metrics |
|-------------------------------|--------------|--------------------------|-------------|------------------|--------------------------------------------------|
| ComposerClassificationASAP | ASAP | Multiclass Classification | 7 | 197 | Weighted F1 Score, Balanced Accuracy |
| GenreClassificationMMD | MetaMIDI | Multiclass Classification | 7 | 2,795 | Weighted F1 Score, Balanced Accuracy |
| GenreClassificationWMTX | WikiMT-X | Multiclass Classification | 8 | 985 | Weighted F1 Score, Balanced Accuracy |
| EmotionClassificationEMOPIA | Emopia | Multiclass Classification | 4 | 191 | Weighted F1 Score, Balanced Accuracy |
| EmotionClassificationMIREX | MIREX | Multiclass Classification | 5 | 163 | Weighted F1 Score, Balanced Accuracy |
| InstrumentDetectionMMD | MetaMIDI | Multilabel Classification | 128 | 4,675 | Weighted F1 Score |
| ScorePerformanceRetrievalASAP | ASAP | Retrieval | - | 438 (219 pairs) | R@1, R@5, R@10, Median Rank |
> **Note**: "ScorePerformanceRetrievalASAP" evaluates how well a model retrieves the correct performed version given a symbolic score (and vice versa), using paired score-performance MIDI files.
---
## 3. Baseline Features
As baselines, we provide precomputed features from [**music21**](https://github.com/cuthbertLab/music21) and [**jSymbolic2**](https://github.com/DDMAL/jSymbolic2). A `FeatureExtractor` for music21 is available in `src/symurbench/music21_extractor.py`.
---
## 4. Installation
Install the package via pip:
```bash
pip install symurbench
```
Then download the datasets and (optionally) precomputed features:
```python
from symurbench.utils import load_datasets
output_folder = "symurbench_data" # Absolute or relative path to save data
load_datasets(
output_folder=output_folder,
load_features=True # Downloads precomputed music21 & jSymbolic features
)
```
---
## 5. Usage Examples
**Example 1: Using Precomputed Features**
Run the benchmark on specific tasks using cached music21 and jSymbolic features.
```python
from symurbench.benchmark import Benchmark
from symurbench.feature_extractor import PersistentFeatureExtractor
path_to_music21_features = "symurbench_data/features/music21_full_dataset.parquet"
path_to_jsymbolic_features = "symurbench_data/features/jsymbolic_full_dataset.parquet"
m21_pfe = PersistentFeatureExtractor(
persistence_path=path_to_music21_features,
use_cached=True,
name="music21"
)
jsymb_pfe = PersistentFeatureExtractor(
persistence_path=path_to_jsymbolic_features,
use_cached=True,
name="jSymbolic"
)
benchmark = Benchmark(
feature_extractors_list=[m21_pfe, jsymb_pfe],
    tasks=[  # Optional: if no tasks are specified, the benchmark runs all of them.
"ComposerClassificationASAP",
"ScorePerformanceRetrievalASAP"
]
)
benchmark.run_all_tasks()
benchmark.display_result(return_ci=True, alpha=0.05)
```
> **Tip**: If `tasks` is omitted, all available tasks are run by default.
*Output Example*

**Example 2: Using a Configuration Dictionary**
Run the benchmark with custom dataset paths and a custom AutoML configuration.
```python
from symurbench.benchmark import Benchmark
from symurbench.music21_extractor import Music21Extractor
from symurbench.constant import DEFAULT_LAML_CONFIG_PATHS # dict with paths to AutoML configs
multiclass_task_automl_cfg_path = DEFAULT_LAML_CONFIG_PATHS["multiclass"]
print(f"AutoML config path: {multiclass_task_automl_cfg_path}")
config = {
"ComposerClassificationASAP": {
"metadata_csv_path":"symurbench_data/datasets/composer_and_retrieval_datasets/metadata_composer_dataset.csv",
"files_dir_path":"symurbench_data/datasets/composer_and_retrieval_datasets/",
"automl_config_path":multiclass_task_automl_cfg_path
}
}
m21_fe = Music21Extractor()
benchmark = Benchmark.init_from_config(
feature_extractors_list=[m21_fe],
tasks_config=config
)
benchmark.run_all_tasks()
benchmark.display_result()
```
**Example 3: Using a YAML Configuration File**
Load task configurations from a YAML file (e.g., dataset paths, AutoML config paths).
```python
from symurbench.benchmark import Benchmark
from symurbench.music21_extractor import Music21Extractor
from symurbench.constant import DATASETS_CONFIG_PATH # path to config with datasets paths
print(f"Datasets config path: {DATASETS_CONFIG_PATH}")
m21_fe = Music21Extractor()
benchmark = Benchmark.init_from_config_file(
feature_extractors_list=[m21_fe],
tasks_config_path=DATASETS_CONFIG_PATH
)
benchmark.run_all_tasks()
benchmark.display_result()
```
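
If you prefer to maintain your own task configuration file, one option is to dump the same dictionary structure used in Example 2 to YAML and point `init_from_config_file` at it. The sketch below assumes the YAML file simply mirrors that dictionary and that PyYAML is installed; `my_tasks.yaml` is a hypothetical path, and the bundled file at `DATASETS_CONFIG_PATH` remains the authoritative reference for the schema.

```python
import yaml  # PyYAML; assumed to be available in your environment

from symurbench.benchmark import Benchmark
from symurbench.music21_extractor import Music21Extractor

# Hypothetical custom config mirroring the dictionary from Example 2.
config = {
    "ComposerClassificationASAP": {
        "metadata_csv_path": "symurbench_data/datasets/composer_and_retrieval_datasets/metadata_composer_dataset.csv",
        "files_dir_path": "symurbench_data/datasets/composer_and_retrieval_datasets/",
    }
}

with open("my_tasks.yaml", "w") as f:
    yaml.safe_dump(config, f)

benchmark = Benchmark.init_from_config_file(
    feature_extractors_list=[Music21Extractor()],
    tasks_config_path="my_tasks.yaml",
)
benchmark.run_all_tasks()
benchmark.display_result()
```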
**Example 4: Saving Results to CSV**
Run the benchmark and export the results to a CSV file using pandas.
```python
from symurbench.benchmark import Benchmark
from symurbench.feature_extractor import PersistentFeatureExtractor
from symurbench.music21_extractor import Music21Extractor
path_to_music21_features = "symurbench_data/features/music21_features.parquet"
m21_pfe = PersistentFeatureExtractor(
feature_extractor=Music21Extractor(),
persistence_path=path_to_music21_features,
use_cached=False,
name="music21"
)
benchmark = Benchmark(
feature_extractors_list=[m21_pfe],
tasks=[
"ComposerClassificationASAP",
"ScorePerformanceRetrievalASAP"
]
)
benchmark.run_all_tasks()
results_df = benchmark.get_result_df(round_num=3, return_ci=True)
results_df.to_csv("results.csv")
```
> **💡 Tip**: `round_num=3` rounds metrics to 3 decimal places; `return_ci=True` includes confidence intervals in the output.
## 6. Notes & Best Practices
- 🔒 **Avoid data leakage**: Do not include test-set files in your training data to ensure fair and valid evaluation.
- 🔄 **Reproducibility**: Use fixed random seeds and consistent preprocessing pipelines to make experiments reproducible.
- 📁 **File paths**: Ensure that paths in config files are correct and accessible.
- 🧪 **Custom extractors**: You can implement your own extractor by subclassing the base `FeatureExtractor` class and implementing the `extract` method, as sketched after this list.
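
A minimal sketch of such a custom extractor follows. The base class location and the `extract` method name follow the examples above; the exact signature (a list of MIDI file paths in, a `pandas.DataFrame` of per-file features out) and the `mido` dependency are assumptions — check `src/symurbench/feature_extractor.py` for the real interface.

```python
import pandas as pd

# Base class location assumed from the PersistentFeatureExtractor import in Example 1.
from symurbench.feature_extractor import FeatureExtractor


class NoteCountExtractor(FeatureExtractor):
    """Toy extractor with a single feature: the number of note-on events per file."""

    def extract(self, file_paths):  # signature assumed, see note above
        import mido  # hypothetical MIDI-parsing dependency, not shipped with symurbench

        counts = []
        for path in file_paths:
            midi = mido.MidiFile(path)
            counts.append(
                sum(1 for msg in midi if msg.type == "note_on" and msg.velocity > 0)
            )
        return pd.DataFrame({"note_count": counts})
```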
## 7. Citation
If you use SyMuRBench in your research, please cite:
```bibtex
@inproceedings{symurbench2025,
author = {Petr Strepetov and Dmitrii Kovalev},
title = {SyMuRBench: Benchmark for Symbolic Music Representations},
booktitle = {Proceedings of the 3rd International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice (McGE '25)},
year = {2025},
pages = {9},
publisher = {ACM},
address = {Dublin, Ireland},
doi = {10.1145/3746278.3759392}
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "symurbench",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10.0",
"maintainer_email": null,
"keywords": "artificial intelligence, midi, mir, music",
"author": "Peter Strepetov, Dmitrii Kovalev",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/64/39/021b631b349d5ab3b3395bada2f037387779fdd69367f664341428cf5a58/symurbench-1.0.0.tar.gz",
"platform": null,
"description": "<p align=\"center\">\n <img width=\"300\" src=\"docs/assets/logo.jpg\"/>\n</p>\n\n<h1 align=\"center\"><i>SyMuRBench</i></h1>\n<p align=\"center\"><i>Benchmark for Symbolic Music Representations</i></p>\n\n[](https://pypi.python.org/pypi/symurbench/)\n[](https://github.com/Mintas/SyMuRBench/blob/main/LICENSE)\n\n## 1. Overview\n\nSyMuRBench is a versatile benchmark designed to compare vector representations of symbolic music. We provide standardized test splits from well-known datasets and strongly encourage authors to **exclude files from these splits** when training models to ensure fair evaluation. Additionally, we introduce a novel **score-performance retrieval task** to evaluate the alignment between symbolic scores and their performed versions.\n\n## 2. Tasks Description\n\n| Task Name | Source Dataset | Task Type | # of Classes | # of Files | Default Metrics |\n|-------------------------------|--------------|--------------------------|-------------|------------------|--------------------------------------------------|\n| ComposerClassificationASAP | ASAP | Multiclass Classification | 7 | 197 | Weighted F1 Score, Balanced Accuracy |\n| GenreClassificationMMD | MetaMIDI | Multiclass Classification | 7 | 2,795 | Weighted F1 Score, Balanced Accuracy |\n| GenreClassificationWMTX | WikiMT-X | Multiclass Classification | 8 | 985 | Weighted F1 Score, Balanced Accuracy |\n| EmotionClassificationEMOPIA | Emopia | Multiclass Classification | 4 | 191 | Weighted F1 Score, Balanced Accuracy |\n| EmotionClassificationMIREX | MIREX | Multiclass Classification | 5 | 163 | Weighted F1 Score, Balanced Accuracy |\n| InstrumentDetectionMMD | MetaMIDI | Multilabel Classification | 128 | 4,675 | Weighted F1 Score |\n| ScorePerformanceRetrievalASAP | ASAP | Retrieval | - | 438 (219 pairs) | R@1, R@5, R@10, Median Rank |\n\n> **Note**: \"ScorePerformanceRetrievalASAP\" evaluates how well a model retrieves the correct performed version given a symbolic score (and vice versa), using paired score-performance MIDI files.\n\n---\n\n## 3. Baseline Features\n\nAs baselines, we provide precomputed features from [**music21**](https://github.com/cuthbertLab/music21) and [**jSymbolic2**](https://github.com/DDMAL/jSymbolic2). A `FeatureExtractor` for music21 is available in `src/symurbench/music21_extractor.py`.\n\n---\n\n## 4. Installation\n\nInstall the package via pip:\n\n```bash\npip install symurbench\n```\n\nThen download the datasets and (optionally) precomputed features:\n\n```python\nfrom symurbench.utils import load_datasets\n\noutput_folder = \"symurbench_data\" # Absolute or relative path to save data\nload_datasets(\n output_folder=output_folder,\n load_features=True # Downloads precomputed music21 & jSymbolic features\n)\n```\n\n---\n\n## 4. 
Usage Examples.\n\n**Example 1: Using Precomputed Features**\n\nRun benchmark on specific tasks using cached music21 and jSymbolic features.\n\n```python\nfrom symurbench.benchmark import Benchmark\nfrom symurbench.feature_extractor import PersistentFeatureExtractor\n\npath_to_music21_features = \"symurbench_data/features/music21_full_dataset.parquet\"\npath_to_jsymbolic_features = \"symurbench_data/features/jsymbolic_full_dataset.parquet\"\n\nm21_pfe = PersistentFeatureExtractor(\n persistence_path=path_to_music21_features,\n use_cached=True,\n name=\"music21\"\n)\njsymb_pfe = PersistentFeatureExtractor(\n persistence_path=path_to_jsymbolic_features,\n use_cached=True,\n name=\"jSymbolic\"\n)\n\nbenchmark = Benchmark(\n feature_extractors_list=[m21_pfe, jsymb_pfe],\n tasks=[ # By default, if no specific tasks are specified, the benchmark will run all tasks.\n \"ComposerClassificationASAP\",\n \"ScorePerformanceRetrievalASAP\"\n ]\n)\n\nbenchmark.run_all_tasks()\nbenchmark.display_result(return_ci=True, alpha=0.05)\n```\n\n> **Tip**: If tasks is omitted, all available tasks will be run by default.\n\n*Output Example*\n\n\n\n\n**Example 2: Using a Configuration Dictionary**\n\nRun benchmark with custom dataset paths and AutoML configuration.\n\n```python\nfrom symurbench.benchmark import Benchmark\nfrom symurbench.music21_extractor import Music21Extractor\nfrom symurbench.constant import DEFAULT_LAML_CONFIG_PATHS # dict with paths to AutoML configs\n\nmulticlass_task_automl_cfg_path = DEFAULT_LAML_CONFIG_PATHS[\"multiclass\"]\nprint(f\"AutoML config path: {multiclass_task_automl_cfg_path}\")\n\nconfig = {\n \"ComposerClassificationASAP\": {\n \"metadata_csv_path\":\"symurbench_data/datasets/composer_and_retrieval_datasets/metadata_composer_dataset.csv\",\n \"files_dir_path\":\"symurbench_data/datasets/composer_and_retrieval_datasets/\",\n \"automl_config_path\":multiclass_task_automl_cfg_path\n }\n}\n\nm21_fe = Music21Extractor()\n\nbenchmark = Benchmark.init_from_config(\n feature_extractors_list=[m21_fe],\n tasks_config=config\n)\nbenchmark.run_all_tasks()\nbenchmark.display_result()\n```\n\n**Example 3: Using a YAML Configuration File**\n\nLoad task configurations from a YAML file (e.g., dataset paths, AutoML config paths).\n\n```python\nfrom symurbench.benchmark import Benchmark\nfrom symurbench.music21_extractor import Music21Extractor\nfrom symurbench.constant import DATASETS_CONFIG_PATH # path to config with datasets paths\n\nprint(f\"Datasets config path: {DATASETS_CONFIG_PATH}\")\n\nm21_fe = Music21Extractor()\n\nbenchmark = Benchmark.init_from_config_file(\n feature_extractors_list=[m21_fe],\n tasks_config_path=DATASETS_CONFIG_PATH\n)\nbenchmark.run_all_tasks()\nbenchmark.display_result()\n```\n\n**Example 4: Saving Results to CSV**\n\nRun benchmark and export results to a CSV file using pandas.\n\n```python\n\nfrom symurbench.benchmark import Benchmark\nfrom symurbench.music21_extractor import Music21Extractor\n\npath_to_music21_features = \"symurbench_data/features/music21_features.parquet\"\n\nm21_pfe = PersistentFeatureExtractor(\n feature_extractor=Music21Extractor(),\n persistence_path=path_to_music21_features,\n use_cached=False,\n name=\"music21\"\n)\n\nbenchmark = Benchmark(\n feature_extractors_list=[m21_pfe],\n tasks=[\n \"ComposerClassificationASAP\",\n \"ScorePerformanceRetrievalASAP\"\n ]\n)\nbenchmark.run_all_tasks()\nresults_df = benchmark.get_result_df(round_num=3, return_ci=True)\nresults_df.to_csv(\"results.csv\")\n```\n\n> **\ud83d\udca1**: `round_num=3`: Round 
metrics to 3 decimal places.\n`return_ci=True`: Include confidence intervals in the output.\n\n## 6. Notes & Best Practices\n\n- \ud83d\udd12 **Avoid data leakage**: Do not include test-set files in your training data to ensure fair and valid evaluation.\n- \ud83d\udd04 **Reproducibility**: Use fixed random seeds and consistent preprocessing pipelines to make experiments reproducible.\n- \ud83d\udcc1 **File paths**: Ensure paths in config files are correct and accessible.\n- \ud83e\uddea **Custom extractors**: You can implement your own `FeatureExtractor` subclass by inheriting from the base `FeatureExtractor` class and implementing the `extract` method.\n\n## 7. Citation\n\nIf you use SyMuRBench in your research, please cite:\n\n```bibtex\n@inproceedings{symurbench2025,\n author = {Petr Strepetov and Dmitrii Kovalev},\n title = {SyMuRBench: Benchmark for Symbolic Music Representations},\n booktitle = {Proceedings of the 3rd International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice (McGE '25)},\n year = {2025},\n pages = {9},\n publisher = {ACM},\n address = {Dublin, Ireland},\n doi = {10.1145/3746278.3759392}\n}\n```\n",
"bugtrack_url": null,
"license": "MIT License\n \n Copyright (c) 2025 Petr Strepetov and Dmitrii Kovalev\n \n Permission is hereby granted, free of charge, to any person obtaining a copy\n of this software and associated documentation files (the \"Software\"), to deal\n in the Software without restriction, including without limitation the rights\n to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n copies of the Software, and to permit persons to whom the Software is\n furnished to do so, subject to the following conditions:\n \n The above copyright notice and this permission notice shall be included in all\n copies or substantial portions of the Software.\n \n THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n SOFTWARE.",
"summary": "SyMuRBench: Benchmark for symbolic music representations",
"version": "1.0.0",
"project_urls": null,
"split_keywords": [
"artificial intelligence",
" midi",
" mir",
" music"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "75741daa9bac0bfd5391f6421b113fefe1b1432a29b24341a2ae76f7fd2d156c",
"md5": "e9249010ad1d84b9fad55fc8e1041dc1",
"sha256": "72864cc5394b7b75d453733b34c32a9c3721d831d6874cd76af7a652943fcadf"
},
"downloads": -1,
"filename": "symurbench-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e9249010ad1d84b9fad55fc8e1041dc1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10.0",
"size": 48003,
"upload_time": "2025-08-15T16:52:24",
"upload_time_iso_8601": "2025-08-15T16:52:24.330304Z",
"url": "https://files.pythonhosted.org/packages/75/74/1daa9bac0bfd5391f6421b113fefe1b1432a29b24341a2ae76f7fd2d156c/symurbench-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "6439021b631b349d5ab3b3395bada2f037387779fdd69367f664341428cf5a58",
"md5": "649ec35193061d873ec38dc3af19bcd2",
"sha256": "c06e8ac568877c18ac1821ec87256a26c066c1d7804ae1365f00117e0d510165"
},
"downloads": -1,
"filename": "symurbench-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "649ec35193061d873ec38dc3af19bcd2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10.0",
"size": 199265,
"upload_time": "2025-08-15T16:52:26",
"upload_time_iso_8601": "2025-08-15T16:52:26.044558Z",
"url": "https://files.pythonhosted.org/packages/64/39/021b631b349d5ab3b3395bada2f037387779fdd69367f664341428cf5a58/symurbench-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-15 16:52:26",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "symurbench"
}