# libri-light-abx2
The ABX phonetic evaluation metric for unsupervised representation learning as used by the ZeroSpeech challenge, now with context-type options (on-triphone, within-context, any-context). This module is a reworking of https://github.com/zerospeech/libri-light-abx, which in turn is a wrapper around https://github.com/facebookresearch/libri-light/tree/main/eval
### Installation
You can install this module directly from pip with the following command:
`pip install zerospeech-libriabx2`
Or you can install from source by cloning this repository and running:
`pip install .`
Finally, you can install it into a conda environment by running:
`conda install -c conda-forge -c pytorch -c coml zerospeech-libriabx2 pytorch::pytorch`
### Usage
### From command line
```
usage: zrc-abx2 [-h] [--path_checkpoint PATH_CHECKPOINT]
                [--file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}]
                [--feature_size FEATURE_SIZE] [--cuda]
                [--speaker_mode {all,within,across}]
                [--context_mode {all,within,any}]
                [--distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}]
                [--max_size_group MAX_SIZE_GROUP]
                [--max_x_across MAX_X_ACROSS] [--out OUT] [--seed SEED]
                [--pooling {none,mean,hamming}] [--seq_norm]
                [--max_size_seq MAX_SIZE_SEQ] [--strict]
                path_data path_item_file

ABX metric

positional arguments:
  path_data             Path to directory containing the submission data
  path_item_file        Path to the .item file containing the timestamps and
                        transcriptions

optional arguments:
  -h, --help            show this help message and exit
  --path_checkpoint PATH_CHECKPOINT
                        Path to a CPC checkpoint. If set, apply the model to
                        the input data to compute the features
  --file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}
  --feature_size FEATURE_SIZE
                        Size (in s) of one feature
  --cuda                Use the GPU to compute distances
  --speaker_mode {all,within,across}
                        Choose the speaker mode of the ABX score to compute
  --context_mode {all,within,any}
                        Choose the context mode of the ABX score to compute
  --distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}
                        Choose the kind of distance to use to compute the ABX
                        score.
  --max_size_group MAX_SIZE_GROUP
                        Max size of a group while computing the ABX score. A
                        small value will make the code faster but less
                        precise.
  --max_x_across MAX_X_ACROSS
                        When computing the ABX across score, maximum number
                        of speaker X to sample per couple A,B. A small value
                        will make the code faster but less precise.
  --out OUT             Path where the results should be saved
  --seed SEED           Seed to use in random sampling.
  --pooling {none,mean,hamming}
                        Type of pooling over frame representations of items.
  --seq_norm            Used for CPC features only. If activated, normalize
                        each batch of feature across the time channel before
                        computing ABX.
  --max_size_seq MAX_SIZE_SEQ
                        Used for CPC features only. Maximal number of frames
                        to consider when computing a batch of features.
  --strict              Used for CPC features only. If activated, each batch
                        of feature will contain exactly max_size_seq frames.
```
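For example, an across-speaker, within-context evaluation on pre-computed `.npy` features might be invoked as follows (the paths are placeholders for your own representations, `.item` file, and output directory):

```
zrc-abx2 /path/to/representations /path/to/file.item \
    --file_extension .npy \
    --speaker_mode across --context_mode within \
    --distance_mode cosine \
    --out /path/to/results
```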
### Python API
You can also call the ABX evaluation from Python code, for example:
```
import zrc_abx2

args = zrc_abx2.EvalArgs(
    path_data="/location/to/representations/",
    path_item_file="/location/to/file.item",
    **other_options
)

result = zrc_abx2.EvalABX().eval_abx(args)
```
## Information on evaluation conditions
A new variable in this ABX version is context.
In the within-context condition, a, b, and x all have the same surrounding context (i.e. the same preceding and following phoneme). In the any-context condition, the surrounding context is ignored and typically varies across the three items.
For the within-context and any-context comparisons, use an item file that extracts phonemes (rather than XYZ triphones). For the on-triphone condition, which is still available, use an item file that extracts triphones (as in the previous ABX evaluation) and run it within-context (the default behavior of the previous ABX evaluation). Any-context is not used for the on-triphone version because of the excessive noise it would include in the representation.
As in the previous version, it is also possible to run within-speaker (a, b, and x are all from the same speaker) and across-speaker (a and b are from the same speaker, x is from another) evaluations. So there are four phoneme-based evaluation combinations in total: within_s-within_c, within_s-any_c, across_s-within_c, across_s-any_c; and two triphone-based combinations: within_s-within_c and across_s-within_c.
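As a minimal sketch of running the four phoneme-based combinations from Python, assuming `EvalArgs` accepts `speaker_mode` and `context_mode` keyword fields that mirror the CLI flags of the same names (an assumption; check the `EvalArgs` signature of your installed version):

```
import zrc_abx2

# Assumed keyword names, mirroring the --speaker_mode and --context_mode CLI flags.
combinations = [
    ("within", "within"),   # within_s-within_c
    ("within", "any"),      # within_s-any_c
    ("across", "within"),   # across_s-within_c
    ("across", "any"),      # across_s-any_c
]

results = {}
for speaker_mode, context_mode in combinations:
    args = zrc_abx2.EvalArgs(
        path_data="/location/to/representations/",  # placeholder path
        path_item_file="/location/to/file.item",    # placeholder phoneme .item file
        speaker_mode=speaker_mode,
        context_mode=context_mode,
    )
    results[(speaker_mode, context_mode)] = zrc_abx2.EvalABX().eval_abx(args)
```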