zerospeech-libriabx2

- Name: zerospeech-libriabx2
- Version: 0.9.8
- Summary: Package implementing a revamped version of the librilight-abx evaluation.
- Upload time: 2023-06-19 12:59:53
- Requires Python: >=3.8
- Keywords: speech, machine-learning, challenges, research-tool, benchmark-speech, zerospeech
- Homepage: https://zerospeech.com/
- Documentation: https://zerospeech.com/toolbox/
- Repository: https://github.com/zerospeech/libri-light-abx2
- Authors: Mark Hallap <mark.hallap@mail.utoronto.ca>, CoML Team <dev@zerospeech.com>, Nicolas Hamilakis <nicolas.hamilakis@ens.psl.eu>
            # libri-light-abx2

The ABX phonetic evaluation metric for unsupervised representation learning, as used by the ZeroSpeech challenge, now with context-type options (on-triphone, within-context, any-context). This module is a reworking of https://github.com/zerospeech/libri-light-abx, which is in turn a wrapper around https://github.com/facebookresearch/libri-light/tree/main/eval.

  
### Installation
  
You can install this module directly from pip using the following command:

`pip install zerospeech-libriabx2`

Or you can install from source by cloning this repository and running: 

`pip install .`

Alternatively, you can install it into a conda environment by running:

`conda install -c conda-forge -c pytorch -c coml zerospeech-libriabx2 pytorch::pytorch`

### Usage
#### From the command line

```
usage: zrc-abx2 [-h] [--path_checkpoint PATH_CHECKPOINT]
                [--file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}]
                [--feature_size FEATURE_SIZE] [--cuda]
                [--speaker_mode {all,within,across}]
                [--context_mode {all,within,any}]
                [--distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}]
                [--max_size_group MAX_SIZE_GROUP]
                [--max_x_across MAX_X_ACROSS] [--out OUT] [--seed SEED]
                [--pooling {none,mean,hamming}] [--seq_norm]
                [--max_size_seq MAX_SIZE_SEQ] [--strict]
                path_data path_item_file

ABX metric

positional arguments:
  path_data             Path to directory containing the submission data
  path_item_file        Path to the .item file containing the timestamps and
                        transcriptions

optional arguments:
  -h, --help            show this help message and exit
  --path_checkpoint PATH_CHECKPOINT
                        Path to a CPC checkpoint. If set, apply the model to
                        the input data to compute the features
  --file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}
  --feature_size FEATURE_SIZE
                        Size (in s) of one feature
  --cuda                Use the GPU to compute distances
  --speaker_mode {all,within,across}
                        Choose the speaker mode of the ABX score to compute
  --context_mode {all,within,any}
                        Choose the context mode of the ABX score to compute
  --distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}
                        Choose the kind of distance to use to compute the ABX
                        score.
  --max_size_group MAX_SIZE_GROUP
                        Max size of a group while computing the ABX score. A
                        small value will make the code faster but less
                        precise.
  --max_x_across MAX_X_ACROSS
                        When computing the ABX across score, maximum number of
                        speaker X to sample per couple A,B. A small value will
                        make the code faster but less precise.
  --out OUT             Path where the results should be saved
  --seed SEED           Seed to use in random sampling.
  --pooling {none,mean,hamming}
                        Type of pooling over frame representations of items.
  --seq_norm            Used for CPC features only. If activated, normalize
                        each batch of feature across the time channel before
                        computing ABX.
  --max_size_seq MAX_SIZE_SEQ
                        Used for CPC features only. Maximal number of frames
                        to consider when computing a batch of features.
  --strict              Used for CPC features only. If activated, each batch
                        of feature will contain exactly max_size_seq frames.
```
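
For example, a hypothetical invocation (the paths are placeholders) that computes the across-speaker, within-context score with the cosine distance could look like this:

```
zrc-abx2 /path/to/representations /path/to/file.item \
    --speaker_mode across \
    --context_mode within \
    --distance_mode cosine \
    --out /path/to/results
```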
#### Python API
You can also call the ABX evaluation from Python code, as in the following example:

```python
import zrc_abx2

args = zrc_abx2.EvalArgs(
    path_data="/location/to/representations/",
    path_item_file="/location/to/file.item",
    **other_options
)

result = zrc_abx2.EvalABX().eval_abx(args)
```
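
As a fuller sketch, here is the same call with the main evaluation options spelled out. The keyword names are assumed to mirror the CLI flags above; check them against `zrc_abx2.EvalArgs` before relying on this example.

```python
import zrc_abx2

# Hypothetical paths; the keyword names below are assumed to
# mirror the CLI flags and are not confirmed by the package docs.
args = zrc_abx2.EvalArgs(
    path_data="/location/to/representations/",
    path_item_file="/location/to/file.item",
    speaker_mode="across",   # {all, within, across}
    context_mode="within",   # {all, within, any}
    distance_mode="cosine",  # {euclidean, cosine, kl, kl_symmetric}
)

result = zrc_abx2.EvalABX().eval_abx(args)
```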

## Information on evaluation conditions
A new variable in this ABX version is context.
In the within-context condition, a, b, and x all have the same surrounding context (i.e. the same preceding and following phoneme). The any-context condition ignores the surrounding context, which typically varies.

For the within-context and any-context comparisons, use an item file that extracts phonemes (rather than XYZ triphones). For the on-triphone condition, which remains available, use an item file that extracts triphones (as in the previous ABX evaluation) and run it within-context (the default behavior of the previous ABX evaluation). any-context is not used for the on-triphone version because of the excessive noise that would be included in the representation.

As in the previous version, it is also possible to run within-speaker (a, b, and x are all from the same speaker) and across-speaker (a and b are from the same speaker, x is from another) evaluations. This yields four phoneme-based evaluation combinations: within_s-within_c, within_s-any_c, across_s-within_c, across_s-any_c; and two triphone-based combinations: within_s-within_c and across_s-within_c.
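
As an illustration, a sketch that runs all four phoneme-based combinations (again assuming `EvalArgs` accepts `speaker_mode` and `context_mode` keywords, which is not confirmed by the package docs) could look like this:

```python
import zrc_abx2

# Hypothetical sketch: iterate over the four phoneme-based
# speaker/context combinations; the paths are placeholders.
evaluator = zrc_abx2.EvalABX()
for speaker_mode in ("within", "across"):
    for context_mode in ("within", "any"):
        args = zrc_abx2.EvalArgs(
            path_data="/location/to/representations/",
            path_item_file="/location/to/phonemes.item",
            speaker_mode=speaker_mode,
            context_mode=context_mode,
        )
        print(speaker_mode, context_mode, evaluator.eval_abx(args))
```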


            
