# txt-utils
[![PyPI](https://img.shields.io/pypi/v/txt-utils.svg)](https://pypi.python.org/pypi/txt-utils)
[![PyPI](https://img.shields.io/pypi/pyversions/txt-utils.svg)](https://pypi.python.org/pypi/txt-utils)
[![MIT](https://img.shields.io/github/license/stefantaubert/txt-utils.svg)](https://github.com/stefantaubert/txt-utils/blob/master/LICENSE)
[![PyPI](https://img.shields.io/pypi/wheel/txt-utils.svg)](https://pypi.python.org/pypi/txt-utils)
[![PyPI](https://img.shields.io/pypi/implementation/txt-utils.svg)](https://pypi.python.org/pypi/txt-utils)
[![PyPI](https://img.shields.io/github/commits-since/stefantaubert/txt-utils/latest/master.svg)](https://github.com/stefantaubert/txt-utils/compare/v0.0.3...master)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10571273.svg)](https://doi.org/10.5281/zenodo.10571273)

A command-line interface (CLI) for modifying text files.

## Features
- `merge`: merge multiple text files into one
- `extract-vocabulary`: extract unit vocabulary
- `transcribe`: transcribe units
- `replace`: replace text
- `replace-line`: replace text in a line
- `trim-units`: trim units
- `remove-units`: remove units
- `create-unit-occurrence-stats`: create unit occurrence statistics
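
To make the terms *unit vocabulary* and *unit occurrence statistics* concrete, here is a minimal Python sketch. It is not the implementation used by `txt-utils`. It assumes that a "unit" is a piece of a line delimited by a separator (a space here), and the function names `split_units`, `extract_vocabulary`, and `unit_occurrence_stats` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def split_units(line: str, sep: str = " ") -> List[str]:
    """Split one line into units (assumption: units are separated by `sep`)."""
    # Repeated separators produce empty strings, which are dropped here.
    return [unit for unit in line.split(sep) if unit]


def extract_vocabulary(lines: List[str], sep: str = " ") -> Set[str]:
    """Return the set of distinct units that occur in any line."""
    return {unit for line in lines for unit in split_units(line, sep)}


def unit_occurrence_stats(lines: List[str], sep: str = " ") -> Dict[str, int]:
    """Count how often each unit occurs across all lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(split_units(line, sep))
    return dict(counts)


lines = ["a b a", "b c"]
print(sorted(extract_vocabulary(lines)))  # ['a', 'b', 'c']
print(unit_occurrence_stats(lines))       # {'a': 2, 'b': 2, 'c': 1}
```

The actual commands may treat separators, whitespace, and empty units differently; check the CLI's own help output for the supported options.
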
## Roadmap
- create n-grams
- map units
- merge units right/left
- calculate units TF-IDF
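
For readers unfamiliar with the term, the sketch below illustrates what creating n-grams from a sequence of units means. It only illustrates the concept and does not describe how the planned feature will behave; the function name `create_ngrams` is hypothetical.

```python
from typing import List, Tuple


def create_ngrams(units: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous runs of `n` units, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty list.
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


bigrams = create_ngrams("the cat sat".split(" "), 2)
print(bigrams)  # [('the', 'cat'), ('cat', 'sat')]
```
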
## Installation
```sh
pip install txt-utils --user
```

Requires Python 3.8 to 3.12.
## Usage
```sh
txt-utils-cli
```
## Citation
To cite this repository, use the BibTeX entry generated by GitHub (see *About => Cite this repository*) or the following reference:
```txt
Taubert, S. (2024). txt-utils (Version 0.0.3) [Computer software]. https://doi.org/10.5281/zenodo.10571273
```
## Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410.