tokeniser

Name: tokeniser
Version: 0.0.3
Home page: https://github.com/chigwell/tokeniser
Summary: None
Upload time: 2024-04-15 09:46:50
Maintainer: None
Docs URL: None
Author: Eugene Evstafev
Requires Python: >=3.6
License: None
Keywords: None
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.

[![PyPI version](https://badge.fury.io/py/tokeniser.svg)](https://badge.fury.io/py/tokeniser)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/badge/tokeniser)](https://pepy.tech/project/tokeniser)

# Tokeniser

`Tokeniser` is a lightweight Python package designed for simple and efficient token counting in text. It uses regular expressions to identify tokens, providing a straightforward approach to tokenization without relying on complex NLP models.
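The exact pattern is not documented here, but a regex-based count along these lines can be sketched in a few lines; the pattern and helper name below are illustrative assumptions, not the package's actual implementation:

```python
import re

def estimate_tokens_sketch(text: str) -> int:
    """Illustrative regex-based token count; not the real `tokeniser` logic."""
    # Count runs of word characters plus individual punctuation symbols.
    return len(re.findall(r"\w+|[^\w\s]", text))

print(estimate_tokens_sketch("Hello, World!"))  # 4: "Hello", ",", "World", "!"
```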

## Installation

To install `Tokeniser`, you can use pip:

```bash
pip install tokeniser
```

## Usage

`Tokeniser` is easy to use in your Python scripts. Here's a basic example:

```python
import tokeniser

text = "Hello, World!"
token_count = tokeniser.estimate_tokens(text)
print(f"Number of tokens: {token_count}")
```

This package is ideal for scenarios where a simple token count is needed, without the overhead of more complex NLP tools.
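Since `estimate_tokens` accepts a plain string, it can just as easily be mapped over a collection of texts:

```python
import tokeniser

texts = [
    "Hello, World!",
    "Token counting without heavyweight NLP dependencies.",
]

# Estimate the token count for each text individually.
for text in texts:
    print(f"{text!r} -> {tokeniser.estimate_tokens(text)} tokens")
```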

## Features

- Simple and efficient token counting using regular expressions.
- Lightweight with no dependencies on large NLP models or frameworks.
- Versatile for use in various text processing tasks.

## Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the [issues page](https://github.com/chigwell/tokeniser/issues).

## License

This project is licensed under the [MIT License](https://choosealicense.com/licenses/mit/).

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/chigwell/tokeniser",
    "name": "tokeniser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "Eugene Evstafev",
    "author_email": "chigwel@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/07/ea/1548d27059f09987588d2a9dcf3fe75a9f058e59215d07a257ebc5d4386f/tokeniser-0.0.3.tar.gz",
    "platform": null,
    "description": "[![PyPI version](https://badge.fury.io/py/tokeniser.svg)](https://badge.fury.io/py/tokeniser)\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n[![Downloads](https://static.pepy.tech/badge/tokeniser)](https://pepy.tech/project/tokeniser)\n\n# Tokeniser\n\n`Tokeniser` is a lightweight Python package designed for simple and efficient token counting in text. It uses regular expressions to identify tokens, providing a straightforward approach to tokenization without relying on complex NLP models.\n\n## Installation\n\nTo install `Tokeniser`, you can use pip:\n\n```bash\npip install tokeniser\n```\n\n## Usage\n\n`Tokeniser` is easy to use in your Python scripts. Here's a basic example:\n\n```python\nimport tokeniser\n\ntext = \"Hello, World!\"\ntoken_count = tokeniser.estimate_tokens(text)\nprint(f\"Number of tokens: {token_count}\")\n```\n\nThis package is ideal for scenarios where a simple token count is needed, without the overhead of more complex NLP tools.\n\n## Features\n\n- Simple and efficient token counting using regular expressions.\n- Lightweight with no dependencies on large NLP models or frameworks.\n- Versatile for use in various text processing tasks.\n\n## Contributing\n\nContributions, issues, and feature requests are welcome! Feel free to check the [issues page](https://github.com/chigwell/tokeniser/issues).\n\n## License\n\nThis project is licensed under the [MIT License](https://choosealicense.com/licenses/mit/).\n",
    "bugtrack_url": null,
    "license": null,
    "summary": null,
    "version": "0.0.3",
    "project_urls": {
        "Homepage": "https://github.com/chigwell/tokeniser"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "240d8777b5942fb608ac3b6a81427658d7336667e61656e927f62ad9ca800518",
                "md5": "f2ce4884c26af5da24f8c4cdaf6a006c",
                "sha256": "7940ab3b2a02b8b02307805c2cc53cf7c591fd9b106c963ad017349cf65330f0"
            },
            "downloads": -1,
            "filename": "tokeniser-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f2ce4884c26af5da24f8c4cdaf6a006c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 3152,
            "upload_time": "2024-04-15T09:46:47",
            "upload_time_iso_8601": "2024-04-15T09:46:47.388403Z",
            "url": "https://files.pythonhosted.org/packages/24/0d/8777b5942fb608ac3b6a81427658d7336667e61656e927f62ad9ca800518/tokeniser-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "07ea1548d27059f09987588d2a9dcf3fe75a9f058e59215d07a257ebc5d4386f",
                "md5": "fdde3f89d5b3f6cb15fd5df9d17f9734",
                "sha256": "5d3160809f4ea9288b93aeff67fe0f22bccc63fd729173df591a5b8b65543c95"
            },
            "downloads": -1,
            "filename": "tokeniser-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "fdde3f89d5b3f6cb15fd5df9d17f9734",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 2803,
            "upload_time": "2024-04-15T09:46:50",
            "upload_time_iso_8601": "2024-04-15T09:46:50.339533Z",
            "url": "https://files.pythonhosted.org/packages/07/ea/1548d27059f09987588d2a9dcf3fe75a9f058e59215d07a257ebc5d4386f/tokeniser-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-15 09:46:50",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "chigwell",
    "github_project": "tokeniser",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "tokeniser"
}
        