gpt3-tokenizer

Name: gpt3-tokenizer
Version: 0.1.4
Home page: https://github.com/alisonjf/gpt3-tokenizer
Summary: Encoder/decoder and token counter for GPT-3
Upload time: 2023-05-16 00:50:21
Author: Alison Ferrenha
Requires Python: >=2.7
License: MIT
Keywords: openai, gpt, gpt-3, gpt3, gpt4, gpt-4, tokenizer
Requirements: none recorded
gpt3_tokenizer
==============
| An `OpenAI`_ GPT-3 helper library for encoding/decoding strings and counting tokens.
| Token counts match the output of OpenAI's `tokenizer`_.
|
| Tested with Python **2.7.12**, **2.7.18**, and all **3.x** versions.

Installing
--------------
.. code-block:: bash

    pip install gpt3_tokenizer

    
Examples
---------------------

**Encoding/decoding a string**

.. code-block:: python

    import gpt3_tokenizer

    a_string = "That's my beautiful and sweet string"
    encoded = gpt3_tokenizer.encode(a_string) # outputs [2504, 338, 616, 4950, 290, 6029, 4731]
    decoded = gpt3_tokenizer.decode(encoded) # outputs "That's my beautiful and sweet string"
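
The library's stated goal is parity with OpenAI's tokenizer. As an illustrative
cross-check (not part of this package), its output can be compared against the
``tiktoken`` library's ``r50k_base`` encoding, which corresponds to the original
GPT-3 base models:

.. code-block:: python

    import gpt3_tokenizer
    import tiktoken  # assumed installed separately: pip install tiktoken

    a_string = "That's my beautiful and sweet string"

    # r50k_base is the BPE vocabulary used by the original GPT-3 base models
    enc = tiktoken.get_encoding("r50k_base")

    # Both encoders should produce the same token ids and round-trip cleanly
    assert gpt3_tokenizer.encode(a_string) == enc.encode(a_string)
    assert enc.decode(gpt3_tokenizer.encode(a_string)) == a_string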

**Counting tokens**

.. code-block:: python

    import gpt3_tokenizer

    a_string = "That's my beautiful and sweet string"
    tokens_count = gpt3_tokenizer.count_tokens(a_string) # outputs 7
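
A typical use of ``count_tokens`` is checking that a prompt fits a model's
context window before making an API call. A minimal sketch, assuming the
2049-token context of the original GPT-3 models (the budget numbers are
illustrative only):

.. code-block:: python

    import gpt3_tokenizer

    MAX_CONTEXT = 2049  # context window of the original GPT-3 models

    def fits_context(prompt, completion_budget=256):
        """True if the prompt leaves room for the reserved completion tokens."""
        return gpt3_tokenizer.count_tokens(prompt) + completion_budget <= MAX_CONTEXT

    print(fits_context("That's my beautiful and sweet string"))  # True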

.. _tokenizer: https://platform.openai.com/tokenizer
.. _OpenAI: https://openai.com/
            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/alisonjf/gpt3-tokenizer",
    "name": "gpt3-tokenizer",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=2.7",
    "maintainer_email": "",
    "keywords": "openai,gpt,gpt-3,gpt3,gpt4,gpt-4,tokenizer",
    "author": "Alison Ferrenha",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/5f/cf/745fd3de6def51797ba6717f46cf27ac1871435505afb52fb778ba210361/gpt3_tokenizer-0.1.4.tar.gz",
    "platform": null,
    "description": "gpt3_tokenizer\n===============\n| An `OpenAI`_ GPT3 helper library for encoding/decoding strings and counting tokens.\n| Counting tokens gives the same output as OpenAI's `tokenizer`_\n|\n| Tested with versions: **2.7.12**, **2.7.18** and all **3.x.x** versions\n\nInstalling\n--------------\n.. code-block:: bash\n\n    pip install gpt3_tokenizer\n\n    \nExamples\n---------------------\n\n**Encoding/decoding a string**\n\n.. code-block:: python\n\n    import gpt3_tokenizer\n\n    a_string = \"That's my beautiful and sweet string\"\n    encoded = gpt3_tokenizer.encode(a_string) # outputs [2504, 338, 616, 4950, 290, 6029, 4731]\n    decoded = gpt3_tokenizer.decode(encoded) # outputs \"That's my beautiful and sweet string\"\n\n**Counting tokens**\n\n.. code-block:: python\n\n    import gpt3_tokenizer\n\n    a_string = \"That's my beautiful and sweet string\"\n    tokens_count = gpt3_tokenizer.count_tokens(a_string) # outputs 7\n\n.. _tokenizer: https://platform.openai.com/tokenizer\n.. _OpenAI: https://openai.com/",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Encoder/Decoder and tokens counter for GPT3",
    "version": "0.1.4",
    "project_urls": {
        "Homepage": "https://github.com/alisonjf/gpt3-tokenizer",
        "Repository": "https://github.com/alisonjf/gpt3-tokenizer"
    },
    "split_keywords": [
        "openai",
        "gpt",
        "gpt-3",
        "gpt3",
        "gpt4",
        "gpt-4",
        "tokenizer"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d8dc42614343a5e0c2540648c3ef7be95d1e1d221ecf038623141ec4211ba5f7",
                "md5": "9b00788b25af418d8b27d2756ceb2b3d",
                "sha256": "f7cee371f541c0aafa0c917785b305b076819e368e310db1038bc52304aae5e7"
            },
            "downloads": -1,
            "filename": "gpt3_tokenizer-0.1.4-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9b00788b25af418d8b27d2756ceb2b3d",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=2.7",
            "size": 567863,
            "upload_time": "2023-05-16T00:50:19",
            "upload_time_iso_8601": "2023-05-16T00:50:19.744198Z",
            "url": "https://files.pythonhosted.org/packages/d8/dc/42614343a5e0c2540648c3ef7be95d1e1d221ecf038623141ec4211ba5f7/gpt3_tokenizer-0.1.4-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5fcf745fd3de6def51797ba6717f46cf27ac1871435505afb52fb778ba210361",
                "md5": "2b6e4c340ee6db3f506c59f26b2781ae",
                "sha256": "630e191017db0e9413341e0595f858c866e76ba4fbb7f794839d6e3d8e3cf847"
            },
            "downloads": -1,
            "filename": "gpt3_tokenizer-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "2b6e4c340ee6db3f506c59f26b2781ae",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=2.7",
            "size": 560709,
            "upload_time": "2023-05-16T00:50:21",
            "upload_time_iso_8601": "2023-05-16T00:50:21.905022Z",
            "url": "https://files.pythonhosted.org/packages/5f/cf/745fd3de6def51797ba6717f46cf27ac1871435505afb52fb778ba210361/gpt3_tokenizer-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-16 00:50:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "alisonjf",
    "github_project": "gpt3-tokenizer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "gpt3-tokenizer"
}
        