bs4-token-ext


Name: bs4-token-ext
Version: 0.0.2
home_page: None
Summary: Add your description here
upload_time: 2025-10-25 12:48:06
maintainer: None
docs_url: None
author: None
requires_python: >=3.12
license: MIT
keywords: beautifulsoup, llm, tiktoken, token, web-scraping
bugtrack_url: None
requirements: No requirements were recorded.
Travis-CI: No Travis.
coveralls test coverage: No coveralls.
# bs4-token-ext
Add token counting capabilities to BeautifulSoup tags for LLM (large language model) applications.

## Usage
```python
from bs4_token_ext import TokenAwareBeautifulSoup

html = "<div><p>Hello, world!</p></div>"
soup = TokenAwareBeautifulSoup(html, 'html.parser')
div = soup.find('div')
p = soup.find('p')

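print(p.token_count)              # token count for the <p> text (markup presumably excluded)
print(p.token_count_with_html)    # token count for <p> including its HTML markup
print(div.token_count)            # same text-only count, for the enclosing <div>
print(div.token_count_with_html)  # <div> count including all nested markup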
```
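
To illustrate why the two properties can differ, here is a minimal standalone sketch of the underlying idea. It does not use this package's code: it swaps in the standard-library `html.parser` for BeautifulSoup and a naive regex tokenizer for tiktoken, so the numbers it prints will not match what `token_count` reports. The names `count_tokens`, `text_of`, and `_TextExtractor` are hypothetical helpers introduced only for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in tokenizer: each word or punctuation mark is one token.
# A real tokenizer such as tiktoken would split text differently.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def count_tokens(text: str) -> int:
    """Count tokens in a string using the naive stand-in tokenizer."""
    return len(_TOKEN_RE.findall(text))


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def text_of(html: str) -> str:
    """Return the text content of an HTML fragment with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


html = "<div><p>Hello, world!</p></div>"
text_tokens = count_tokens(text_of(html))  # text only, markup stripped
markup_tokens = count_tokens(html)         # tags counted as well
print(text_tokens, markup_tokens)
```

Counting with markup is the relevant figure when the raw HTML itself is sent to a model, while the text-only count applies when the content is extracted first.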
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "bs4-token-ext",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": null,
    "keywords": "beautifulsoup, llm, tiktoken, token, web-scraping",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/8c/33/e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9/bs4_token_ext-0.0.2.tar.gz",
    "platform": null,
    "description": "# bs4-token-ext\nAdd token counting capabilities to BeautifulSoup tags for LLM applications.\n---\nLLM\uff08\u5927\u898f\u6a21\u8a00\u8a9e\u30e2\u30c7\u30eb\uff09\u5411\u3051\u306b\u3001BeautifulSoup \u306e\u30bf\u30b0\u306b\u30c8\u30fc\u30af\u30f3\u6570\u3092\u30ab\u30a6\u30f3\u30c8\u3059\u308b\u6a5f\u80fd\u3092\u8ffd\u52a0\u3059\u308b\u3002\n## Usage\n```python\nfrom bs4_token_ext import TokenAwareBeautifulSoup\n\nhtml = \"<div><p>Hello, world!</p></div>\"\nsoup = TokenAwareBeautifulSoup(html, 'html.parser')\ndiv = soup.find('div')\np = soup.find('p')\n\nprint(p.token_count) \nprint(p.token_count_with_html)  \nprint(div.token_count)  \nprint(div.token_count_with_html)  \n```",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Add your description here",
    "version": "0.0.2",
    "project_urls": null,
    "split_keywords": [
        "beautifulsoup",
        " llm",
        " tiktoken",
        " token",
        " web-scraping"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b10170f1e07203620c5f4c93a55a6b7d8c97b15b9c84c15132b63d5f801d21c4",
                "md5": "ffcf4c661728982772ce9ad2ab2be342",
                "sha256": "c8b3bda2d59fcd847c0c7fbba5be917e5213005eaa0a37868c1106ca73949e12"
            },
            "downloads": -1,
            "filename": "bs4_token_ext-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ffcf4c661728982772ce9ad2ab2be342",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 3626,
            "upload_time": "2025-10-25T12:48:03",
            "upload_time_iso_8601": "2025-10-25T12:48:03.431729Z",
            "url": "https://files.pythonhosted.org/packages/b1/01/70f1e07203620c5f4c93a55a6b7d8c97b15b9c84c15132b63d5f801d21c4/bs4_token_ext-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8c33e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9",
                "md5": "3872befef5bc809cc9c4b713093d66fb",
                "sha256": "565e0578230cd4c059f1623161542fbd80aaeb524d5ef180495b4b5be707e0be"
            },
            "downloads": -1,
            "filename": "bs4_token_ext-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "3872befef5bc809cc9c4b713093d66fb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12",
            "size": 38672,
            "upload_time": "2025-10-25T12:48:06",
            "upload_time_iso_8601": "2025-10-25T12:48:06.888762Z",
            "url": "https://files.pythonhosted.org/packages/8c/33/e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9/bs4_token_ext-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-25 12:48:06",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "bs4-token-ext"
}
        