# bs4-token-ext
Add token counting capabilities to BeautifulSoup tags for LLM (large language model) applications.
## Usage
```python
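from bs4_token_ext import TokenAwareBeautifulSoup

# Parse with TokenAwareBeautifulSoup in place of the plain BeautifulSoup class.
html = "<div><p>Hello, world!</p></div>"
soup = TokenAwareBeautifulSoup(html, 'html.parser')
div = soup.find('div')
p = soup.find('p')

# Each tag exposes two counts. Judging by the attribute names, token_count
# covers the tag's text and token_count_with_html also covers its markup.
print(p.token_count)
print(p.token_count_with_html)
print(div.token_count)
print(div.token_count_with_html)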
```
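To make the difference between the two counts concrete, here is a minimal sketch of counting tokens over a tag's text versus its full markup. It is not the package's implementation: `regex_encode` and `count_tokens` are hypothetical names, and the regex tokenizer is a stand-in chosen so the example runs with the standard library alone. The package keywords mention tiktoken, so the real counts presumably come from a tiktoken encoding and will differ from the ones printed here.

```python
import re
from typing import Callable, Sequence


def regex_encode(text: str) -> list[str]:
    """Stand-in tokenizer (an assumption, not the package's): split the
    input into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def count_tokens(text: str, encode: Callable[[str], Sequence] = regex_encode) -> int:
    """Return how many tokens `encode` produces for `text`."""
    return len(encode(text))


# For the <p> tag in the Usage example, BeautifulSoup's tag.get_text()
# would return `text` and str(tag) would return `markup`.
text = "Hello, world!"
markup = "<p>Hello, world!</p>"

print(count_tokens(text))    # tokens in the text only
print(count_tokens(markup))  # tokens in the text plus the surrounding tags
```

With tiktoken installed, the same helper can take a real encoder, for example `count_tokens(markup, tiktoken.get_encoding("cl100k_base").encode)`. Whichever encoder is used, the markup count is at least as large as the text count because the tags themselves consume tokens.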
## Raw data
```json
{
  "_id": null,
  "home_page": null,
  "name": "bs4-token-ext",
  "maintainer": null,
  "docs_url": null,
  "requires_python": ">=3.12",
  "maintainer_email": null,
  "keywords": "beautifulsoup, llm, tiktoken, token, web-scraping",
  "author": null,
  "author_email": null,
  "download_url": "https://files.pythonhosted.org/packages/8c/33/e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9/bs4_token_ext-0.0.2.tar.gz",
  "platform": null,
  "description": "# bs4-token-ext\nAdd token counting capabilities to BeautifulSoup tags for LLM applications.\n---\nLLM\uff08\u5927\u898f\u6a21\u8a00\u8a9e\u30e2\u30c7\u30eb\uff09\u5411\u3051\u306b\u3001BeautifulSoup \u306e\u30bf\u30b0\u306b\u30c8\u30fc\u30af\u30f3\u6570\u3092\u30ab\u30a6\u30f3\u30c8\u3059\u308b\u6a5f\u80fd\u3092\u8ffd\u52a0\u3059\u308b\u3002\n## Usage\n```python\nfrom bs4_token_ext import TokenAwareBeautifulSoup\n\nhtml = \"<div><p>Hello, world!</p></div>\"\nsoup = TokenAwareBeautifulSoup(html, 'html.parser')\ndiv = soup.find('div')\np = soup.find('p')\n\nprint(p.token_count) \nprint(p.token_count_with_html) \nprint(div.token_count) \nprint(div.token_count_with_html) \n```",
  "bugtrack_url": null,
  "license": "MIT",
  "summary": "Add your description here",
  "version": "0.0.2",
  "project_urls": null,
  "split_keywords": [
    "beautifulsoup",
    " llm",
    " tiktoken",
    " token",
    " web-scraping"
  ],
  "urls": [
    {
      "comment_text": null,
      "digests": {
        "blake2b_256": "b10170f1e07203620c5f4c93a55a6b7d8c97b15b9c84c15132b63d5f801d21c4",
        "md5": "ffcf4c661728982772ce9ad2ab2be342",
        "sha256": "c8b3bda2d59fcd847c0c7fbba5be917e5213005eaa0a37868c1106ca73949e12"
      },
      "downloads": -1,
      "filename": "bs4_token_ext-0.0.2-py3-none-any.whl",
      "has_sig": false,
      "md5_digest": "ffcf4c661728982772ce9ad2ab2be342",
      "packagetype": "bdist_wheel",
      "python_version": "py3",
      "requires_python": ">=3.12",
      "size": 3626,
      "upload_time": "2025-10-25T12:48:03",
      "upload_time_iso_8601": "2025-10-25T12:48:03.431729Z",
      "url": "https://files.pythonhosted.org/packages/b1/01/70f1e07203620c5f4c93a55a6b7d8c97b15b9c84c15132b63d5f801d21c4/bs4_token_ext-0.0.2-py3-none-any.whl",
      "yanked": false,
      "yanked_reason": null
    },
    {
      "comment_text": null,
      "digests": {
        "blake2b_256": "8c33e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9",
        "md5": "3872befef5bc809cc9c4b713093d66fb",
        "sha256": "565e0578230cd4c059f1623161542fbd80aaeb524d5ef180495b4b5be707e0be"
      },
      "downloads": -1,
      "filename": "bs4_token_ext-0.0.2.tar.gz",
      "has_sig": false,
      "md5_digest": "3872befef5bc809cc9c4b713093d66fb",
      "packagetype": "sdist",
      "python_version": "source",
      "requires_python": ">=3.12",
      "size": 38672,
      "upload_time": "2025-10-25T12:48:06",
      "upload_time_iso_8601": "2025-10-25T12:48:06.888762Z",
      "url": "https://files.pythonhosted.org/packages/8c/33/e8b29456b8b082200fdcd4ca4e4e3429d1eed884cda74d9bd887399e1ba9/bs4_token_ext-0.0.2.tar.gz",
      "yanked": false,
      "yanked_reason": null
    }
  ],
  "upload_time": "2025-10-25 12:48:06",
  "github": false,
  "gitlab": false,
  "bitbucket": false,
  "codeberg": false,
  "lcname": "bs4-token-ext"
}
```