# Ping-ze Classifier
**Ping-ze Classifier** is a Python package that classifies Chinese characters as 'ping' (平, level tone) or 'ze' (仄, oblique tone) according to the [Pingshui Rhyme](https://zh.wikisource.org/wiki/%E5%B9%B3%E6%B0%B4%E9%9F%BB) (平水韻) rhyme scheme. For background on ping-ze, see the Wikipedia article on [tone patterns](https://en.wikipedia.org/wiki/Tone_pattern). The package ships with a pre-scraped JSON file containing the complete data from the rhyme dictionary, and it provides a scraper to update that data if necessary.
## Features
- **Classify Chinese characters**: Easily classify characters as 'ping' or 'ze' based on the Pingshui Rhyme scheme.
- **Pre-packaged data**: Includes a pre-scraped JSON file with the complete rhyme data.
- **Scraper**: An optional scraper is included to regenerate the JSON file when the source data changes.
## Installation
You can install the package by running:
```bash
pip install pingze_classifier
```
## Usage
### Classifying Characters
Once installed, use the `PingZeClassifier` class to classify Chinese characters according to the Pingshui Rhyme (平水韻) scheme. Characters not found in the rhyme data, such as punctuation, are reported as `'unknown'`:
```python
from pingze_classifier import PingZeClassifier
# Initialize the classifier (uses pre-packaged JSON by default)
classifier = PingZeClassifier()
# Classify a sentence
sentence = "知否？知否？應是綠肥紅瘦。"
result = classifier.classify(sentence)
print(result)
# Output: ['ping', 'ze', 'unknown', 'ping', 'ze', 'unknown', 'ping', 'ze', 'ze', 'ping', 'ping', 'ze', 'unknown']
```
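The returned list is easier to read when lined up against the input. The sketch below assumes, as the example output suggests, that `classify` returns one label per input character. To stay runnable without the package installed, it reuses the labels printed above instead of calling the classifier. The helper names `pair_with_labels` and `to_pattern`, and the `·` placeholder, are illustrative and not part of the package.

```python
# Labels copied from the example output above; in real use they would
# come from classifier.classify(sentence).
sentence = "知否？知否？應是綠肥紅瘦。"
labels = ['ping', 'ze', 'unknown', 'ping', 'ze', 'unknown', 'ping',
          'ze', 'ze', 'ping', 'ping', 'ze', 'unknown']

# Hypothetical display symbols for each label (not defined by the package).
SYMBOLS = {"ping": "平", "ze": "仄"}


def pair_with_labels(text, tone_labels):
    """Pair each character with its label, checking the lengths match."""
    if len(text) != len(tone_labels):
        raise ValueError("expected exactly one label per character")
    return list(zip(text, tone_labels))


def to_pattern(tone_labels, placeholder="·"):
    """Render labels as a compact 平/仄 string; other labels become the placeholder."""
    return "".join(SYMBOLS.get(label, placeholder) for label in tone_labels)


pairs = pair_with_labels(sentence, labels)
pattern = to_pattern(labels)

print(pairs[:2])  # [('知', 'ping'), ('否', 'ze')]
print(pattern)    # 平仄·平仄·平仄仄平平仄·
```

With the package installed, replacing the hard-coded `labels` with `classifier.classify(sentence)` should give the same result, provided the one-label-per-character assumption holds.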
### Scraping and Regenerating the JSON Data
The package includes a pre-generated JSON file. If the Pingshui Rhyme source page on Wikisource changes, you can run the scraper to regenerate it.
#### Running the Scraper
By default, the scraper won't run if the JSON file already exists. To force a refresh and regenerate the JSON data, run the following command:
```bash
scrape-pingze --force-refresh
```
This will scrape the latest data from the source and regenerate the `organized_ping_ze_rhyme_dict.json` file.
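The skip-unless-forced behavior described above can be sketched in a few lines. This is only an illustration of the rule, not the package's actual implementation: the function name `should_scrape` and its parameters are hypothetical, and the file is created in a temporary directory purely for the demonstration.

```python
import tempfile
from pathlib import Path


def should_scrape(json_path, force_refresh=False):
    """Return True when the data file should be (re)generated.

    Hypothetical helper: scrape if the file is missing, or if a
    refresh is explicitly forced.
    """
    return force_refresh or not Path(json_path).exists()


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "organized_ping_ze_rhyme_dict.json"

    missing = should_scrape(data_file)  # no file yet
    data_file.write_text("{}", encoding="utf-8")
    cached = should_scrape(data_file)   # file exists, no flag
    forced = should_scrape(data_file, force_refresh=True)

print(missing, cached, forced)  # True False True
```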
## Dependencies
- `requests`: For web scraping the Pingshui Rhyme data.
- `beautifulsoup4`: For parsing the HTML content from the Pingshui Rhyme source page.
Raw data
{
"_id": null,
"home_page": "https://github.com/rbnyng/pingze_classifier",
"name": "pingze-classifier",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "rbnyng",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/b5/1c/0af67219cf9260c5ccc25e7f1b943926c2d3cf78df6dfc1e5a3f5a899931/pingze_classifier-0.11.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python package for classifying Chinese characters based on the Pingshui rhyme scheme",
"version": "0.11",
"project_urls": {
"Homepage": "https://github.com/rbnyng/pingze_classifier"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5efd7f860bc5bdb5911187fa377095197dcc706acf30e0a126a4684c59be454f",
"md5": "dd2a40ad9635203462fc6234789acd31",
"sha256": "c2977721560dc8791ffb1a5b7c9379dcb93fa9f1909d6ceaf6bc6dbaeeac43aa"
},
"downloads": -1,
"filename": "pingze_classifier-0.11-py3-none-any.whl",
"has_sig": false,
"md5_digest": "dd2a40ad9635203462fc6234789acd31",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 27533,
"upload_time": "2024-09-04T16:19:50",
"upload_time_iso_8601": "2024-09-04T16:19:50.013639Z",
"url": "https://files.pythonhosted.org/packages/5e/fd/7f860bc5bdb5911187fa377095197dcc706acf30e0a126a4684c59be454f/pingze_classifier-0.11-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b51c0af67219cf9260c5ccc25e7f1b943926c2d3cf78df6dfc1e5a3f5a899931",
"md5": "9e31e35d0dc84cbfb366958e351e8d7d",
"sha256": "69c9caf35faad0d3238ae66af8aa7e14f4686f7cd616865f4b4535c01c6c5542"
},
"downloads": -1,
"filename": "pingze_classifier-0.11.tar.gz",
"has_sig": false,
"md5_digest": "9e31e35d0dc84cbfb366958e351e8d7d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 29581,
"upload_time": "2024-09-04T16:19:51",
"upload_time_iso_8601": "2024-09-04T16:19:51.905783Z",
"url": "https://files.pythonhosted.org/packages/b5/1c/0af67219cf9260c5ccc25e7f1b943926c2d3cf78df6dfc1e5a3f5a899931/pingze_classifier-0.11.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-04 16:19:51",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rbnyng",
"github_project": "pingze_classifier",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pingze-classifier"
}