Name | textknnassifier |
Version | 0.0.1rc1 |
home_page | |
Summary | TextKNNClassifier is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data. |
upload_time | 2023-07-13 19:31:23 |
maintainer | |
docs_url | None |
author | Reinder Vos de Wael |
requires_python | >=3.8,<4.0 |
license | LGPL-2.1 |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# TextKNNClassifier
[![Build](https://github.com/cmi-dair/text-knnassifier/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/cmi-dair/text-knnassifier/actions/workflows/test.yaml?query=branch%3Amain)
[![codecov](https://codecov.io/gh/cmi-dair/text-knnassifier/branch/main/graph/badge.svg?token=22HWWFWPW5)](https://codecov.io/gh/cmi-dair/text-knnassifier)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![L-GPL License](https://img.shields.io/badge/license-L--GPL-blue.svg)](LICENSE)
[![pages](https://img.shields.io/badge/api-docs-blue)](https://cmi-dair.github.io/text-knnassifier)
`TextKNNClassifier` is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.
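To make the idea of a compression-based distance concrete, here is a minimal sketch of the normalized compression distance (NCD) described in the paper listed under References. It assumes gzip as the compressor and a single space as the joiner; the function names `compressed_length` and `ncd` are hypothetical, and the package's own internals may differ.

```python
import gzip


def compressed_length(text: str) -> int:
    """Return the size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings.

    Assumption: gzip compressor and a space joiner, as in the referenced
    paper. This is an illustration, not the package's implementation.
    """
    c_a = compressed_length(a)
    c_b = compressed_length(b)
    c_ab = compressed_length(a + " " + b)
    # Texts that share content compress well together, so c_ab stays
    # close to max(c_a, c_b) and the distance is small.
    return (c_ab - min(c_a, c_b)) / max(c_a, c_b)


if __name__ == "__main__":
    print(ncd("General Kenobi", "General Grievous"))
    print(ncd("General Kenobi", "This is a test"))
```

On very short strings the fixed gzip header dominates the compressed size, so distances between short texts tend to be close together; the measure becomes more informative as the texts get longer.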
## Installation
You can install `textknnassifier` using pip:
```bash
pip install textknnassifier
```
## Usage
Here's an example of how to use `TextKNNClassifier`:
```python
from textknnassifier import classifier

# Training texts and their labels (one label per text).
training_data = [
    "This is a test",
    "Another test",
    "General Tarkin",
    "General Grievous",
]
training_labels = ["test", "test", "star_wars", "star_wars"]

# Unlabeled texts to classify.
testing_data = [
    "This is a test",
    "Testing here too!",
    "General Kenobi",
    "General Skywalker",
]

KNN = classifier.TextKNNClassifier(n_neighbors=2)
KNN.fit(training_data, training_labels)
predicted_labels = KNN.predict(testing_data)

print(predicted_labels)
# Output: ['test', 'test', 'star_wars', 'star_wars']
```
In this example, we create a `TextKNNClassifier` instance and use it to predict the labels of the test entries. The `n_neighbors=2` argument sets the number of nearest training data points considered when predicting each test label. The `fit` method takes two arguments, the training data and the training labels, and simply stores them for later use. The `predict` method takes the testing data as an argument and returns the predicted labels.
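The store-then-vote behavior described above can be sketched in a few lines. The class below is illustrative only and is not the package's `TextKNNClassifier`: the names `SketchTextKNN` and `gzip_distance` are hypothetical, and the gzip-based distance is an assumption taken from the referenced paper.

```python
import gzip
from collections import Counter
from typing import Callable, List, Sequence


def gzip_distance(a: str, b: str) -> float:
    """Assumed distance: normalized compression distance using gzip."""
    def size(s: str) -> int:
        return len(gzip.compress(s.encode("utf-8")))

    c_a, c_b = size(a), size(b)
    return (size(a + " " + b) - min(c_a, c_b)) / max(c_a, c_b)


class SketchTextKNN:
    """Illustrative k-nearest neighbors text classifier (hypothetical)."""

    def __init__(
        self,
        n_neighbors: int = 3,
        distance: Callable[[str, str], float] = gzip_distance,
    ) -> None:
        self.n_neighbors = n_neighbors
        self.distance = distance
        self._texts: List[str] = []
        self._labels: List[str] = []

    def fit(self, texts: Sequence[str], labels: Sequence[str]) -> None:
        """Store the training data; no model is trained at this point."""
        if len(texts) != len(labels):
            raise ValueError("texts and labels must have the same length")
        self._texts = list(texts)
        self._labels = list(labels)

    def predict(self, texts: Sequence[str]) -> List[str]:
        """Label each text by majority vote among its nearest neighbors."""
        predictions = []
        for text in texts:
            # Rank training indices from nearest to farthest.
            ranked = sorted(
                range(len(self._texts)),
                key=lambda i: self.distance(text, self._texts[i]),
            )
            nearest = [self._labels[i] for i in ranked[: self.n_neighbors]]
            # On a tied vote, Counter keeps first-seen order, so the label
            # of the closest neighbor wins.
            predictions.append(Counter(nearest).most_common(1)[0][0])
        return predictions
```

Because `fit` only stores data, all the work happens in `predict`: each test text is compared against every training text, so prediction cost grows with the size of the training set.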
## References
- Jiang, Z., Yang, M., Tsirlin, M., Tang, R., Dai, Y., & Lin, J. (2023, July). “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 6810-6828).
Raw data
{
"_id": null,
"home_page": "",
"name": "textknnassifier",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8,<4.0",
"maintainer_email": "",
"keywords": "",
"author": "Reinder Vos de Wael",
"author_email": "reinder.vosdewael@childmind.org",
"download_url": "https://files.pythonhosted.org/packages/6c/0c/3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547/textknnassifier-0.0.1rc1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "LGPL-2.1",
"summary": "TextKNNClassifier is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.",
"version": "0.0.1rc1",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "73aefa26d8e94c4b3e92762f0986216467ec2da4dd116823ad4553cf4eabc99b",
"md5": "105f7c373608f2dd59a2948e4c4bedc1",
"sha256": "b679b6f9e368c7029d54f89e28d1f86529a198114b762626d60dfe3304e33de2"
},
"downloads": -1,
"filename": "textknnassifier-0.0.1rc1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "105f7c373608f2dd59a2948e4c4bedc1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8,<4.0",
"size": 13914,
"upload_time": "2023-07-13T19:31:22",
"upload_time_iso_8601": "2023-07-13T19:31:22.482302Z",
"url": "https://files.pythonhosted.org/packages/73/ae/fa26d8e94c4b3e92762f0986216467ec2da4dd116823ad4553cf4eabc99b/textknnassifier-0.0.1rc1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6c0c3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547",
"md5": "93bb2324d298f7d43e73eb5f2cdcc228",
"sha256": "b63abdd38dcedd76bec5e90911eb6037fee2bd4c7eecc0f8286027e5aa302b46"
},
"downloads": -1,
"filename": "textknnassifier-0.0.1rc1.tar.gz",
"has_sig": false,
"md5_digest": "93bb2324d298f7d43e73eb5f2cdcc228",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8,<4.0",
"size": 12619,
"upload_time": "2023-07-13T19:31:23",
"upload_time_iso_8601": "2023-07-13T19:31:23.741439Z",
"url": "https://files.pythonhosted.org/packages/6c/0c/3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547/textknnassifier-0.0.1rc1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-07-13 19:31:23",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "textknnassifier"
}