textknnassifier


- Name: textknnassifier
- Version: 0.0.1rc1
- Summary: TextKNNClassifier is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.
- Upload time: 2023-07-13 19:31:23
- Author: Reinder Vos de Wael
- Requires Python: `>=3.8,<4.0`
- License: LGPL-2.1

# TextKNNClassifier

[![Build](https://github.com/cmi-dair/text-knnassifier/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/cmi-dair/text-knnassifier/actions/workflows/test.yaml?query=branch%3Amain)
[![codecov](https://codecov.io/gh/cmi-dair/text-knnassifier/branch/main/graph/badge.svg?token=22HWWFWPW5)](https://codecov.io/gh/cmi-dair/text-knnassifier)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![L-GPL License](https://img.shields.io/badge/license-L--GPL-blue.svg)](LICENSE)
[![pages](https://img.shields.io/badge/api-docs-blue)](https://cmi-dair.github.io/text-knnassifier)

`TextKNNClassifier` is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.
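
To make the idea of a compression-based distance concrete, here is a minimal sketch of the normalized compression distance (NCD) with gzip, the combination described in the paper listed under References. Whether `textknnassifier` uses this exact compressor and formula internally is an assumption; the helper names `compressed_size` and `ncd` are hypothetical and not part of the package.

```python
import gzip


def compressed_size(text: str) -> int:
    """Return the size in bytes of the gzip-compressed UTF-8 encoding of `text`."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings.

    NCD(a, b) = (C(ab) - min(C(a), C(b))) / max(C(a), C(b)),
    where C(.) is the compressed size. Texts that share content compress
    well together, so their distance is smaller.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    # Assumption: the two texts are joined with a single space before compressing.
    size_ab = compressed_size(a + " " + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


# Compare one training text against a related and an unrelated text.
print(ncd("General Tarkin", "General Grievous"))
print(ncd("General Tarkin", "This is a test"))
```

For very short strings the fixed gzip header dominates the compressed size, so the distances are noisy; the measure tends to be more informative on longer texts.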

## Installation

You can install `textknnassifier` using pip:

```bash
pip install textknnassifier
```

## Usage

Here's an example of how to use `TextKNNClassifier`:

```python
from textknnassifier import classifier

# Training texts and their labels (one label per text, in the same order).
training_data = [
    "This is a test",
    "Another test",
    "General Tarkin",
    "General Grievous",
]
training_labels = ["test", "test", "star_wars", "star_wars"]

# Unlabeled texts to classify.
testing_data = [
    "This is a test",
    "Testing here too!",
    "General Kenobi",
    "General Skywalker",
]

# Consider the 2 nearest training texts when predicting each label.
KNN = classifier.TextKNNClassifier(n_neighbors=2)
KNN.fit(training_data, training_labels)
predicted_labels = KNN.predict(testing_data)

print(predicted_labels)
# Expected output: ['test', 'test', 'star_wars', 'star_wars']
```

In this example, we create a `TextKNNClassifier` instance and use it to predict the labels of the test entries. The `n_neighbors=2` argument sets the number of training data points considered when predicting each test label. The `fit` method takes two arguments, the training data and the training labels, and simply stores them for later use. The `predict` method takes the testing data as an argument and returns the predicted labels.
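
The store-then-vote behaviour described above can be sketched in a few lines of standard-library Python. This is an illustrative stand-in, not the package's implementation: the class name `SketchTextKNN`, the gzip-based distance, and the tie-breaking rule are all assumptions made for this example.

```python
import gzip
from collections import Counter
from typing import List, Sequence


def _size(text: str) -> int:
    """Compressed size in bytes of the UTF-8 encoded text."""
    return len(gzip.compress(text.encode("utf-8")))


def _distance(a: str, b: str) -> float:
    """Assumed distance: normalized compression distance with gzip."""
    size_a, size_b = _size(a), _size(b)
    return (_size(a + " " + b) - min(size_a, size_b)) / max(size_a, size_b)


class SketchTextKNN:
    """Hypothetical k-nearest neighbors text classifier with fit/predict."""

    def __init__(self, n_neighbors: int = 1) -> None:
        self.n_neighbors = n_neighbors
        self._texts: List[str] = []
        self._labels: List[str] = []

    def fit(self, texts: Sequence[str], labels: Sequence[str]) -> None:
        """Store the training data; no model is trained at this point."""
        if len(texts) != len(labels):
            raise ValueError("texts and labels must have the same length")
        self._texts = list(texts)
        self._labels = list(labels)

    def predict(self, texts: Sequence[str]) -> List[str]:
        """Label each text by majority vote among its nearest training texts."""
        predictions = []
        for text in texts:
            # Rank training indices from closest to farthest.
            ranked = sorted(
                range(len(self._texts)),
                key=lambda i: _distance(text, self._texts[i]),
            )
            votes = Counter(self._labels[i] for i in ranked[: self.n_neighbors])
            # On a tied vote, the label counted first (the closest neighbor) wins.
            predictions.append(votes.most_common(1)[0][0])
        return predictions


knn = SketchTextKNN(n_neighbors=2)
knn.fit(
    ["This is a test", "Another test", "General Tarkin", "General Grievous"],
    ["test", "test", "star_wars", "star_wars"],
)
print(knn.predict(["General Kenobi"]))
```

Because `fit` only stores the data, all distance computations happen in `predict`, so prediction cost grows with the number of training texts.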

## References

- Jiang, Z., Yang, M., Tsirlin, M., Tang, R., Dai, Y., & Lin, J. (2023, July). “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 6810-6828).


            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "textknnassifier",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Reinder Vos de Wael",
    "author_email": "reinder.vosdewael@childmind.org",
    "download_url": "https://files.pythonhosted.org/packages/6c/0c/3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547/textknnassifier-0.0.1rc1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "LGPL-2.1",
    "summary": "TextKNNClassifier is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.",
    "version": "0.0.1rc1",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "73aefa26d8e94c4b3e92762f0986216467ec2da4dd116823ad4553cf4eabc99b",
                "md5": "105f7c373608f2dd59a2948e4c4bedc1",
                "sha256": "b679b6f9e368c7029d54f89e28d1f86529a198114b762626d60dfe3304e33de2"
            },
            "downloads": -1,
            "filename": "textknnassifier-0.0.1rc1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "105f7c373608f2dd59a2948e4c4bedc1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 13914,
            "upload_time": "2023-07-13T19:31:22",
            "upload_time_iso_8601": "2023-07-13T19:31:22.482302Z",
            "url": "https://files.pythonhosted.org/packages/73/ae/fa26d8e94c4b3e92762f0986216467ec2da4dd116823ad4553cf4eabc99b/textknnassifier-0.0.1rc1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c0c3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547",
                "md5": "93bb2324d298f7d43e73eb5f2cdcc228",
                "sha256": "b63abdd38dcedd76bec5e90911eb6037fee2bd4c7eecc0f8286027e5aa302b46"
            },
            "downloads": -1,
            "filename": "textknnassifier-0.0.1rc1.tar.gz",
            "has_sig": false,
            "md5_digest": "93bb2324d298f7d43e73eb5f2cdcc228",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 12619,
            "upload_time": "2023-07-13T19:31:23",
            "upload_time_iso_8601": "2023-07-13T19:31:23.741439Z",
            "url": "https://files.pythonhosted.org/packages/6c/0c/3d4f589696e7e8f2951d44ef64a57c4016a3e185662099759ee1fdcdf547/textknnassifier-0.0.1rc1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-13 19:31:23",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "textknnassifier"
}
        