# pykosinus
pykosinus is an open-source Python library for text similarity search scoring. It provides a fast and memory-efficient way to calculate cosine similarity scores, making it suitable for various text similarity applications. The library is designed to be user-friendly and encourages contributions from the community.
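As background, cosine similarity represents each text as a vector of term counts and scores a pair of texts by the cosine of the angle between their vectors: the dot product divided by the product of the vector lengths. The sketch below is a plain-Python illustration of that idea using simple word counts. It is not pykosinus's own implementation, and the `tokenize` and `cosine_similarity` helpers are names made up for this example.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the term-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Dot product: only words present in both texts contribute.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)

    # Euclidean length of each count vector.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("brake pad replacement", "brake pad"))
    print(cosine_similarity("brake pad", "engine oil"))
```

Scores range from 0 (no words in common) to 1 (identical word distribution), which is why a threshold such as `0.2` is a natural way to discard weak matches.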
## Installation
To install pykosinus, make sure you have Python 3.8 or higher installed. Then install the library using pip:
```shell
pip install pykosinus
```
## Additional Library for Mac Users
If you are using pykosinus on a Mac, you may need to install the GCC compiler to enable certain features. GCC is a widely used compiler for various programming languages.
To install GCC on macOS, you can use Homebrew, a popular package manager for macOS. Follow these steps to install GCC using Homebrew:
- Open a terminal window.
- Install Homebrew by running the following command:
```sh
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
- Install GCC by running the following command:
```sh
brew install gcc
```
- Verify the installation by running the following command:
```sh
gcc --version
```
- Set gfortran as the Fortran compiler for the current shell session:
```sh
export FC=gfortran
```
- Verify the gfortran installation:
```sh
gfortran --version
```
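- Install OpenBLAS, then point pkg-config at it (the export path assumes Homebrew's Apple Silicon prefix `/opt/homebrew`; on Intel Macs the prefix is `/usr/local`):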
```sh
brew install openblas
```
```sh
export PKG_CONFIG_PATH="/opt/homebrew/opt/openblas/lib/pkgconfig"
```
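After running the steps above, a quick check from Python can confirm that the tools are visible on your `PATH`. This is a minimal sketch using only the standard library; the `check_build_tools` helper is a name invented for this example and is not part of pykosinus.

```python
import os
import shutil
from typing import Dict, Optional, Sequence


def check_build_tools(
    tools: Sequence[str] = ("brew", "gcc", "gfortran"),
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_build_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")

    # These variables are only visible here if they were exported in the
    # shell session that launched this script.
    for var in ("FC", "PKG_CONFIG_PATH"):
        print(f"{var}={os.environ.get(var, '<unset>')}")
```

Because `export` only affects the current shell session, run the script from the same terminal in which you set `FC` and `PKG_CONFIG_PATH`.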
## Usage
To use pykosinus in your Python project, you can follow these steps:
- Import the necessary modules and classes:
```python
from pykosinus import Content
from pykosinus.lib.scoring import TextScoring
```
- Create an instance of the **TextScoring** class, providing the collection name as a parameter:
```python
collection_name = "blog_posts"  # illustrative value; use any name for your collection
similarity = TextScoring(collection_name)
```
- Set the contents to be searched using the **push_contents** method, passing a list of **Content** objects:
```python
contents = [
    Content(
        content="Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
        identifier="blog-1",
        section="blog_title",
    ),
    Content(
        content="Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
        identifier="blog-2",
        section="blog_title",
    ),
    Content(
        content="Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.",
        identifier="blog-3",
        section="blog_title",
    ),
    # Add more contents as needed
]
similarity.push_contents(contents)
```
- Initialize the similarity search by calling the **initialize** method:
```python
similarity.initialize()
```
- Perform a similarity search by calling the **search** method, providing a keyword and an optional threshold:
```python
results = similarity.search(keyword="search keyword", threshold=0.2)
```
- The **search** method returns a list of **ScoringResult** objects, which contain the relevant information about the search results. You can access the properties of each result, such as **identifier**, **content**, **section**, **similar**, and **score**.
```python
for result in results:
    print(
        result.identifier, result.content, result.section, result.similar, result.score
    )
```
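Once you have a list of results, you will often want to rank or group them. The sketch below shows one way to do that. To stay runnable without an indexed collection, it uses a hypothetical `Result` dataclass that only mirrors the attribute names listed above (the real objects come from `search`, and the field types here are assumptions), and the sample values and scores are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Result:
    """Hypothetical stand-in exposing the same attribute names as ScoringResult."""

    identifier: str
    content: str
    section: str
    similar: str  # assumed to be text; the actual type is not documented here
    score: float


def rank(results: Iterable[Result], min_score: float = 0.0) -> List[Result]:
    """Keep results scoring at least min_score, highest score first."""
    kept = [r for r in results if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)


def best_per_section(results: Iterable[Result]) -> Dict[str, Result]:
    """Return the highest-scoring result for each section."""
    best: Dict[str, Result] = {}
    for r in results:
        if r.section not in best or r.score > best[r.section].score:
            best[r.section] = r
    return best


# Made-up sample data, standing in for the output of similarity.search(...).
sample = [
    Result("blog-1", "Lorem ipsum dolor sit amet", "blog_title", "lorem ipsum", 0.41),
    Result("blog-2", "Sed do eiusmod tempor", "blog_title", "sed do", 0.77),
    Result("blog-3", "Ut enim ad minim veniam", "blog_body", "ut enim", 0.15),
]

for r in rank(sample, min_score=0.2):
    print(r.identifier, r.score)
```

Since both helpers only read `section` and `score`, they should work unchanged on the objects returned by `search`, provided those attributes behave as described above.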
## Contributing
pykosinus welcomes contributions from the community. If you would like to contribute to the library, please follow these steps:
- Fork the pykosinus repository on [**GitHub**](https://github.com/ruriazz/pykosinus).
- Create a new branch for your feature or bug fix.
- Make your changes and commit them with descriptive commit messages.
- Push your changes to your forked repository.
- Submit a pull request to the main pykosinus repository, explaining the changes you have made.
## Versioning
pykosinus is currently at version 0.2.0. Contributions that improve and expand the library are welcome.
## License
pykosinus is released under the [MIT License](https://opensource.org/licenses/MIT).
Raw data
{
"_id": null,
"home_page": "",
"name": "pykosinus",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "pykosinus,cosine similarity,text similarity,python text similarity",
"author": "",
"author_email": "Otoklix <engineering@otoklix.com>",
"download_url": "https://files.pythonhosted.org/packages/75/51/d8353919f1848644cad9a6d94d385dd42f0cc66f7152a95abe2c725ba12e/pykosinus-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Simple Text similarity python",
"version": "0.2.0",
"project_urls": {
"Bug Tracker": "https://github.com/ruriazz/pykosinus/issues",
"Homepage": "https://github.com/ruriazz/pykosinus"
},
"split_keywords": [
"pykosinus",
"cosine similarity",
"text similarity",
"python text similarity"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9de697274dca348e368e90b26f9e7c6e2800ee78b9771c5e8c16b5b9d40552c3",
"md5": "bedc503d8b6acafb9d72893ba7c0bc19",
"sha256": "919d87fd16abfc240b044c666d21435e88d77bac894ea9b720862e070c3d5005"
},
"downloads": -1,
"filename": "pykosinus-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "bedc503d8b6acafb9d72893ba7c0bc19",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 11746,
"upload_time": "2023-12-03T09:33:24",
"upload_time_iso_8601": "2023-12-03T09:33:24.966208Z",
"url": "https://files.pythonhosted.org/packages/9d/e6/97274dca348e368e90b26f9e7c6e2800ee78b9771c5e8c16b5b9d40552c3/pykosinus-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7551d8353919f1848644cad9a6d94d385dd42f0cc66f7152a95abe2c725ba12e",
"md5": "83d4f495353af187e217ba1711c474d9",
"sha256": "22b85509b3ac9f41f2128b792bc7d95b94a51a56b2506e645ec13980175f4a60"
},
"downloads": -1,
"filename": "pykosinus-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "83d4f495353af187e217ba1711c474d9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 11471,
"upload_time": "2023-12-03T09:33:26",
"upload_time_iso_8601": "2023-12-03T09:33:26.955462Z",
"url": "https://files.pythonhosted.org/packages/75/51/d8353919f1848644cad9a6d94d385dd42f0cc66f7152a95abe2c725ba12e/pykosinus-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-03 09:33:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ruriazz",
"github_project": "pykosinus",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pykosinus"
}