# recs-searcher (Registry error correction system - Searcher)

    pip install recs-searcher

The system can solve the following tasks:
1. Correcting registry errors in user input by comparing it against a database;
2. Searching a database for text records similar to the user's text.
Features:
1. training embedding models (for example, TfIDF, FastText, SentenceTransformer) on your own data;
2. fast search over the database (for example, KNN, Faiss, Chroma-DB, TheFuzzSearch);
3. fine-tuning.
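
The idea behind the first two features (embed every database record, then look up the nearest records for a noisy query) can be sketched in plain Python. This is a minimal illustration only, not the recs-searcher API: the class `TinyTfidfSearcher`, the helper `char_ngrams`, and the sample city names are all hypothetical, and the brute-force cosine search stands in for the KNN/Faiss backends listed above.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split a lower-cased, space-padded string into overlapping character n-grams."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


class TinyTfidfSearcher:
    """Hypothetical helper for illustration; NOT part of recs-searcher."""

    def __init__(self, records, n=3):
        self.records = list(records)
        self.n = n
        counts = [Counter(char_ngrams(r, n)) for r in self.records]
        # Document frequency: in how many records each n-gram occurs.
        df = Counter(g for c in counts for g in c)
        total = len(self.records)
        # Smoothed inverse document frequency.
        self.idf = {g: math.log((1 + total) / (1 + d)) + 1.0 for g, d in df.items()}
        self.vectors = [self._embed(c) for c in counts]

    def _embed(self, counts):
        # TF-IDF weights, L2-normalised; n-grams unseen at fit time are dropped.
        vec = {g: tf * self.idf[g] for g, tf in counts.items() if g in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {g: w / norm for g, w in vec.items()} if norm else {}

    def search(self, query, k=3):
        """Return the k records with the highest cosine similarity to the query."""
        q = self._embed(Counter(char_ngrams(query, self.n)))
        scored = []
        for record, vec in zip(self.records, self.vectors):
            # Dot product of unit vectors equals cosine similarity.
            score = sum(w * vec.get(g, 0.0) for g, w in q.items())
            scored.append((score, record))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(record, round(score, 3)) for score, record in scored[:k]]


# Made-up sample "database" and a query with typos.
database = ["Moscow", "Saint Petersburg", "Novosibirsk", "Yekaterinburg"]
searcher = TinyTfidfSearcher(database)
results = searcher.search("Sant Petersbrug", k=2)
print(results[0][0])
```

Character n-grams are used here because a misspelled query still shares most of its n-grams with the intended record, so the top hit doubles as a correction candidate. For large databases an approximate index would replace the linear scan.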
Implemented modules:
1. api;
2. augmentation;
3. dataset;
4. models;
5. preprocessing;
6. similarity_search.
## Usage examples
Quick-start example: [API example](https://github.com/sheriff1max/recs-searcher/blob/master/notebooks/tutorial_rus.ipynb)
Raw data
{
"_id": null,
"home_page": "https://github.com/sheriff1max/recs-searcher",
"name": "recs-searcher",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "python searcher corrector faiss chroma-bd embeddings",
"author": "sheriff1max",
"author_email": "kobelevmaxim48@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/15/b6/10c7db014f8a6e2760ea53b639068e3cd121021f7a1babfb83c682bad2be/recs-searcher-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Search engine and registry error corrector",
"version": "0.1.0",
"project_urls": {
"Bug tracker": "https://github.com/sheriff1max/recs-searcher/issues",
"Homepage": "https://github.com/sheriff1max/recs-searcher"
},
"split_keywords": [
"python",
"searcher",
"corrector",
"faiss",
"chroma-bd",
"embeddings"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "54b73a21d9dc174b5cf1a2d5e70eecd98d0b4bd6b9710b73b56115e73108af49",
"md5": "47b7ca54f6a9ac78dfd157d9a870261a",
"sha256": "0b0cf38c1bcd41489140210957d4fad1558a78e8ec11a0dd0cef5c74a8e96017"
},
"downloads": -1,
"filename": "recs_searcher-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "47b7ca54f6a9ac78dfd157d9a870261a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 732185,
"upload_time": "2024-04-26T12:09:44",
"upload_time_iso_8601": "2024-04-26T12:09:44.912204Z",
"url": "https://files.pythonhosted.org/packages/54/b7/3a21d9dc174b5cf1a2d5e70eecd98d0b4bd6b9710b73b56115e73108af49/recs_searcher-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "15b610c7db014f8a6e2760ea53b639068e3cd121021f7a1babfb83c682bad2be",
"md5": "59e38f0b98a36eefb5bc815d473e9cf5",
"sha256": "28750b11fc595bf8692b3b4284f15542317e7644f729d2ec4a8c3d78d8b5c22b"
},
"downloads": -1,
"filename": "recs-searcher-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "59e38f0b98a36eefb5bc815d473e9cf5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 680727,
"upload_time": "2024-04-26T12:09:46",
"upload_time_iso_8601": "2024-04-26T12:09:46.557705Z",
"url": "https://files.pythonhosted.org/packages/15/b6/10c7db014f8a6e2760ea53b639068e3cd121021f7a1babfb83c682bad2be/recs-searcher-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-26 12:09:46",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sheriff1max",
"github_project": "recs-searcher",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "recs-searcher"
}