semsim


| Field | Value |
|---|---|
| Name | semsim |
| Version | 1.1.1 |
| home_page | https://gitlab.com/Mathematician2000/semsim |
| Summary | A free tool for sentence similarity evaluation |
| upload_time | 2023-05-08 19:19:21 |
| maintainer | |
| docs_url | None |
| author | David Avagyan |
| requires_python | >=3.9 |
| license | BSD 3-Clause License |
| keywords | nlp, dependency parsing, conll-u, sentence similarity |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# semsim

Compare texts easily with the `semsim` Python package!

## Features
- Dozens of tunable **parameters** for better performance
- **Default values** for all parameters, validated on datasets for the paraphrase detection task
- 6 different **algorithms** for efficient syntax tree comparison
- A small pack of **standard "built-in" models** that can be downloaded through the `semsim` package itself
- A flexible **class taxonomy** that you can extend by inheriting from one of the model base classes
- A Python library with a **command line interface** (powered by `click`)

## Dependencies
- attrs
- click
- networkx
- numpy
- pymorphy2
- scipy
- simple_elmo
- tensorflow
- tensorrt
- textract
- torch
- torch-geometric
- torch-scatter
- torch-sparse
- torchwordemb
- tqdm
- ufal.udpipe
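
If installation problems come up, it can help to see which of the dependencies above are already present. The following is a minimal sketch, not part of `semsim`: it assumes the names in the list are the installed distribution names, and the helper `installed_versions` is a hypothetical name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the Dependencies list above.
# Assumption: these match the names the distributions are installed under.
DEPENDENCIES = [
    "attrs", "click", "networkx", "numpy", "pymorphy2", "scipy",
    "simple_elmo", "tensorflow", "tensorrt", "textract", "torch",
    "torch-geometric", "torch-scatter", "torch-sparse", "torchwordemb",
    "tqdm", "ufal.udpipe",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(DEPENDENCIES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Anything reported as not installed is a candidate for installing by hand before `semsim` itself.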

## Quick start
To install `semsim`, run:

`$ pip install semsim`

---
> **NOTE**: If you run into problems installing the `semsim` package,
> try installing these prerequisites first:
> `$ pip install torch tensorflow tensorrt`
> Then install `semsim`.
---

Now you can use the `semsim` CLI tool as follows:

`$ semsim first_src.txt second_src.txt -o output.txt`

For better performance, you may want to download the standard "built-in" (or rather "add-on") models.
To do so, run:

`$ semsim download cbow`

to fetch pretrained CBOW embeddings, or

`$ semsim download -a`

to download **all** the add-ons at once, in parallel.

More information is available in the [documentation](https://pysemsim.readthedocs.io).

## Code style linters and test frameworks
This library has been checked and tested with the following tools:
- flake8
- mypy
- pydocstyle
- pytest

## Interface
The CLI is described in the [examples](https://pysemsim.readthedocs.io/examples)
section of the [documentation](https://pysemsim.readthedocs.io).
Here is a fuller example of a `semsim` CLI call:

`$ semsim compare first_src.txt second_src.txt -e cbow -k neural -o output.txt --max-out-pairs 200 -v`
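
To run the same comparison from a Python script, one option is to assemble the command and hand it to `subprocess`. This is only a sketch: it uses exactly the flags shown in the command above and nothing else, and the function names and the parameter names `e_option` and `k_option` are placeholders invented here, since this README does not spell out what `-e` and `-k` stand for.

```python
import shlex
import subprocess


def build_compare_command(first, second, output, e_option=None,
                          k_option=None, max_out_pairs=None, verbose=False):
    """Build the argument list for a `semsim compare` call.

    Only flags that appear in the README example are emitted.
    """
    cmd = ["semsim", "compare", first, second]
    if e_option is not None:
        cmd += ["-e", e_option]
    if k_option is not None:
        cmd += ["-k", k_option]
    cmd += ["-o", output]
    if max_out_pairs is not None:
        cmd += ["--max-out-pairs", str(max_out_pairs)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_compare(**kwargs):
    """Run the comparison; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_compare_command(**kwargs), check=True)


if __name__ == "__main__":
    cmd = build_compare_command(
        "first_src.txt", "second_src.txt", "output.txt",
        e_option="cbow", k_option="neural", max_out_pairs=200, verbose=True,
    )
    print(shlex.join(cmd))  # show the command without executing it
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.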

## Authors
- [Mathematician2000](https://gitlab.com/Mathematician2000)

            

Raw data

{
    "_id": null,
    "home_page": "https://gitlab.com/Mathematician2000/semsim",
    "name": "semsim",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "NLP,dependency parsing,CoNLL-U,sentence similarity",
    "author": "David Avagyan",
    "author_email": "david_avagyan@list.ru",
    "download_url": "https://files.pythonhosted.org/packages/84/b4/a9dff7c56a7fef02cd95e4f23f373f4f07bd97f0d495c1f2882810a7f944/semsim-1.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD 3-Clause License",
    "summary": "A free tool for sentence similarity evaluation",
    "version": "1.1.1",
    "project_urls": {
        "Documentation": "https://pysemsim.readthedocs.io/",
        "Homepage": "https://gitlab.com/Mathematician2000/semsim"
    },
    "split_keywords": [
        "nlp",
        "dependency parsing",
        "conll-u",
        "sentence similarity"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6bd778426c1345af25c759ee6cab4c3f52692d36f528b8f019814a3e2f8e77c4",
                "md5": "c486f2bfa273108fd468f7b511b03c59",
                "sha256": "e37c55e38f72696fd6c4549bfc20baad1cf035d372023d0209270f3dbcebbe27"
            },
            "downloads": -1,
            "filename": "semsim-1.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c486f2bfa273108fd468f7b511b03c59",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 34526,
            "upload_time": "2023-05-08T19:19:19",
            "upload_time_iso_8601": "2023-05-08T19:19:19.990530Z",
            "url": "https://files.pythonhosted.org/packages/6b/d7/78426c1345af25c759ee6cab4c3f52692d36f528b8f019814a3e2f8e77c4/semsim-1.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "84b4a9dff7c56a7fef02cd95e4f23f373f4f07bd97f0d495c1f2882810a7f944",
                "md5": "b929054f95c3d3cdc5b9f6724b79ab37",
                "sha256": "7f54318463115d6fcef6e9434cad3f48ef995b68c5969b852e3710c31b92ea6f"
            },
            "downloads": -1,
            "filename": "semsim-1.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "b929054f95c3d3cdc5b9f6724b79ab37",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 29359,
            "upload_time": "2023-05-08T19:19:21",
            "upload_time_iso_8601": "2023-05-08T19:19:21.592510Z",
            "url": "https://files.pythonhosted.org/packages/84/b4/a9dff7c56a7fef02cd95e4f23f373f4f07bd97f0d495c1f2882810a7f944/semsim-1.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-08 19:19:21",
    "github": false,
    "gitlab": true,
    "bitbucket": false,
    "codeberg": false,
    "gitlab_user": "Mathematician2000",
    "gitlab_project": "semsim",
    "lcname": "semsim"
}
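
The `sha256` digests in the raw data above can be used to check a manually downloaded release file. The snippet below is a sketch under that assumption: the two digests are copied from the `urls` entries, while the helper names `sha256_of` and `matches_expected` are hypothetical and exist only for this illustration.

```python
import hashlib

# SHA-256 digests copied from the "urls" entries in the raw data above.
EXPECTED_SHA256 = {
    "semsim-1.1.1-py3-none-any.whl":
        "e37c55e38f72696fd6c4549bfc20baad1cf035d372023d0209270f3dbcebbe27",
    "semsim-1.1.1.tar.gz":
        "7f54318463115d6fcef6e9434cad3f48ef995b68c5969b852e3710c31b92ea6f",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, filename):
    """Check a local file against the digest recorded for `filename`."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

For example, `matches_expected("semsim-1.1.1.tar.gz", "semsim-1.1.1.tar.gz")` returns `True` only if the local file is byte-for-byte identical to the published sdist.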
        