<p align="center">
<img src="https://github.com/dobraczka/eche/raw/main/docs/assets/logo.png" alt="eche logo" width="200"/>
</p>
<p align="center">
<a href="https://github.com/dobraczka/eche/actions/workflows/main.yml"><img alt="Actions Status" src="https://github.com/dobraczka/eche/actions/workflows/main.yml/badge.svg?branch=main"></a>
<a href='https://eche.readthedocs.io/en/latest/?badge=latest'><img src='https://readthedocs.org/projects/eche/badge/?version=latest' alt='Documentation Status' /></a>
<a href="https://pypi.org/project/eche"><img alt="Stable python versions" src="https://img.shields.io/pypi/pyversions/eche"></a>
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff" style="max-width:100%;"></a>
</p>
Usage
=====
Eche provides a `ClusterHelper` class to conveniently handle entity clusters.
```python
from eche import ClusterHelper
ch = ClusterHelper([{"a1", "b1"}, {"a2", "b2"}])
print(ch.clusters)
# {0: {'a1', 'b1'}, 1: {'a2', 'b2'}}
```
Add an element to a cluster
```python
ch.add_to_cluster(0, "c1")
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}}
```
Add a new cluster
```python
ch.add({"e2", "f1", "c3"})
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```
Remove an element from a cluster
```python
ch.remove("b1")
print(ch.clusters)
# {0: {'a1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```
The `__contains__` method accepts several kinds of keys, so the `in` operator works for entities, clusters, and links. You can check whether an entity is in the `ClusterHelper`:
```python
"a1" in ch
# True
```
You can also check whether a cluster is present:
```python
{"c1","a1"} in ch
# True
```
And you can even check whether a link between two entities exists:
```python
("f1","e2") in ch
# True
("a1","e2") in ch
# False
```
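To illustrate how a single `__contains__` can answer all three kinds of queries, here is a minimal sketch that dispatches on the type of the key. `TinyClusters` is a hypothetical stand-in written for this example, not eche's actual implementation, and it assumes that a link exists when both entities share a cluster.

```python
class TinyClusters:
    """Hypothetical toy class for illustration; not eche's ClusterHelper."""

    def __init__(self, clusters):
        # Map cluster id -> set of members
        self.clusters = {i: set(c) for i, c in enumerate(clusters)}
        # Reverse index: entity -> cluster id
        self.elements = {e: i for i, c in self.clusters.items() for e in c}

    def __contains__(self, key):
        if isinstance(key, (set, frozenset)):
            # Cluster query: is exactly this set one of the clusters?
            return any(key == c for c in self.clusters.values())
        if isinstance(key, tuple) and len(key) == 2:
            # Link query (assumption): both entities are known
            # and belong to the same cluster
            a, b = key
            return (
                a in self.elements
                and b in self.elements
                and self.elements[a] == self.elements[b]
            )
        # Entity query
        return key in self.elements


tc = TinyClusters([{"a1", "c1"}, {"a2", "b2"}, {"e2", "f1", "c3"}])
print("a1" in tc)          # True
print({"c1", "a1"} in tc)  # True
print(("f1", "e2") in tc)  # True
print(("a1", "e2") in tc)  # False
```

Dispatching on the key type keeps the call site uniform: the same `in` expression covers all three questions without separate method names.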
To find the cluster id of an entity, look it up in `elements`:
```python
print(ch.elements["a1"])
# 0
```
To get members of a cluster either use
```python
print(ch.members(0))
# {'a1', 'c1'}  ("b1" was removed above)
```
or simply
```python
print(ch[0])
# {'a1', 'c1'}
```
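Both lookup directions (entity to cluster id, and cluster id to members) can be served cheaply by keeping two dictionaries in sync. The sketch below shows that bookkeeping with plain dicts; the helper names `toy_add_to_cluster` and `toy_remove` are hypothetical, and this is an assumption about one reasonable design rather than a description of eche's internals.

```python
# Forward index: cluster id -> members
clusters = {0: {"a1", "b1"}, 1: {"a2", "b2"}}
# Reverse index: entity -> cluster id, derived from the forward index
elements = {e: cid for cid, members in clusters.items() for e in members}


def toy_add_to_cluster(cid, entity):
    """Add an entity to an existing cluster and update both indexes."""
    clusters[cid].add(entity)
    elements[entity] = cid


def toy_remove(entity):
    """Remove an entity from whichever cluster holds it."""
    cid = elements.pop(entity)
    clusters[cid].discard(entity)


toy_add_to_cluster(0, "c1")
toy_remove("b1")

print(elements["a1"])       # 0
print(sorted(clusters[0]))  # ['a1', 'c1']
```

Every mutation touches both dictionaries, which is what keeps the two views consistent.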
More functions can be found in the [Documentation](https://eche.readthedocs.io).
Installation
============
Simply use `pip` for installation:
```
pip install eche
```
Raw data
{
"_id": null,
"home_page": "https://github.com/dobraczka/eche",
"name": "eche",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.8",
"maintainer_email": null,
"keywords": "entity resolution, record linkage, clustering, connected components, transitive closure",
"author": "Daniel Obraczka",
"author_email": "obraczka@informatik.uni-leipzig.de",
"download_url": "https://files.pythonhosted.org/packages/02/f3/dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c/eche-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Little helper for handling entity clusters",
"version": "0.2.1",
"project_urls": {
"Bug Tracker": "https://github.com/dobraczka/eche/issues",
"Documentation": "https://eche.readthedocs.io",
"Homepage": "https://github.com/dobraczka/eche",
"Repository": "https://github.com/dobraczka/eche",
"Source": "https://github.com/dobraczka/eche"
},
"split_keywords": [
"entity resolution",
" record linkage",
" clustering",
" connected components",
" transitive closure"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e4a49f497a6cec96f97e5dffcc0929c9d8dadc50539298f923a2b64dfe5fec2e",
"md5": "d21e99c7956c8ec9f08a32086c5c4e7b",
"sha256": "277c1fe5dbe40ac92d5eca00917bcb278ccb987f7bf044df885da41779330ad7"
},
"downloads": -1,
"filename": "eche-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d21e99c7956c8ec9f08a32086c5c4e7b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.8",
"size": 10283,
"upload_time": "2024-03-22T14:22:56",
"upload_time_iso_8601": "2024-03-22T14:22:56.709218Z",
"url": "https://files.pythonhosted.org/packages/e4/a4/9f497a6cec96f97e5dffcc0929c9d8dadc50539298f923a2b64dfe5fec2e/eche-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "02f3dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c",
"md5": "73d407fe1b6a16db9d6d619234a69660",
"sha256": "27f0d6fc2bfe3f27361bd1205b2b88d92b30acf99032eef00449a8ab0de348b0"
},
"downloads": -1,
"filename": "eche-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "73d407fe1b6a16db9d6d619234a69660",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.8",
"size": 11601,
"upload_time": "2024-03-22T14:22:58",
"upload_time_iso_8601": "2024-03-22T14:22:58.687554Z",
"url": "https://files.pythonhosted.org/packages/02/f3/dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c/eche-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-22 14:22:58",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dobraczka",
"github_project": "eche",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "eche"
}