eche


| Field | Value |
| --- | --- |
| Name | eche |
| Version | 0.2.1 |
| Home page | https://github.com/dobraczka/eche |
| Summary | Little helper for handling entity clusters |
| Upload time | 2024-03-22 14:22:58 |
| Maintainer | None |
| Docs URL | None |
| Author | Daniel Obraczka |
| Requires Python | `<4.0,>=3.8` |
| License | MIT |
| Keywords | entity resolution, record linkage, clustering, connected components, transitive closure |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

<p align="center">
<img src="https://github.com/dobraczka/eche/raw/main/docs/assets/logo.png" alt="eche logo" width="200"/>
</p>

<p align="center">
<a href="https://github.com/dobraczka/eche/actions/workflows/main.yml"><img alt="Actions Status" src="https://github.com/dobraczka/eche/actions/workflows/main.yml/badge.svg?branch=main"></a>
<a href='https://eche.readthedocs.io/en/latest/?badge=latest'><img src='https://readthedocs.org/projects/eche/badge/?version=latest' alt='Documentation Status' /></a>
<a href="https://pypi.org/project/eche"><img alt="Stable python versions" src="https://img.shields.io/pypi/pyversions/eche"></a>
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff" style="max-width:100%;"></a>
</p>

Usage
=====
Eche provides a `ClusterHelper` class to conveniently handle entity clusters.

```python
from eche import ClusterHelper

ch = ClusterHelper([{"a1", "b1"}, {"a2", "b2"}])
print(ch.clusters)
# {0: {'a1', 'b1'}, 1: {'a2', 'b2'}}
```

Add an element to a cluster:

```python
ch.add_to_cluster(0, "c1")
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}}
```

Add a new cluster:

```python
ch.add({"e2", "f1", "c3"})
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

Remove an element from a cluster:

```python
ch.remove("b1")
print(ch.clusters)
# {0: {'a1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

The `__contains__` method is overloaded, so the `in` operator answers several kinds of questions. You can check whether an entity is in the `ClusterHelper`:

```python
print("a1" in ch)
# True
```

Whether a cluster is present:

```python
print({"c1", "a1"} in ch)
# True
```

And even whether a link between two entities exists:

```python
print(("f1", "e2") in ch)
# True
print(("a1", "e2") in ch)
# False
```
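How a single `in` operator can answer three different questions is easiest to see in code. The following is a minimal sketch, not eche's actual implementation: `ContainsSketch` is a hypothetical stand-in class that dispatches on the type of the queried object. It assumes that sets are cluster queries (exact match only), 2-tuples are link queries, and anything else is an entity query.

```python
class ContainsSketch:
    """Hypothetical stand-in for illustration; not eche's ClusterHelper."""

    def __init__(self, clusters):
        # Map cluster id -> set of entities, ids assigned in input order.
        self.clusters = {i: set(c) for i, c in enumerate(clusters)}

    def __contains__(self, item):
        if isinstance(item, (set, frozenset)):
            # A set is treated as a cluster query (assumption: exact match).
            return any(item == members for members in self.clusters.values())
        if isinstance(item, tuple) and len(item) == 2:
            # A 2-tuple is treated as a link: both ends share one cluster.
            left, right = item
            return any(
                left in members and right in members
                for members in self.clusters.values()
            )
        # Anything else is treated as a single entity.
        return any(item in members for members in self.clusters.values())


sketch = ContainsSketch([{"a1", "c1"}, {"a2", "b2"}, {"e2", "f1", "c3"}])
print("a1" in sketch)          # entity query -> True
print({"c1", "a1"} in sketch)  # cluster query -> True
print(("f1", "e2") in sketch)  # link inside one cluster -> True
print(("a1", "e2") in sketch)  # link across clusters -> False
```

Dispatching on the argument type keeps the call site uniform at the cost of some ambiguity: in this sketch a tuple can never be an entity, so entities would have to be strings or other non-tuple hashables.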

To find the cluster id of an entity, look it up with:

```python
print(ch.elements["a1"])
# 0
```

To get the members of a cluster, either use:

```python
print(ch.members(0))
# {'a1', 'c1'}  ("b1" was removed above)
```

or simply:

```python
print(ch[0])
# {'a1', 'c1'}
```
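Looking up an entity's cluster id and listing a cluster's members can both be cheap when two mappings are kept in sync. The sketch below illustrates that idea with a hypothetical `TwoIndexSketch` class. It is an assumption about one reasonable design, not a description of how eche is implemented, and its method names merely mirror the calls shown above.

```python
class TwoIndexSketch:
    """Hypothetical cluster store with a reverse index (illustration only)."""

    def __init__(self, clusters=()):
        self.clusters = {}   # cluster id -> set of entities
        self.elements = {}   # entity -> cluster id (reverse index)
        self._next_id = 0
        for cluster in clusters:
            self.add(cluster)

    def add(self, cluster):
        """Store a new cluster and return its id."""
        cluster = set(cluster)
        overlap = self.elements.keys() & cluster
        if overlap:
            raise ValueError(f"already clustered: {sorted(overlap)}")
        cluster_id = self._next_id
        self._next_id += 1
        self.clusters[cluster_id] = cluster
        for entity in cluster:
            self.elements[entity] = cluster_id
        return cluster_id

    def add_to_cluster(self, cluster_id, entity):
        """Add one entity to an existing cluster, updating both mappings."""
        if entity in self.elements:
            raise ValueError(f"already clustered: {entity!r}")
        self.clusters[cluster_id].add(entity)
        self.elements[entity] = cluster_id

    def remove(self, entity):
        """Drop an entity from its cluster and from the reverse index."""
        cluster_id = self.elements.pop(entity)
        self.clusters[cluster_id].discard(entity)

    def members(self, cluster_id):
        return self.clusters[cluster_id]

    # Indexing with a cluster id is an alias for members().
    __getitem__ = members


store = TwoIndexSketch([{"a1", "b1"}, {"a2", "b2"}])
store.add_to_cluster(0, "c1")
store.remove("b1")
print(store.elements["a1"])  # 0
print(sorted(store[0]))      # ['a1', 'c1']
```

Every mutation touches both dictionaries, which is the price of constant-time lookups in either direction.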

More functions can be found in the [Documentation](https://eche.readthedocs.io).

Installation
============
Install eche from PyPI with `pip`:
```
pip install eche
```

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/dobraczka/eche",
    "name": "eche",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "entity resolution, record linkage, clustering, connected components, transitive closure",
    "author": "Daniel Obraczka",
    "author_email": "obraczka@informatik.uni-leipzig.de",
    "download_url": "https://files.pythonhosted.org/packages/02/f3/dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c/eche-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Little helper for handling entity clusters",
    "version": "0.2.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/dobraczka/eche/issues",
        "Documentation": "https://eche.readthedocs.io",
        "Homepage": "https://github.com/dobraczka/eche",
        "Repository": "https://github.com/dobraczka/eche",
        "Source": "https://github.com/dobraczka/eche"
    },
    "split_keywords": [
        "entity resolution",
        " record linkage",
        " clustering",
        " connected components",
        " transitive closure"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e4a49f497a6cec96f97e5dffcc0929c9d8dadc50539298f923a2b64dfe5fec2e",
                "md5": "d21e99c7956c8ec9f08a32086c5c4e7b",
                "sha256": "277c1fe5dbe40ac92d5eca00917bcb278ccb987f7bf044df885da41779330ad7"
            },
            "downloads": -1,
            "filename": "eche-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d21e99c7956c8ec9f08a32086c5c4e7b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 10283,
            "upload_time": "2024-03-22T14:22:56",
            "upload_time_iso_8601": "2024-03-22T14:22:56.709218Z",
            "url": "https://files.pythonhosted.org/packages/e4/a4/9f497a6cec96f97e5dffcc0929c9d8dadc50539298f923a2b64dfe5fec2e/eche-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "02f3dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c",
                "md5": "73d407fe1b6a16db9d6d619234a69660",
                "sha256": "27f0d6fc2bfe3f27361bd1205b2b88d92b30acf99032eef00449a8ab0de348b0"
            },
            "downloads": -1,
            "filename": "eche-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "73d407fe1b6a16db9d6d619234a69660",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 11601,
            "upload_time": "2024-03-22T14:22:58",
            "upload_time_iso_8601": "2024-03-22T14:22:58.687554Z",
            "url": "https://files.pythonhosted.org/packages/02/f3/dacb63e2d2054a22dbd403912eafb8c49f98dabf8be3e3c0b8ca694f5c3c/eche-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-22 14:22:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dobraczka",
    "github_project": "eche",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "eche"
}
        