<p align="center">
<img src="https://github.com/dobraczka/embarrassment/raw/main/docs/logo.png" alt="embarrassment logo" width="200"/>
</p>
<p align="center">
<a href="https://github.com/dobraczka/embarrassment/actions/workflows/main.yml"><img alt="Actions Status" src="https://github.com/dobraczka/embarrassment/actions/workflows/main.yml/badge.svg?branch=main"></a>
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
</p>
Convenience functions for pandas dataframes containing triples. Fun fact: a group of pandas (e.g. three) is commonly referred to as an [embarrassment](https://www.zmescience.com/feature-post/what-is-a-group-of-pandas-called-its-surprisingly-complicated/).
This library focuses on making commonly used functions readily available when exploring [triples](https://en.wikipedia.org/wiki/Semantic_triple) stored in pandas dataframes. It is not meant to be an efficient graph analysis library.
Usage
=====
The library offers a variety of convenience functions. To try them out, first create some simple example triples:
```python
>>> import pandas as pd
>>> rel = pd.DataFrame([("e1","rel1","e2"), ("e3", "rel2", "e1")], columns=["head","relation","tail"])
>>> attr = pd.DataFrame([("e1","attr1","lorem ipsum"), ("e2","attr2","dolor")], columns=["head","relation","tail"])
```
Search in attribute triples:
```python
>>> from embarrassment import search
>>> search(attr, "lorem ipsum")
head relation tail
0 e1 attr1 lorem ipsum
>>> search(attr, "lorem", method="substring")
head relation tail
0 e1 attr1 lorem ipsum
```
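For orientation, the two calls above correspond roughly to the following plain-pandas filters. This is only a sketch of the idea on the toy `attr` frame, not the library's implementation: the helper names `exact_match` and `substring_match` are hypothetical, and it assumes that only the `tail` column is searched.

```python
import pandas as pd

attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def exact_match(df: pd.DataFrame, query: str) -> pd.DataFrame:
    """Keep rows whose tail value equals the query (hypothetical helper)."""
    return df[df["tail"] == query]


def substring_match(df: pd.DataFrame, query: str) -> pd.DataFrame:
    """Keep rows whose tail value contains the query (hypothetical helper)."""
    # regex=False treats the query as literal text, not a regular expression
    return df[df["tail"].astype(str).str.contains(query, regex=False)]


exact = exact_match(attr, "lorem ipsum")
partial = substring_match(attr, "lorem")
print(exact)
print(partial)
```

Both filters return the `e1` row here, while `exact_match(attr, "lorem")` would return an empty frame because no tail equals `"lorem"` exactly.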
Select triples with a specific relation:
```python
>>> from embarrassment import select_rel
>>> select_rel(rel, "rel1")
head relation tail
0 e1 rel1 e2
```
Perform operations on the immediate neighbor(s) of an entity, e.g. get the attribute triples:
```python
>>> from embarrassment import neighbor_attr_triples
>>> neighbor_attr_triples(rel, attr, "e1")
head relation tail
1 e2 attr2 dolor
```
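Conceptually, this is a two-step lookup: collect the entities directly linked to `e1` in the relation triples, then keep the attribute triples whose head is one of those entities. The sketch below reproduces that idea in plain pandas under this assumption; `neighbor_attributes` is a hypothetical helper, not the library's code.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)
attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def neighbor_attributes(
    rel_df: pd.DataFrame, attr_df: pd.DataFrame, entity: str
) -> pd.DataFrame:
    """Attribute triples of all direct neighbors of `entity` (hypothetical helper)."""
    # neighbors reached via out-links (entity is the head of the triple)
    out_neighbors = rel_df.loc[rel_df["head"] == entity, "tail"]
    # neighbors reached via in-links (entity is the tail of the triple)
    in_neighbors = rel_df.loc[rel_df["tail"] == entity, "head"]
    neighbors = set(out_neighbors) | set(in_neighbors)
    return attr_df[attr_df["head"].isin(list(neighbors))]


result = neighbor_attributes(rel, attr, "e1")
print(result)
```

On the toy data the neighbors of `e1` are `e2` and `e3`, and only `e2` has an attribute triple, so a single row is returned.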
You can also get just the relation triples of the neighborhood:
```python
>>> from embarrassment import neighbor_rel_triples
>>> neighbor_rel_triples(rel, "e1")
head relation tail
1 e3 rel2 e1
0 e1 rel1 e2
```
By default you get in- and out-links, but you can specify a direction:
```python
>>> neighbor_rel_triples(rel, "e1", in_out_both="in")
head relation tail
1 e3 rel2 e1
>>> neighbor_rel_triples(rel, "e1", in_out_both="out")
head relation tail
0 e1 rel1 e2
```
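The direction argument can be thought of as choosing between two boolean masks over the relation triples. The following is a minimal plain-pandas sketch of that idea, not the library's implementation; the function `neighbor_triples` and its `direction` parameter are hypothetical names chosen for illustration.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)


def neighbor_triples(
    df: pd.DataFrame, entity: str, direction: str = "both"
) -> pd.DataFrame:
    """Relation triples touching `entity` (hypothetical helper)."""
    incoming = df["tail"] == entity  # in-links: entity is the tail
    outgoing = df["head"] == entity  # out-links: entity is the head
    if direction == "in":
        mask = incoming
    elif direction == "out":
        mask = outgoing
    elif direction == "both":
        mask = incoming | outgoing
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return df[mask]


in_links = neighbor_triples(rel, "e1", direction="in")
out_links = neighbor_triples(rel, "e1", direction="out")
both = neighbor_triples(rel, "e1")
print(both)
```

Unlike the library output shown above, this sketch keeps the rows in their original index order.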
Using pandas' [pipe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pipe.html) method you can chain operations.
Let's see a more elaborate example by loading a dataset from [sylloge](https://github.com/dobraczka/sylloge):
```python
>>> from sylloge import MovieGraphBenchmark
>>> from embarrassment import clean, neighbor_attr_triples, search, select_rel
>>> ds = MovieGraphBenchmark()
>>> # clean attribute triples
>>> cleaned_attr = clean(ds.attr_triples_left)
>>> # find uri of James Tolkan
>>> jt = search(cleaned_attr, query="James Tolkan")["head"].iloc[0]
>>> # get neighbor triples
>>> # and select triples with title and show values
>>> title_rel = "https://www.scads.de/movieBenchmark/ontology/title"
>>> neighbor_attr_triples(ds.rel_triples_left, cleaned_attr, jt).pipe(
...     select_rel, rel=title_rel
... )["tail"]
12234 A Nero Wolfe Mystery
12282 Door to Death
12440 Die Like a Dog
12461 The Next Witness
Name: tail, dtype: object
```
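If you do not have the dataset at hand, the chaining pattern itself can be tried on a small frame. The sketch below uses made-up toy triples and two hypothetical helper functions (`keep_heads`, `keep_relation`) that are not part of `embarrassment`; it only illustrates how `DataFrame.pipe` passes the frame as the first argument to each function.

```python
import pandas as pd

# made-up toy triples, for illustration only
triples = pd.DataFrame(
    [
        ("m1", "title", "Movie A"),
        ("m1", "genre", "drama"),
        ("m2", "title", "Movie B"),
        ("m3", "title", "Movie C"),
    ],
    columns=["head", "relation", "tail"],
)


def keep_heads(df: pd.DataFrame, heads: list) -> pd.DataFrame:
    """Keep triples whose head is in `heads` (hypothetical helper)."""
    return df[df["head"].isin(heads)]


def keep_relation(df: pd.DataFrame, rel: str) -> pd.DataFrame:
    """Keep triples with the given relation (hypothetical helper)."""
    return df[df["relation"] == rel]


# each pipe call hands the current frame to the next function
titles = (
    triples.pipe(keep_heads, heads=["m1", "m2"])
    .pipe(keep_relation, rel="title")["tail"]
)
print(titles.tolist())
```

Because every step takes a dataframe and returns a dataframe, steps can be reordered or dropped without changing the rest of the chain.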
Installation
============
You can install `embarrassment` via pip:
```
pip install embarrassment
```
Raw data
{
"_id": null,
"home_page": "https://github.com/dobraczka/embarrassment",
"name": "embarrassment",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8,<4.0",
"maintainer_email": "",
"keywords": "pandas,rdf,knowledge graph",
"author": "Daniel Obraczka",
"author_email": "obraczka@informatik.uni-leipzig.de",
"download_url": "https://files.pythonhosted.org/packages/91/43/53a032549b9867f168c7a1f8810788bbb9dfcce440ff3e891e83bad73b3e/embarrassment-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Convenience functions to work with pandas triple dataframes \ud83d\udc3c\ud83d\udc3c\ud83d\udc3c",
"version": "0.1.0",
"project_urls": {
"Bug Tracker": "https://github.com/dobraczka/embarrassment/issues",
"Documentation": "https://embarrassment.readthedocs.io",
"Homepage": "https://github.com/dobraczka/embarrassment",
"Repository": "https://github.com/dobraczka/embarrassment",
"Source": "https://github.com/dobraczka/embarrassment"
},
"split_keywords": [
"pandas",
"rdf",
"knowledge graph"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5050c05f1f1c465a7092d2976321355574608af88a39f030409cd8f8dc240cd9",
"md5": "67b9ef186970fa7ebc03a07db7527e14",
"sha256": "13ededfcd4d4da50ff7f0a4c9d0851b9f26b6b90282b0b02d66fe4289fd0f08b"
},
"downloads": -1,
"filename": "embarrassment-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "67b9ef186970fa7ebc03a07db7527e14",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8,<4.0",
"size": 5898,
"upload_time": "2024-02-12T15:36:24",
"upload_time_iso_8601": "2024-02-12T15:36:24.719991Z",
"url": "https://files.pythonhosted.org/packages/50/50/c05f1f1c465a7092d2976321355574608af88a39f030409cd8f8dc240cd9/embarrassment-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "914353a032549b9867f168c7a1f8810788bbb9dfcce440ff3e891e83bad73b3e",
"md5": "86d6634138208787e8904ec5436e5c84",
"sha256": "6942f354c6a5f627b3f8444832e171df78ee8385a2daab84b15e824791631dc0"
},
"downloads": -1,
"filename": "embarrassment-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "86d6634138208787e8904ec5436e5c84",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8,<4.0",
"size": 6962,
"upload_time": "2024-02-12T15:36:26",
"upload_time_iso_8601": "2024-02-12T15:36:26.725354Z",
"url": "https://files.pythonhosted.org/packages/91/43/53a032549b9867f168c7a1f8810788bbb9dfcce440ff3e891e83bad73b3e/embarrassment-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-12 15:36:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dobraczka",
"github_project": "embarrassment",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "embarrassment"
}