# wikidataloader
An easy, Pythonic wrapper around the [Wikidata SPARQL API](https://query.wikidata.org/) for quickly creating datasets from Wikidata.
It only supports simple, non-recursive queries. For complex queries, please use the [SPARQL API](https://query.wikidata.org/) provided by Wikidata directly.
It does not support complex operators (ordering, datetime conversion, string/numeric filtering, etc.), because these can be replaced by processing the dataset in Python after retrieval.
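As a rough illustration of that kind of post-retrieval processing, the sketch below applies datetime conversion, filtering, and ordering with pandas. The dataframe is a hand-built stand-in for a retrieved result; its column names and values are invented for the example and do not come from Wikidata.

```python
import pandas as pd

# Hypothetical stand-in for a retrieved result; names and values are made up.
df = pd.DataFrame(
    {
        "item": ["Person A", "Person B", "Person C"],
        "Date of Birth": [
            "1901-05-17T00:00:00Z",
            "1850-01-02T00:00:00Z",
            "1975-11-30T00:00:00Z",
        ],
        "City of Birth": ["Leipzig", "Berlin", "Bern"],
    }
)

# Datetime conversion: parse the ISO 8601 strings into timestamps.
df["Date of Birth"] = pd.to_datetime(df["Date of Birth"], utc=True)

# Numeric filtering: keep rows with a birth year of 1900 or later.
born_after_1900 = df[df["Date of Birth"].dt.year >= 1900]

# String filtering: keep rows whose birth place starts with "Ber".
ber_cities = df[df["City of Birth"].str.startswith("Ber")]

# Ordering: sort by birth date, oldest first.
ordered = df.sort_values("Date of Birth").reset_index(drop=True)
```

Each step works on an ordinary dataframe, so the steps can be combined or reordered as needed.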
## Usage
Look up the identifiers for properties (e.g. _P31_) and objects (e.g. _Q5_) on [Wikidata's search engine](https://www.wikidata.org/).
```python
from wikidataloader import WikidataQuery

# Linguists from Germany with birth places and gender
results = WikidataQuery.search(
    # {instance of: human, country of citizenship: Germany, occupation: linguist}
    filters={"P31": "Q5", "P27": "Q183", "P106": "Q14467526"},
    # Selects the properties P21 (gender) and P19 (place of birth) as columns
    # in the dataframe and names them "Gender" and "City of Birth"
    select=[("P21", "Gender"), ("P19", "City of Birth")],
    # Returns a maximum of 5 results
    limit=5,
    # Retrieves labels in English, if available
    default_language="en",
).to_pandas()

results

# Output:
#                  item Gender        City_of_Birth
# 0      Hermann Weller   male     Schwäbisch Gmünd
# 1           Hans Wehr   male              Leipzig
# 2     Theodor Haecker   male            Mulfingen
# 3  Gottfried Bernhardy  male  Gorzów Wielkopolski
# 4  Wilhelm Streitberg   male   Rüdesheim am Rhein
```
For more examples, see [example.ipynb](./example.ipynb).
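To give a feel for what a call with `filters`, `select`, and `limit` expresses in SPARQL, here is a minimal sketch that assembles a comparable query string by hand. The function `build_sparql`, its variable naming, and the use of `OPTIONAL` for selected properties are assumptions made for illustration; this is not necessarily how wikidataloader builds its queries internally.

```python
def build_sparql(filters, select, limit=None, language="en"):
    """Build a simple SPARQL query string (illustrative helper, not part of wikidataloader)."""
    variables = ["?item", "?itemLabel"]
    patterns = []

    # Each filter becomes a required triple pattern, e.g. ?item wdt:P31 wd:Q5 .
    for prop, value in filters.items():
        patterns.append(f"?item wdt:{prop} wd:{value} .")

    # Assumption: selected properties are optional, so items lacking
    # a property are still returned.
    for index, (prop, _name) in enumerate(select):
        var = f"?v{index}"
        variables += [var, f"{var}Label"]
        patterns.append(f"OPTIONAL {{ ?item wdt:{prop} {var} . }}")

    # The Wikidata label service fills the ?...Label variables.
    patterns.append(
        f'SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}'
    )

    body = "\n  ".join(patterns)
    query = f"SELECT {' '.join(variables)} WHERE {{\n  {body}\n}}"
    if limit is not None:
        query += f"\nLIMIT {limit}"
    return query


query = build_sparql(
    filters={"P31": "Q5", "P27": "Q183", "P106": "Q14467526"},
    select=[("P21", "Gender"), ("P19", "City of Birth")],
    limit=5,
)
print(query)
```

The resulting string can be pasted into the Wikidata query service to compare against what the wrapper returns.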
## Install
Install using pip (requires Python 3.9 or later):
```shell
pip install wikidataloader
```
## Limitations
- Does not support recursive queries
- Does not support Senses and Forms for Lexeme queries