# LODlit
## Simplifying retrieval of literals from Linked Open Data
Different LOD datasets are available online in different formats and with different levels of user-friendliness.
LODlit lets you search over multiple linked open datasets in one place using keywords and returns the search results in the same JSON structure, convenient for further processing.
For example, LODlit retrieves labels, aliases, and descriptions of Wikidata entities by search terms in a specific language, with optional search filtering. It is also possible to get literals in different languages by entity identifiers.
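To give a sense of the kind of request LODlit wraps for Wikidata keyword search, here is a minimal sketch against Wikidata's public `wbsearchentities` endpoint. The helper name `wikidata_search_url` is our own illustration, not part of the LODlit API:

```python
from urllib.parse import urlencode

def wikidata_search_url(term: str, lang: str = "en", limit: int = 5) -> str:
    """Build a query URL for Wikidata's public entity-search endpoint."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": lang,
        "format": "json",
        "limit": limit,
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)

print(wikidata_search_url("sunflower", lang="nl"))
# → https://www.wikidata.org/w/api.php?action=wbsearchentities&search=sunflower&language=nl&format=json&limit=5
```

Fetching that URL returns matching entities with their labels, descriptions, and aliases; LODlit normalizes such responses into one JSON structure across datasets.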
Additionally, LODlit can build bags of words from literals for natural language processing, for example, to calculate cosine similarity between literals.
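The bag-of-words comparison can be sketched in plain Python; the tokenization here (lowercase, whitespace split) is a simplifying assumption, not LODlit's exact preprocessing:

```python
from collections import Counter
from math import sqrt

def bag_of_words(text: str) -> Counter:
    """Lowercase the literal, split on whitespace, count tokens."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Two descriptions of the same concept share most of their tokens:
label_a = bag_of_words("sunflower plant of the genus Helianthus")
label_b = bag_of_words("plant of the daisy family Helianthus")
print(round(cosine_similarity(label_a, label_b), 3))  # → 0.667
```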
Currently, LODlit supports parsing of [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), [Getty Art & Architecture Thesaurus (AAT)](https://www.getty.edu/research/tools/vocabularies/aat/), [Princeton WordNet (3.1)](https://wordnet.princeton.edu/), and [Open Dutch WordNet (1.3)](https://github.com/cultural-ai/OpenDutchWordnet).
### Installation
```
pip install LODlit
```
LODlit is available on [PyPI](https://pypi.org/project/LODlit/).
* To parse Princeton WordNet 3.1: after installing NLTK, download the [wordnet31](https://github.com/nltk/nltk_data/blob/gh-pages/packages/corpora/wordnet31.zip) corpus and put the contents of "wordnet31" into "wordnet" in "nltk_data/corpora". It is not possible to import wordnet31 from nltk.corpus directly; see the explanation on [the WordNet website](https://wordnet.princeton.edu/download/current-version) (retrieved on 10.02.2023): "WordNet 3.1 DATABASE FILES ONLY. You can download the WordNet 3.1 database files. Note that this is not a full package as those above, nor does it contain any code for running WordNet. However, you can replace the files in the database directory of your 3.0 local installation with these files and the WordNet interface will run, returning entries from the 3.1 database. This is simply a compressed tar file of the WordNet 3.1 database files." Use `pwn31.check_version()` to make sure that WordNet 3.1 is imported.
* To parse Open Dutch WordNet: download the ODWN from [https://github.com/cultural-ai/OpenDutchWordnet](https://github.com/cultural-ai/OpenDutchWordnet).
### License
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/).