kgdata


| Field | Value |
| --- | --- |
| Name | kgdata |
| Version | 7.0.4 |
| home_page | https://github.com/binh-vu/kgdata |
| Summary | Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata) |
| upload_time | 2024-05-11 07:38:33 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | None |
| keywords | knowledge-graph, wikidata, wikipedia, dbpedia |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# kgdata ![PyPI](https://img.shields.io/pypi/v/kgdata) ![Documentation](https://readthedocs.org/projects/kgdata/badge/?version=latest&style=flat)

KGData is a library for processing dumps of Wikipedia and Wikidata. It can:

- Clean up the dumps to ensure the data is consistent (resolve redirects, remove dangling references).
- Create embedded key-value databases to access entities from the dumps.
- Extract Wikidata ontology.
- Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.
- Create Pyserini indices to search Wikidata’s entities.
- and more

For full documentation, please see [the website](https://kgdata.readthedocs.io/).

## Installation

From PyPI (using pre-built binaries):

```bash
pip install kgdata[spark]   # omit the [spark] extra to install Spark yourself if your cluster runs a different version
```
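
After installing, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of kgdata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("kgdata")
    print("kgdata is not installed" if found is None else f"kgdata {found}")
```

Returning `None` instead of raising keeps the check usable in setup scripts that decide whether to run `pip install` at all.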


            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/binh-vu/kgdata",
    "name": "kgdata",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "knowledge-graph, wikidata, wikipedia, dbpedia",
    "author": null,
    "author_email": "Binh Vu <binh@toan2.com>",
    "download_url": "https://files.pythonhosted.org/packages/80/c8/b64411a2bc1bd4b7cb6801badd40e0d1f2fdc461c786a3fb0480cb34ef2a/kgdata-7.0.4.tar.gz",
    "platform": null,
    "description": "# kgdata ![PyPI](https://img.shields.io/pypi/v/kgdata) ![Documentation](https://readthedocs.org/projects/kgdata/badge/?version=latest&style=flat)\n\nKGData is a library to process dumps of Wikipedia, Wikidata. What it can do:\n\n- Clean up the dumps to ensure the data is consistent (resolve redirect, remove dangling references)\n- Create embedded key-value databases to access entities from the dumps.\n- Extract Wikidata ontology.\n- Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.\n- Create Pyserini indices to search Wikidata\u2019s entities.\n- and more\n\nFor a full documentation, please see [the website](https://kgdata.readthedocs.io/).\n\n## Installation\n\nFrom PyPI (using pre-built binaries):\n\n```bash\npip install kgdata[spark]   # omit spark to manually specify its version if your cluster has different version\n```\n\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata)",
    "version": "7.0.4",
    "project_urls": {
        "Homepage": "https://github.com/binh-vu/kgdata",
        "repository": "https://github.com/binh-vu/kgdata"
    },
    "split_keywords": [
        "knowledge-graph",
        " wikidata",
        " wikipedia",
        " dbpedia"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c139e53cc4f5c43879a7513fe110c5383d5abd597d3d446f1f16789e17dd6fb6",
                "md5": "3f4ef5ac0d23f13ea253d44cb0678e9e",
                "sha256": "26155cd09d9fb89459cfbc68d0ebe61689ae58dcaf37f708b55465b1af4dc223"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp310-cp310-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "has_sig": false,
            "md5_digest": "3f4ef5ac0d23f13ea253d44cb0678e9e",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 5467979,
            "upload_time": "2024-05-11T07:37:57",
            "upload_time_iso_8601": "2024-05-11T07:37:57.072501Z",
            "url": "https://files.pythonhosted.org/packages/c1/39/e53cc4f5c43879a7513fe110c5383d5abd597d3d446f1f16789e17dd6fb6/kgdata-7.0.4-cp310-cp310-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "592e0385f2cfb33dcc4d718d0334fbec17a560c752e08f91b33a462e48c7ec6b",
                "md5": "348cde176b575f8d95839810740ba59d",
                "sha256": "d346320f23670b7cb18503076124a68d07be360042579ff36abaffe622fe8c9f"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "348cde176b575f8d95839810740ba59d",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 4011136,
            "upload_time": "2024-05-11T07:37:59",
            "upload_time_iso_8601": "2024-05-11T07:37:59.687262Z",
            "url": "https://files.pythonhosted.org/packages/59/2e/0385f2cfb33dcc4d718d0334fbec17a560c752e08f91b33a462e48c7ec6b/kgdata-7.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "de28acd3d0bf61f44e8697456198a862ec1f90cd0687d18cba6f0ba8a1436255",
                "md5": "c4ee051784dbcec70b93ccfc23ec4d1f",
                "sha256": "a3240b8b08d8dd9c89c2ea9f670ddbe3e5b9ee43e4fb5aba41e8682aadbf6141"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp310-cp310-manylinux_2_35_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c4ee051784dbcec70b93ccfc23ec4d1f",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 3280756,
            "upload_time": "2024-05-11T07:38:02",
            "upload_time_iso_8601": "2024-05-11T07:38:02.256735Z",
            "url": "https://files.pythonhosted.org/packages/de/28/acd3d0bf61f44e8697456198a862ec1f90cd0687d18cba6f0ba8a1436255/kgdata-7.0.4-cp310-cp310-manylinux_2_35_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c829304cf7fa3249da664627fb6c75ae7375c49302c44ca8de5757108dc8ac07",
                "md5": "3a0c76e04b6bb46ff2caa3c2484128ea",
                "sha256": "0d6dc8ffec6d4197141e25ed81ad32849a09ceb1450206848f264baac940cf5d"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp310-none-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "3a0c76e04b6bb46ff2caa3c2484128ea",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 2250409,
            "upload_time": "2024-05-11T07:38:04",
            "upload_time_iso_8601": "2024-05-11T07:38:04.402144Z",
            "url": "https://files.pythonhosted.org/packages/c8/29/304cf7fa3249da664627fb6c75ae7375c49302c44ca8de5757108dc8ac07/kgdata-7.0.4-cp310-none-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4a9cb4ac35bc95f5e4e0431476a0c2eaf63ebf6a481e7245ddedbb746de7f1fd",
                "md5": "df6e95cd54376179dfc067463e573e45",
                "sha256": "7ca4b6ac228b2ed1a3f0e51a8fde91bf7741377f81a9f446158d7a27f85fee49"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp311-cp311-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "has_sig": false,
            "md5_digest": "df6e95cd54376179dfc067463e573e45",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.10",
            "size": 5467971,
            "upload_time": "2024-05-11T07:38:06",
            "upload_time_iso_8601": "2024-05-11T07:38:06.346931Z",
            "url": "https://files.pythonhosted.org/packages/4a/9c/b4ac35bc95f5e4e0431476a0c2eaf63ebf6a481e7245ddedbb746de7f1fd/kgdata-7.0.4-cp311-cp311-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "170afbf91adba83fe4bf0385260d1ac4cac8edf3ad7e3cac4779be3743b25ed1",
                "md5": "5b4475028cda9d97017e1d212158e376",
                "sha256": "35aa4ce4eaf02c7ecb4a88166f2b197026d99ac88bd2380f5373f6452dbaa74d"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "5b4475028cda9d97017e1d212158e376",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.10",
            "size": 4011086,
            "upload_time": "2024-05-11T07:38:08",
            "upload_time_iso_8601": "2024-05-11T07:38:08.673941Z",
            "url": "https://files.pythonhosted.org/packages/17/0a/fbf91adba83fe4bf0385260d1ac4cac8edf3ad7e3cac4779be3743b25ed1/kgdata-7.0.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c3e1bf70d9fa5f45232c6df98da4606dc6bd825b5319acbc628b2ef0769ac688",
                "md5": "9bced9b9582d357bc7c8e9f2433d1372",
                "sha256": "733e184e6d4d2bca07923dea0c1ebe7a4f5bf661fd7a98588236b41730ac41e6"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp311-cp311-manylinux_2_35_x86_64.whl",
            "has_sig": false,
            "md5_digest": "9bced9b9582d357bc7c8e9f2433d1372",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.10",
            "size": 3280695,
            "upload_time": "2024-05-11T07:38:10",
            "upload_time_iso_8601": "2024-05-11T07:38:10.889833Z",
            "url": "https://files.pythonhosted.org/packages/c3/e1/bf70d9fa5f45232c6df98da4606dc6bd825b5319acbc628b2ef0769ac688/kgdata-7.0.4-cp311-cp311-manylinux_2_35_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ecddd8d65e5a6646775a34e4b82c7767a88c77d131d36de766b6f6bc0ddb3953",
                "md5": "27c2b4f7f215bf696619398898382712",
                "sha256": "29981188670e8d964f4a58b22ac5c25647401c227bdb30fff0ef1b3120e14ce2"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp311-none-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "27c2b4f7f215bf696619398898382712",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.10",
            "size": 2250404,
            "upload_time": "2024-05-11T07:38:13",
            "upload_time_iso_8601": "2024-05-11T07:38:13.192910Z",
            "url": "https://files.pythonhosted.org/packages/ec/dd/d8d65e5a6646775a34e4b82c7767a88c77d131d36de766b6f6bc0ddb3953/kgdata-7.0.4-cp311-none-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "eeefc525b57214eaf35f84ebbb537c45aac0fddabe92b0c1e9a155ba4cfec51d",
                "md5": "f9def8e89ce99352b01858618bb76185",
                "sha256": "53929186d9420d2637805ded26271461232e1ad663d285fad2ce4a8ad0fb91a5"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp312-cp312-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "has_sig": false,
            "md5_digest": "f9def8e89ce99352b01858618bb76185",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.10",
            "size": 5467905,
            "upload_time": "2024-05-11T07:38:15",
            "upload_time_iso_8601": "2024-05-11T07:38:15.424284Z",
            "url": "https://files.pythonhosted.org/packages/ee/ef/c525b57214eaf35f84ebbb537c45aac0fddabe92b0c1e9a155ba4cfec51d/kgdata-7.0.4-cp312-cp312-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a92de4d3b7b92e9501402ba7f5adfa0003d250fac434e9c67bed12a29b731896",
                "md5": "9bc506b408c940d3c0bcd21dc6584ba5",
                "sha256": "a0e5d194721779c4f34f9c32fbf5d03693b0ca0feacae8ee327160086b997dd3"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "9bc506b408c940d3c0bcd21dc6584ba5",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.10",
            "size": 4005424,
            "upload_time": "2024-05-11T07:38:17",
            "upload_time_iso_8601": "2024-05-11T07:38:17.803612Z",
            "url": "https://files.pythonhosted.org/packages/a9/2d/e4d3b7b92e9501402ba7f5adfa0003d250fac434e9c67bed12a29b731896/kgdata-7.0.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "26fac4c3a0adbbc3f967578d4aceffedde9509cb8806b655f4d6ea17eaafaf7c",
                "md5": "a53867ad53efee9f07f1305feb6370a0",
                "sha256": "341bcb519e71060ef694dfa65841f6821e9577ce0f50319c79e5cfa3ee2ca762"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp312-cp312-manylinux_2_35_x86_64.whl",
            "has_sig": false,
            "md5_digest": "a53867ad53efee9f07f1305feb6370a0",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.10",
            "size": 3277147,
            "upload_time": "2024-05-11T07:38:19",
            "upload_time_iso_8601": "2024-05-11T07:38:19.973034Z",
            "url": "https://files.pythonhosted.org/packages/26/fa/c4c3a0adbbc3f967578d4aceffedde9509cb8806b655f4d6ea17eaafaf7c/kgdata-7.0.4-cp312-cp312-manylinux_2_35_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2e9306c174b74a827e6e2e6f05b031219557b128ed0c07a91db1f9ff22d435cb",
                "md5": "55e723cb8711c6331bee0c62fc1d1a0f",
                "sha256": "7a4641bcb36c0003456bf459b3ace8259b489fa3f63e083ae8dd0c2d762d3972"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp312-none-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "55e723cb8711c6331bee0c62fc1d1a0f",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": ">=3.10",
            "size": 2248918,
            "upload_time": "2024-05-11T07:38:22",
            "upload_time_iso_8601": "2024-05-11T07:38:22.245364Z",
            "url": "https://files.pythonhosted.org/packages/2e/93/06c174b74a827e6e2e6f05b031219557b128ed0c07a91db1f9ff22d435cb/kgdata-7.0.4-cp312-none-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ae53dcfa1b0cb8c64d86432083215319392e470aa7316c473ff45ec2d659f912",
                "md5": "75d5f2bd5d7c0c27f313b86cc1cbc3ec",
                "sha256": "96d5929fb7669580efd490e79235297c72a764d8f2b4e407c6e860b84644ccad"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "75d5f2bd5d7c0c27f313b86cc1cbc3ec",
            "packagetype": "bdist_wheel",
            "python_version": "cp313",
            "requires_python": ">=3.10",
            "size": 4005422,
            "upload_time": "2024-05-11T07:38:24",
            "upload_time_iso_8601": "2024-05-11T07:38:24.424842Z",
            "url": "https://files.pythonhosted.org/packages/ae/53/dcfa1b0cb8c64d86432083215319392e470aa7316c473ff45ec2d659f912/kgdata-7.0.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fa27468d61eed6561e0cc942b28a55f258a576ab47ba50bdd2d69e61962afb48",
                "md5": "38dc2e13deaa8a13f115212e30b001d7",
                "sha256": "9ddc3f342f7da134a608b2ac49a642844f530acf1f2935331926e9f709821423"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "38dc2e13deaa8a13f115212e30b001d7",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.10",
            "size": 4011038,
            "upload_time": "2024-05-11T07:38:26",
            "upload_time_iso_8601": "2024-05-11T07:38:26.989804Z",
            "url": "https://files.pythonhosted.org/packages/fa/27/468d61eed6561e0cc942b28a55f258a576ab47ba50bdd2d69e61962afb48/kgdata-7.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f7b4841cc2edd6a1ec3226de74e6c0238603ee799652239c0fee658d20e1453",
                "md5": "2d6f69b32957ed5c55ad12d033f6b831",
                "sha256": "867abaa9f2f0db21bb40c7b98e7567de5bd855f9fbb55141ef884f769fa6de16"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2d6f69b32957ed5c55ad12d033f6b831",
            "packagetype": "bdist_wheel",
            "python_version": "pp310",
            "requires_python": ">=3.10",
            "size": 4010277,
            "upload_time": "2024-05-11T07:38:29",
            "upload_time_iso_8601": "2024-05-11T07:38:29.439606Z",
            "url": "https://files.pythonhosted.org/packages/2f/7b/4841cc2edd6a1ec3226de74e6c0238603ee799652239c0fee658d20e1453/kgdata-7.0.4-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "373b9fedb7962c550992b672ab230d320f45253c0ab048be8317fb388b662e04",
                "md5": "731e7bb18a91c69ed82b40e585ed7ee7",
                "sha256": "a982abc19eb9dc1aaf50c961173c18519d2af61b5d503805dd14473d7f8dd90f"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "731e7bb18a91c69ed82b40e585ed7ee7",
            "packagetype": "bdist_wheel",
            "python_version": "pp39",
            "requires_python": ">=3.10",
            "size": 4010579,
            "upload_time": "2024-05-11T07:38:31",
            "upload_time_iso_8601": "2024-05-11T07:38:31.814819Z",
            "url": "https://files.pythonhosted.org/packages/37/3b/9fedb7962c550992b672ab230d320f45253c0ab048be8317fb388b662e04/kgdata-7.0.4-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "80c8b64411a2bc1bd4b7cb6801badd40e0d1f2fdc461c786a3fb0480cb34ef2a",
                "md5": "646ed05148d6d0c8589deeed03400592",
                "sha256": "13eb9ec6b781c201dd6607d19940b37f739f568f65c4654aa373a383d7f45219"
            },
            "downloads": -1,
            "filename": "kgdata-7.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "646ed05148d6d0c8589deeed03400592",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 150324,
            "upload_time": "2024-05-11T07:38:33",
            "upload_time_iso_8601": "2024-05-11T07:38:33.523823Z",
            "url": "https://files.pythonhosted.org/packages/80/c8/b64411a2bc1bd4b7cb6801badd40e0d1f2fdc461c786a3fb0480cb34ef2a/kgdata-7.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-11 07:38:33",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "binh-vu",
    "github_project": "kgdata",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "kgdata"
}
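
Each entry in the `urls` array above describes one release file, including its `python_version` tag and its `digests`. As a rough sketch of how such a record might be consumed, the snippet below lists the files for one interpreter tag and checks downloaded bytes against a recorded SHA-256 digest. The helper names `files_for_tag` and `sha256_matches` are hypothetical, and `SAMPLE` is a trimmed copy of two entries from the record, not the full data.

```python
import hashlib
import json

# Trimmed sample with the same shape as the "urls" array in the record above.
SAMPLE = json.loads("""
{
  "urls": [
    {"filename": "kgdata-7.0.4-cp312-none-win_amd64.whl",
     "packagetype": "bdist_wheel",
     "python_version": "cp312",
     "digests": {"sha256": "7a4641bcb36c0003456bf459b3ace8259b489fa3f63e083ae8dd0c2d762d3972"}},
    {"filename": "kgdata-7.0.4.tar.gz",
     "packagetype": "sdist",
     "python_version": "source",
     "digests": {"sha256": "13eb9ec6b781c201dd6607d19940b37f739f568f65c4654aa373a383d7f45219"}}
  ]
}
""")


def files_for_tag(metadata: dict, python_tag: str) -> list:
    """List the filenames whose python_version field equals the given tag."""
    return [
        entry["filename"]
        for entry in metadata["urls"]
        if entry["python_version"] == python_tag
    ]


def sha256_matches(payload: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of the payload with a hex digest from the record."""
    return hashlib.sha256(payload).hexdigest() == expected_hex.lower()
```

In practice `payload` would be the bytes of a file fetched from the entry's `url`, and `expected_hex` would come from that same entry's `digests["sha256"]`.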
        