simple-wikiparser


Name: simple-wikiparser
Version: 0.0.0
home_page: https://github.com/biswajit2903/SimpleWikiParser
Summary: A simple Wikipedia parser
upload_time: 2024-04-22 12:28:06
maintainer: None
docs_url: None
author: Biswajit Satapathy
requires_python: >=3.8
license: Apache License 2.0
requirements: No requirements were recorded.
Travis-CI: No Travis.
coveralls test coverage: No coveralls.

# SimpleWikiParser
A Simplified Wiki Data Parser

## Installation
```bash
pip install git+https://github.com/Biswajit2902/SimpleWikiParser.git
```
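
After installing, a quick way to confirm that the package is importable is to look up its top-level module. The snippet below is a minimal sketch: `is_installed` is a hypothetical helper (not part of SimpleWikiParser), and the module name `wikiparser` is taken from the import in the usage example.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    # find_spec returns None when no module with this name is on the import path.
    return importlib.util.find_spec(module_name) is not None


# "wikiparser" is the import name used in the usage example.
print("wikiparser available:", is_installed("wikiparser"))
```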

## Usage
```python
from wikiparser.core import WikiMediaDumpParser

# Initialise the parser for a language (Hindi, for example)
wiki_dump_parser = WikiMediaDumpParser(language="Hindi")

# Parse the dump
wiki_dump_parser.parse()

# Export the parsed data as a Hugging Face dataset (output path, dataset name)
wiki_dump_parser.export_hf_dataset("/path/to/data.jsonl", "dataset_name")
```
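
To inspect the exported file, something like the following could be used. This is a sketch under an assumption the README does not confirm: that `export_hf_dataset` writes a JSON Lines file (one JSON object per line) at the given path, as the `.jsonl` extension suggests. `read_jsonl` is a hypothetical helper, not part of the package, and the record fields are unknown, so the code does not rely on any particular keys.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def read_jsonl(path: str) -> Iterator[Dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

For example, `next(read_jsonl("/path/to/data.jsonl"))` would return the first record as a dictionary, which is a quick way to see which fields the export actually contains.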

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/biswajit2903/SimpleWikiParser",
    "name": "simple-wikiparser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Biswajit Satapathy",
    "author_email": "biswajit2902@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/1e/33/78e7b5dbc0793899ab1903d7377c210690fe2cdf1c6fe98bbd3e91c2c142/simple-wikiparser-0.0.0.tar.gz",
    "platform": null,
    "description": "# SimpleWikiParser\nAn Simplified Wiki Data Parser\n\n## Installation\n```bash\npip install git+https://github.com/Biswajit2902/SimpleWikiParser.git\n```\n\n## Usage:\n```python\nfrom wikiparser.core import WikiMediaDumpParser\n\n# initialise Parser for a language (say Hindi)\nwiki_dump_parser = WikiMediaDumpParser(language=\"Hindi\")\n\n# parse\nwiki_dump_parser.parse()\n\n# export\nwiki_dump_parser.export_hf_dataset(\"/path/to/data.jsonl\", \"dataset_name\")\n```\n",
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "A simple Wikipedia parser",
    "version": "0.0.0",
    "project_urls": {
        "Homepage": "https://github.com/biswajit2903/SimpleWikiParser"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1e3378e7b5dbc0793899ab1903d7377c210690fe2cdf1c6fe98bbd3e91c2c142",
                "md5": "bc5276345d5826cc6aaa5fab5df5d023",
                "sha256": "df7183e510757cd1a8e1259b2522f01e6638c818274e3249879dade077993f9d"
            },
            "downloads": -1,
            "filename": "simple-wikiparser-0.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bc5276345d5826cc6aaa5fab5df5d023",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 7512,
            "upload_time": "2024-04-22T12:28:06",
            "upload_time_iso_8601": "2024-04-22T12:28:06.436048Z",
            "url": "https://files.pythonhosted.org/packages/1e/33/78e7b5dbc0793899ab1903d7377c210690fe2cdf1c6fe98bbd3e91c2c142/simple-wikiparser-0.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-22 12:28:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "biswajit2903",
    "github_project": "SimpleWikiParser",
    "github_not_found": true,
    "lcname": "simple-wikiparser"
}
        