wiki-fetch


- Name: wiki-fetch
- Version: 0.1.0
- Home page: https://github.com/d3z-the-dev/wiki-fetch
- Summary: Parser for Wikipedia.org
- Upload time: 2023-01-20 10:12:46
- Author: d3z
- Requires Python: >=3.10,<4.0
- License: Apache-2.0
- Keywords: parser, wiki, wikipedia, web scraping
- Requirements: No requirements were recorded.

# wiki-fetch

[![PyPI](https://img.shields.io/pypi/v/wiki-fetch)](https://github.com/d3z-the-dev/wiki-fetch/releases/)
[![Status](https://img.shields.io/pypi/status/wiki-fetch)](https://pypi.org/project/wiki-fetch/)
[![PyPI Downloads](https://img.shields.io/pypi/dm/wiki-fetch)](https://pypi.org/project/wiki-fetch/)
[![Python Version](https://img.shields.io/pypi/pyversions/wiki-fetch?color=%23244E71)](https://pypi.org/project/wiki-fetch/)
[![License](https://img.shields.io/pypi/l/wiki-fetch?color=272727)](https://en.wikipedia.org/wiki/Apache_License#Apache_License_2.0)
[![Issues](https://img.shields.io/github/issues/d3z-the-dev/wiki-fetch)](https://github.com/d3z-the-dev/wiki-fetch/issues)

## Installation

- PyPI

```bash
pip install wiki-fetch
```

- Source

```bash
git clone git@github.com:d3z-the-dev/wiki-fetch.git
cd wiki-fetch && poetry build
pip install ./dist/*.whl
```
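
The package metadata declares `requires_python` as `>=3.10,<4.0`, so installation fails on older interpreters. A minimal sketch of checking this up front; `python_is_supported` is a hypothetical helper and not part of wiki-fetch:

```python
import sys


def python_is_supported(version=None):
    """Check a (major, minor, ...) version against >=3.10,<4.0."""
    # Fall back to the running interpreter when no version is given.
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 10) <= major_minor < (4, 0)


if not python_is_supported():
    print("wiki-fetch 0.1.0 declares requires_python >=3.10,<4.0")
```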

## Usage

### CLI

<table>
<tr><th>Options for use in console</th></tr>
<tr><td>

| Option           | Flag | Long       | Default | Example                                   |
| ---------------- | ---- | ---------- | ------- | ----------------------------------------- |
| Wiki page link   | `-u` | `--url`    | None    | <https://en.wikipedia.org/wiki/The_Doors> |
| Search query     | `-q` | `--query`  | None    | The Doors (band)                          |
| Page language    | `-l` | `--lang`   | English | English                                   |
| Part of the page | `-p` | `--part`   | all     | infobox                                   |
| Which occurrence | `-i` | `--item`   | all     | first                                     |
| Output format    | `-o` | `--output` | text    | text                                      |
    
</td></tr>
</table>

```bash
wiki-fetch -q 'The Doors (band)' -p infobox -i first
```

<details>
<summary>output</summary>

```yaml
Infobox: 
    The Doors: 
        The Doors: 
            Image: https://upload.wikimedia.org/wikipedia/commons/thumb/6/69/The_Doors_1968.JPG/250px-The_Doors_1968.JPG
            Caption: The Doors in 1966: Morrison (left), Densmore (centre), Krieger (right) and Manzarek (seated)
        Background information: 
            Origin: Los Angeles, California, U.S.
            Genres: 
                Psychedelic Rock
                Blues Rock
                Acid Rock
            Years active: 
                1965-1973
                1978
            Labels: 
                Elektra
                Rhino
            Spinoffs: 
                The Psychedelic Rangers
                Butts Band
                Nite City
                Manzarek-Krieger
            Spinoff of: Rick & the Ravens
            Past members: 
                Jim Morrison
                Ray Manzarek
                Robby Krieger
                John Densmore
            Website: thedoors.com
URL: https://en.wikipedia.org/?search=The Doors (Band)
```
</details>
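
The flags from the options table can also be assembled programmatically, for example when calling the CLI from a script. The sketch below only builds the argument list and does not run anything; `build_command` is a hypothetical helper, not part of wiki-fetch, and it assumes the `wiki-fetch` executable is on `PATH` when the command is eventually run.

```python
import shlex

# Long flags taken from the options table above.
FLAGS = {
    "url": "--url",
    "query": "--query",
    "lang": "--lang",
    "part": "--part",
    "item": "--item",
    "output": "--output",
}


def build_command(**options):
    """Return an argv list for the wiki-fetch CLI (hypothetical helper)."""
    command = ["wiki-fetch"]
    for name, value in options.items():
        if name not in FLAGS:
            raise ValueError(f"unknown option: {name}")
        # Options left as None are skipped so the CLI default applies.
        if value is not None:
            command += [FLAGS[name], str(value)]
    return command


command = build_command(query="The Doors (band)", part="infobox", item="first")
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)` without going through a shell, which avoids quoting problems with queries that contain spaces or parentheses.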

### Python

<table>
<tr><th>Arguments of function and class</th></tr>
<tr><td>
    
| Argument | Values                                                         | Description                     |
| -------- | -------------------------------------------------------------- | ------------------------------- |
| url      | `str`                                                          | URL of any Wikipedia page       |
| query    | `str`                                                          | Any query string                |
| lang     | `str`                                                          | Any of the available languages  |
| part     | `infobox`, `paragraph`, `table`, `list`, `thumb`, `toc`, `all` | Page part to extract            |
| item     | `first`, `last`, `all`                                         | Which occurrence of the part    |

</td></tr>
</table>

```python
from wiki_fetch.driver import Wiki

output = Wiki(lang='English').search(query='The Doors (band)', part='infobox', item='first')
print(output.json)
```

<details>
<summary>output</summary>

```json
{
    "Infobox": [
        {
            "The Doors": {
                "The Doors": {
                    "Image": "https://upload.wikimedia.org/wikipedia/commons/thumb/6/69/The_Doors_1968.JPG/250px-The_Doors_1968.JPG",
                    "Caption": "The Doors in 1966: Morrison (left), Densmore (centre), Krieger (right) and Manzarek (seated)"
                },
                "Background information": {
                    "Origin": "Los Angeles, California, U.S.",
                    "Genres": [
                        "Psychedelic Rock",
                        "Blues Rock",
                        "Acid Rock"
                    ],
                    "Years active": [
                        "1965-1973",
                        "1978"
                    ],
                    "Labels": [
                        "Elektra",
                        "Rhino"
                    ],
                    "Spinoffs": [
                        "The Psychedelic Rangers",
                        "Butts Band",
                        "Nite City",
                        "Manzarek-Krieger"
                    ],
                    "Spinoff of": "Rick & the Ravens",
                    "Past members": [
                        "Jim Morrison",
                        "Ray Manzarek",
                        "Robby Krieger",
                        "John Densmore"
                    ],
                    "Website": "thedoors.com"
                }
            }
        }
    ],
    "URL": "https://en.wikipedia.org/?search=The Doors (Band)"
}
```
</details>
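
Because the `json` output is plain JSON, it can be processed with the standard library. The sketch below works on a shortened copy of the output shown above rather than a live request, and it assumes that `output.json` yields a JSON string of this shape; `infobox_field` is a hypothetical helper, not part of wiki-fetch.

```python
import json

# Shortened copy of the sample output above
# (assumed to match what output.json returns).
sample = """
{
    "Infobox": [
        {
            "The Doors": {
                "Background information": {
                    "Origin": "Los Angeles, California, U.S.",
                    "Labels": ["Elektra", "Rhino"]
                }
            }
        }
    ],
    "URL": "https://en.wikipedia.org/?search=The Doors (Band)"
}
"""


def infobox_field(document, section, field):
    """Look up one field in the first infobox (hypothetical helper)."""
    data = json.loads(document) if isinstance(document, str) else document
    first_infobox = data["Infobox"][0]
    # The infobox is keyed by its title, so unpack its single value.
    (sections,) = first_infobox.values()
    return sections[section][field]


print(infobox_field(sample, "Background information", "Labels"))
```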

## Specification
    
<table>
<tr><th>Available options</th><th> FAQ ? </th></tr>
<tr><td>

| Parts of page | Output formats | Language       |
| ------------- | -------------- | -------------- |
| `infobox`     | `text`         | `English`      |
| `paragraph`   | `json`         | `Ukrainian`    |
| `table`       | `dict`         | `Russian`      |
| `list`        |                | `Polish`       |
| `thumb`       |                | `German`       |
| `toc`         |                | `Nederlands`   |
|               |                | `Swedish`      |
|               |                | `Spanish`      |
|               |                | `French`       |
|               |                | `Italian`      |
|               |                | `Japanese`     |
|               |                | `Chinese`      |
|               |                | `Cebuano`      |

</td><td>

- If you find a bug or missing functionality, create an issue with examples.
- To add a missing language, create an issue, or fork the repository and add the language to the `languages` variable in `stuff.py`. The language must be supported by Wikipedia.org.
- To add an output format, create an issue describing the implementation. The implementation should use only the Python standard library.
- If you see a need for particular tests, create an issue with descriptive examples.
- Suggestions about the development of the project are welcome: create an issue with your proposal.
- If you don't like the style of naming variables - go ~~fuck~~ yourself.

</td></tr>
</table>
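
The option values listed in this README can be checked before a request is made. This is a minimal sketch, not the library's own validation: `check_options` is a hypothetical helper, the allowed values are copied from the tables in this README, and the defaults are the ones listed for the CLI.

```python
# Values copied from the tables in this README.
PARTS = {"infobox", "paragraph", "table", "list", "thumb", "toc", "all"}
ITEMS = {"first", "last", "all"}
OUTPUTS = {"text", "json", "dict"}


def check_options(part="all", item="all", output="text"):
    """Return a list of problems; an empty list means all values are known."""
    errors = []
    checks = (
        ("part", part, PARTS),
        ("item", item, ITEMS),
        ("output", output, OUTPUTS),
    )
    for name, value, allowed in checks:
        if value not in allowed:
            errors.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return errors


for problem in check_options(part="sidebar", item="first", output="json"):
    print(problem)
```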

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/d3z-the-dev/wiki-fetch",
    "name": "wiki-fetch",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10,<4.0",
    "maintainer_email": "",
    "keywords": "parser,wiki,wikipedia,web scraping",
    "author": "d3z",
    "author_email": "d3z.the.dev@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/16/a5/79b9064b6b7771d5e4a550f0545e5851eedb4aa099df14bb5cb186e9a568/wiki_fetch-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Parser for Wikipedia.org",
    "version": "0.1.0",
    "split_keywords": [
        "parser",
        "wiki",
        "wikipedia",
        "web scraping"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "09dcbf3687deb6863d34ca674c1fd710bf4e64789ae44d84330919cdaef2bd92",
                "md5": "f2bd729881aed9e9bd324b95699086fa",
                "sha256": "8d892742cd6e139f32b3e1e9165d94a660af3477c5d8c3ce6378e456e8881842"
            },
            "downloads": -1,
            "filename": "wiki_fetch-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f2bd729881aed9e9bd324b95699086fa",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10,<4.0",
            "size": 19658,
            "upload_time": "2023-01-20T10:12:45",
            "upload_time_iso_8601": "2023-01-20T10:12:45.436034Z",
            "url": "https://files.pythonhosted.org/packages/09/dc/bf3687deb6863d34ca674c1fd710bf4e64789ae44d84330919cdaef2bd92/wiki_fetch-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "16a579b9064b6b7771d5e4a550f0545e5851eedb4aa099df14bb5cb186e9a568",
                "md5": "6ed28e8965170659368f23b76f3b6632",
                "sha256": "4090a24c477d8afe45e42ce0828814d3f37d8987a17b1856fdf33fe896f0be08"
            },
            "downloads": -1,
            "filename": "wiki_fetch-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "6ed28e8965170659368f23b76f3b6632",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10,<4.0",
            "size": 16555,
            "upload_time": "2023-01-20T10:12:46",
            "upload_time_iso_8601": "2023-01-20T10:12:46.662127Z",
            "url": "https://files.pythonhosted.org/packages/16/a5/79b9064b6b7771d5e4a550f0545e5851eedb4aa099df14bb5cb186e9a568/wiki_fetch-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-20 10:12:46",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "d3z-the-dev",
    "github_project": "wiki-fetch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "wiki-fetch"
}
        