articlequality

* **Name:** articlequality
* **Version:** 0.4.4
* **Home page:** https://github.com/wikimedia/articlequality
* **Summary:** A library for performing automatic detection of assessment classes of Wikipedia articles.
* **Upload time:** 2022-12-21 15:00:37
* **Author:** Aaron Halfaker / Morten Warncke-Wang
* **License:** MIT
* **Requirements:** mwapi, mwbase, mwreverts, mwtypes, mwxml, revscoring

# Wikipedia article quality classification

This library provides a set of utilities for performing automatic detection of
assessment classes of Wikipedia articles.  For more information, see the full
documentation at https://articlequality.readthedocs.io .

**Compatible with Python 3.x only.**  Sorry.

* **Install:** ``pip install articlequality``
* **Models:** https://github.com/wikimedia/articlequality/tree/master/models
* **Documentation:** https://articlequality.readthedocs.io

## Basic usage

    >>> import articlequality
    >>> from revscoring import Model
    >>>
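    >>> # Open the model file in a context manager so the handle is closed after loading.
    >>> with open("models/enwiki.nettrom_wp10.gradient_boosting.model", "rb") as f:
    ...     scorer_model = Model.load(f)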
    >>>
    >>> text = "I am the text of a page.  I have a <ref>word</ref>"
    >>> articlequality.score(scorer_model, text)
    {'prediction': 'stub',
     'probability': {'stub': 0.27156163795807853,
                     'b': 0.14707452309674252,
                     'fa': 0.16844898943510833,
                     'c': 0.057668704007171959,
                     'ga': 0.21617801281707663,
                     'start': 0.13906813268582238}}
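
The returned dictionary carries one probability per assessment class, so it can be post-processed with plain Python. The sketch below is illustrative only: `rank_classes` is a hypothetical helper, not part of `articlequality`, and the score dictionary is copied from the example output above rather than produced by a live model.

```python
def rank_classes(score):
    """Return (class, probability) pairs sorted from most to least likely."""
    return sorted(score["probability"].items(),
                  key=lambda pair: pair[1], reverse=True)


# Score dictionary copied from the example output above (not a live prediction).
score = {
    "prediction": "stub",
    "probability": {
        "stub": 0.27156163795807853,
        "b": 0.14707452309674252,
        "fa": 0.16844898943510833,
        "c": 0.057668704007171959,
        "ga": 0.21617801281707663,
        "start": 0.13906813268582238,
    },
}

ranked = rank_classes(score)
top_class, top_prob = ranked[0]
runner_up, runner_up_prob = ranked[1]

# A small margin between the two best classes signals an uncertain prediction.
margin = top_prob - runner_up_prob
print("prediction:", top_class, "runner-up:", runner_up,
      "margin:", round(margin, 3))
```

In this example the top class matches the `prediction` field, and the narrow gap to the second-ranked class shows that the model is not very confident about such a short text.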

## Install

### Requirements

* Python 3.5, 3.6 or 3.7
* All the system requirements of [revscoring](https://github.com/wikimedia/revscoring)

### Installation steps

1. Clone this repository.
2. Install the package and its dependencies: `python setup.py install`
3. Verify that the installation worked by running `make enwiki_models` to build the English Wikipedia article quality model, or `make wikidatawiki_models` to build the item quality model for Wikidata.
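
Before running the longer `make` targets, a quick import check can catch a broken install early. This is a minimal sketch, not part of the project: `find_missing` is a hypothetical helper, and the dependency names are taken from the package metadata for version 0.4.4, so treat the list as an assumption for other releases.

```python
import importlib.util

# Dependency names from the 0.4.4 package metadata (an assumption for
# any other release).
REQUIRED = ["mwapi", "mwbase", "mwreverts", "mwtypes", "mwxml", "revscoring"]


def find_missing(module_names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED + ["articlequality"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only confirms that each package can be found; it does not verify versions or that a model can actually be built.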

### Retraining the models

To retrain a model, run `make -B MODEL`, e.g. `make -B wikidatawiki_models`. This re-downloads the labels, re-extracts the features from the revisions, and then retrains and rescores the model.

To skip re-downloading the training labels and re-extracting the features, it is enough to `touch` the files in the `datasets/` directory and run the `make` command without the `-B` flag.
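
On systems without a `touch` command, the same effect can be had from Python. The sketch below is an assumption-laden equivalent, not project code: `touch_files` is a hypothetical helper that updates the modification time of every regular file directly inside a directory and skips subdirectories.

```python
from pathlib import Path


def touch_files(directory):
    """Update the modification time of each regular file in `directory`.

    Returns the names of the touched files in sorted order.
    """
    touched = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            path.touch()  # same effect as the shell `touch` command
            touched.append(path.name)
    return touched
```

Calling `touch_files("datasets")` from the repository root and then running `make` without `-B` should make the existing dataset files look up to date to `make`, so only the later training steps are redone.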

### Running tests

Example:

```
pytest -vv tests/feature_lists/test_wikidatawiki.py
```

## Authors
* Aaron Halfaker -- https://github.com/halfak
* Morten Warncke-Wang -- https://github.com/nettrom

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/wikimedia/articlequality",
    "name": "articlequality",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Aaron Halfaker / Morten Warncke-Wang",
    "author_email": "ahalfaker@wikimedia.org",
    "download_url": "https://files.pythonhosted.org/packages/bb/34/f0817607bff0e4b1f6da7c328cd06db69f6adbc5aec06be0b53ed06f0ca3/articlequality-0.4.4.tar.gz",
    "platform": null,
    "description": "# Wikipedia article quality classification\n\nThis library provides a set of utilities for performing automatic detection of\nassessment classes of Wikipedia articles.  For more information, see the full\ndocumentation at https://articlequality.readthedocs.io .\n\n**Compatible with Python 3.x only.**  Sorry.\n\n* **Install:** ``pip install articlequality``\n* **Models:** https://github.com/wikimedia/articlequality/tree/master/models\n* **Documentation:** https://articlequality.readthedocs.io\n\n## Basic usage\n\n    >>> import articlequality\n    >>> from revscoring import Model\n    >>>\n    >>> scorer_model = Model.load(open(\"models/enwiki.nettrom_wp10.gradient_boosting.model\", \"rb\"))\n    >>>\n    >>> text = \"I am the text of a page.  I have a <ref>word</ref>\"\n    >>> articlequality.score(scorer_model, text)\n    {'prediction': 'stub',\n     'probability': {'stub': 0.27156163795807853,\n                     'b': 0.14707452309674252,\n                     'fa': 0.16844898943510833,\n                     'c': 0.057668704007171959,\n                     'ga': 0.21617801281707663,\n                     'start': 0.13906813268582238}}\n\n## Install\n\n### Requirements\n\n* Python 3.5, 3.6 or 3.7\n* All the system requirements of [revscoring](https://github.com/wikimedia/revscoring)\n\n### Installation steps\n\n1. clone this repository\n2. install the package itself and its dependencies `python setup.py install`\n3. You can verify that your installation worked by running `make enwiki_models` to build the English Wikipedia article quality model or `make wikidatawiki_models` to build the item quality model for Wikidata\n\n### Retraining the models\n\nTo retrain a model, run `make -B MODEL` e.g. `make -B wikidatawiki_models`. 
This will redownload the labels, re-extract the features from the revisions, and then retrain and rescore the model.\n\nTo skip re-downloading the training labels and re-extracting the features, it is enough `touch` the files in the `datasets/` directory and run the `make` command without the `-B` flag.\n\n### Running tests\n\nExample:\n\n```\npytest -vv tests/feature_lists/test_wikidatawiki.py\n```\n\n## Authors\n* Aaron Halfaker -- https://github.com/halfak\n* Morten Warncke-Wang -- https://github.com/nettrom\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A library for performing automatic detection of assessment classes of Wikipedia articles.",
    "version": "0.4.4",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "1a6328818e9f111602d8c26f176f7d40",
                "sha256": "eae688b3bf7d1c0b2a7b72e7c1bb92f18e40b604efb3ff138c091f68ed4e3b2d"
            },
            "downloads": -1,
            "filename": "articlequality-0.4.4-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1a6328818e9f111602d8c26f176f7d40",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 56132,
            "upload_time": "2022-12-21T15:00:35",
            "upload_time_iso_8601": "2022-12-21T15:00:35.829010Z",
            "url": "https://files.pythonhosted.org/packages/c4/71/a732ea3f6296f8906956eaed94aeff6485890a49070528cc2f3088860946/articlequality-0.4.4-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "e2d569caca034ea693310672b4f40ee4",
                "sha256": "c2a5b504890e5e41db17e44cdc5b473da73dbaa094b004013af9b4d771717262"
            },
            "downloads": -1,
            "filename": "articlequality-0.4.4.tar.gz",
            "has_sig": false,
            "md5_digest": "e2d569caca034ea693310672b4f40ee4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 36976,
            "upload_time": "2022-12-21T15:00:37",
            "upload_time_iso_8601": "2022-12-21T15:00:37.740449Z",
            "url": "https://files.pythonhosted.org/packages/bb/34/f0817607bff0e4b1f6da7c328cd06db69f6adbc5aec06be0b53ed06f0ca3/articlequality-0.4.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-21 15:00:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "wikimedia",
    "github_project": "articlequality",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "mwapi",
            "specs": []
        },
        {
            "name": "mwbase",
            "specs": [
                [
                    "<",
                    "0.1.999"
                ],
                [
                    ">=",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "mwreverts",
            "specs": []
        },
        {
            "name": "mwtypes",
            "specs": []
        },
        {
            "name": "mwxml",
            "specs": [
                [
                    ">=",
                    "0.3.3"
                ]
            ]
        },
        {
            "name": "revscoring",
            "specs": [
                [
                    "<",
                    "2.11.999"
                ],
                [
                    ">=",
                    "2.11.0"
                ]
            ]
        }
    ],
    "tox": true,
    "lcname": "articlequality"
}
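
The `requirements` entries above store each version constraint as an operator and version pair. As a rough illustration of how to read them, the sketch below renders a few of those entries as pip-style requirement strings; `to_requirement` is a hypothetical helper written for this example, and the entries are copied from the JSON above.

```python
def to_requirement(entry):
    """Render {"name": ..., "specs": [[op, version], ...]} as a requirement string."""
    constraints = ",".join(op + version for op, version in entry["specs"])
    return entry["name"] + constraints


# Entries copied from the "requirements" list in the raw data above.
requirements = [
    {"name": "mwapi", "specs": []},
    {"name": "mwbase", "specs": [["<", "0.1.999"], [">=", "0.1.0"]]},
    {"name": "mwxml", "specs": [[">=", "0.3.3"]]},
    {"name": "revscoring", "specs": [["<", "2.11.999"], [">=", "2.11.0"]]},
]

for entry in requirements:
    print(to_requirement(entry))
```

An entry with an empty `specs` list, such as `mwapi`, carries no version constraint at all.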
        