# Wikibase Integrator #

* Name: WikibaseIntegrator
* Version: 0.12.5
* Home page: https://github.com/LeMyst/WikibaseIntegrator
* Summary: Python package for reading from and writing to a Wikibase instance
* Upload time: 2024-01-07 15:38:53
* Author: Myst
* Requires Python: >=3.8,<4.0
* License: MIT
* Keywords: wikibase, wikidata, mediawiki, sparql

[![PyPi](https://img.shields.io/pypi/v/wikibaseintegrator.svg)](https://pypi.python.org/pypi/wikibaseintegrator)
[![Python pytest](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/python-pytest.yaml/badge.svg)](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/python-pytest.yaml)
[![Python Code Quality and Lint](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/python-lint.yaml/badge.svg)](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/python-lint.yaml)
[![CodeQL](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/codeql-analysis.yaml/badge.svg)](https://github.com/LeMyst/WikibaseIntegrator/actions/workflows/codeql-analysis.yaml)
[![Pyversions](https://img.shields.io/pypi/implementation/wikibaseintegrator.svg)](https://pypi.python.org/pypi/wikibaseintegrator)
[![Read the Docs](https://readthedocs.org/projects/pip/badge/?version=latest&style=flat)](https://wikibaseintegrator.readthedocs.io)

Wikibase Integrator is a Python package for manipulating data on a Wikibase instance (like
Wikidata).

# Breaking changes in v0.12 #

The WikibaseIntegrator core was completely rewritten in v0.12, which led to some important changes.

It offers a new object-oriented approach, better code readability and support for Property, Lexeme and MediaInfo
entities (in addition to Item).

If you want to stay on v0.11.x, you can put this line in your requirements.txt:

```
wikibaseintegrator~=0.11.3
```

---

<!-- ToC generator: https://luciopaiva.com/markdown-toc/ -->

- [WikibaseIntegrator / WikidataIntegrator](#wikibaseintegrator--wikidataintegrator)
- [Documentation](#documentation)
    - [Jupyter notebooks](#jupyter-notebooks)
        - [Common use cases](#common-use-cases)
            - [Read an existing entity](#read-an-existing-entity)
            - [Start a new entity](#start-a-new-entity)
            - [Write an entity to instance](#write-an-entity-to-instance)
            - [Add labels](#add-labels)
            - [Get label value](#get-label-value)
            - [Add aliases](#add-aliases)
            - [Add descriptions](#add-descriptions)
            - [Add a simple claim](#add-a-simple-claim)
            - [Get claim value](#get-claim-value)
            - [Manipulate claim, add a qualifier](#manipulate-claim-add-a-qualifier)
            - [Manipulate claim, add references](#manipulate-claim-add-references)
            - [Get lemma on lexeme](#get-lemma-on-lexeme)
            - [Set lemma on lexeme](#set-lemma-on-lexeme)
            - [Add gloss to a sense on lexeme](#add-gloss-to-a-sense-on-lexeme)
            - [Add form to a lexeme](#add-form-to-a-lexeme)
    - [Other projects](#other-projects)
- [Installation](#installation)
- [Using a Wikibase instance](#using-a-wikibase-instance)
    - [Wikimedia Foundation User-Agent policy](#wikimedia-foundation-user-agent-policy)
- [The Core Parts](#the-core-parts)
    - [Entity manipulation](#entity-manipulation)
    - [wbi_login](#wbi_login)
        - [Login using OAuth1 or OAuth2](#login-using-oauth1-or-oauth2)
            - [As a bot](#as-a-bot)
            - [To impersonate a user (OAuth 1.0a)](#to-impersonate-a-user-oauth-10a)
        - [Login with a bot password](#login-with-a-bot-password)
        - [Login with a username and a password](#login-with-a-username-and-a-password)
    - [Wikibase Data Types](#wikibase-data-types)
    - [Structured Data on Commons](#structured-data-on-commons)
        - [Retrieve data](#retrieve-data)
        - [Write data](#write-data)
- [More than Wikibase](#more-than-wikibase)
- [Helper Methods](#helper-methods)
    - [Use MediaWiki API](#use-mediawiki-api)
    - [Execute SPARQL queries](#execute-sparql-queries)
    - [Wikibase search entities](#wikibase-search-entities)
    - [Merge Wikibase items](#merge-wikibase-items)
- [Examples (in "normal" mode)](#examples-in-normal-mode)
    - [Create a new Item](#create-a-new-item)
    - [Modify an existing item](#modify-an-existing-item)
    - [A bot for Mass Import](#a-bot-for-mass-import)
- [Examples (in "fast run" mode)](#examples-in-fast-run-mode)
- [Debugging](#debugging)

# WikibaseIntegrator / WikidataIntegrator #

WikibaseIntegrator (wbi) is a fork of [WikidataIntegrator](https://github.com/SuLab/WikidataIntegrator) (wdi) that
focuses on improved compatibility with Wikibase and on adding missing functionality.
The main differences between these two libraries are:

* A complete rewrite of the library with a more object-oriented architecture allowing for easy interaction, data
  validation and extended functionality
* Add support for reading and writing Lexeme, MediaInfo and Property entities
* Python 3.8 to 3.12 support, validated with unit tests
* Type hints for arguments and return values, checked with the mypy static type checker
* Add OAuth 2.0 login method
* Add logging module support

However, WikibaseIntegrator lacks the "fastrun" functionality implemented in WikidataIntegrator.

# Documentation #

Basic documentation generated from the Python source code is available on
the [Read the Docs website](https://wikibaseintegrator.readthedocs.io/).

## Jupyter notebooks ##

You can find some sample code (adding an entity, a lexeme, etc.) in
the [Jupyter notebook directory](https://github.com/LeMyst/WikibaseIntegrator/tree/master/notebooks) of the repository.

### Common use cases

#### Read an existing entity

From [import_entity.ipynb](notebooks/import_entity.ipynb)

```python
entity = wbi.item.get('Q582')
```

#### Start a new entity

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
entity = wbi.item.new()
```

#### Write an entity to instance

From [import_entity.ipynb](notebooks/import_entity.ipynb)

```python
entity.write()
```

#### Add labels

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
entity.labels.set('en', 'New item')
entity.labels.set('fr', 'Nouvel élément')
```

#### Get label value

From [item_get.ipynb](notebooks/item_get.ipynb)

```python
entity.labels.get('en').value
```

#### Add aliases

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
entity.aliases.set('en', 'Item')
entity.aliases.set('fr', 'Élément')
```

#### Add descriptions

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
entity.descriptions.set('en', 'A freshly created element')
entity.descriptions.set('fr', 'Un élément fraichement créé')
```

#### Add a simple claim

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
claim_time = datatypes.Time(prop_nr='P74', time='now')

entity.claims.add(claim_time)
```

#### Get claim value

From [item_get.ipynb](notebooks/item_get.ipynb)

```python
entity.claims.get('P2048')[0].mainsnak.datavalue['value']['amount']
```
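
The lookup above walks the JSON shape that Wikibase uses for a statement. As a rough illustration, the dictionary below mimics that shape for a quantity value. The name `statement` and the sample values are invented for this sketch; only the key layout follows the Wikibase JSON format.

```python
# Illustrative statement in the Wikibase JSON layout (sample values are invented).
statement = {
    'mainsnak': {
        'snaktype': 'value',
        'property': 'P2048',
        'datavalue': {
            'type': 'quantity',
            'value': {
                'amount': '+324',
                'unit': 'http://www.wikidata.org/entity/Q11573',
            },
        },
    },
    'type': 'statement',
    'rank': 'normal',
}

# Same key path as in the snippet above, applied to the plain dictionary.
amount = statement['mainsnak']['datavalue']['value']['amount']

# Quantities are serialised as signed decimal strings, so convert before doing arithmetic.
height = float(amount)
print(height)
```

Because the amount is a string such as `'+324'`, comparing or summing values without converting them first would compare text rather than numbers.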

#### Manipulate claim, add a qualifier

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
qualifiers = Qualifiers()
qualifiers.add(datatypes.String(prop_nr='P828', value='Item qualifier'))

claim_string = datatypes.String(prop_nr='P31533', value='A String property', qualifiers=qualifiers)
entity.claims.add(claim_string)
```

#### Manipulate claim, add references

From [item_create_new.ipynb](notebooks/item_create_new.ipynb)

```python
references = References()
reference1 = Reference()
reference1.add(datatypes.String(prop_nr='P828', value='Item string reference'))

reference2 = Reference()
reference2.add(datatypes.String(prop_nr='P828', value='Another item string reference'))

references.add(reference1)
references.add(reference2)

# Add the claim that carries the references (not the earlier claim_string)
new_claim_string = datatypes.String(prop_nr='P31533', value='A String property', references=references)
entity.claims.add(new_claim_string)
```

#### Get lemma on lexeme

```python
lexeme.lemmas.get(language='fr')
```

#### Set lemma on lexeme

From [lexeme_update.ipynb](notebooks/lexeme_update.ipynb)

```python
lexeme.lemmas.set(language='fr', value='réponse')
```

#### Add gloss to a sense on lexeme

From [lexeme_write.ipynb](notebooks/lexeme_write.ipynb)

```python
sense = Sense()
sense.glosses.set(language='en', value='English gloss')
sense.glosses.set(language='fr', value='French gloss')
claim = datatypes.String(prop_nr='P828', value="Create a string claim for sense")
sense.claims.add(claim)
lexeme.senses.add(sense)
```

#### Add form to a lexeme

From [lexeme_write.ipynb](notebooks/lexeme_write.ipynb)

```python
form = Form()
form.representations.set(language='en', value='English form representation')
form.representations.set(language='fr', value='French form representation')
claim = datatypes.String(prop_nr='P828', value="Create a string claim for form")
form.claims.add(claim)
lexeme.forms.add(form)
```

## Other projects ##

Here is a list of different projects that use the library:

* https://github.com/ACMILabs/acmi-wikidata-bot - A synchronisation robot to push ACMI API Wikidata links to Wikidata.
* https://github.com/LeMyst/wd-population - Update French population on Wikidata
* https://github.com/SisonkeBiotik-Africa/AfriBioML - Resources for developing a bibliometric study on machine learning for healthcare in Africa
* https://github.com/SisonkeBiotik-Africa/Relational-NER - A Python code for enhancing the output of multilingual named entity recognition based on Wikidata relations
* https://github.com/SoftwareUnderstanding/SALTbot - Software and Article Linker Toolbot
* https://github.com/dpriskorn/ItemSubjector - CLI-tool to easily add "main subject" aka topics in bulk to groups of items on Wikidata
* https://github.com/dpriskorn/hiking_trail_matcher - Script that helps link together hiking trails in Wikidata and OpenStreetMap
* https://github.com/eoan-ermine/wikidata_statistik_population - Update German population on Wikidata
* https://github.com/internetarchive/cgraphbot - Wikibase bot for updating identifiers and citation relationships
* https://github.com/lubianat/ibge_2021_to_wikidata - Update Population of Brazilian Cities
* https://github.com/lcnetdev/lccn-wikidata-bot - Adding LCCNs (Library of Congress Control Number) from NACO (Name Authority Cooperative Program) to Wikidata
* https://github.com/dpriskorn/WikidataEurLexScraper - Improve all Eur-Lex items in Wikidata based on information scraped from Eur-Lex
* https://github.com/dpriskorn/LexDanNet - Help link DanNet 2.2 word ID with Danish Wikidata lexemes
* https://github.com/lubianat/kudos_wikibase
* https://github.com/dlindem/wikibase

# Installation #

The easiest way to install WikibaseIntegrator is to use the `pip` package manager. WikibaseIntegrator supports Python
3.8 and above. If Python 2 is installed, `pip` will lead to an error indicating missing dependencies.

```bash
python -m pip install wikibaseintegrator
```

You can also clone the repo and install it, either with administrator rights or in a virtualenv.

```bash
git clone https://github.com/LeMyst/WikibaseIntegrator.git

cd WikibaseIntegrator

python -m pip install --upgrade pip setuptools

python -m pip install .
```

You can also use Poetry:

```bash
python -m pip install --upgrade poetry

python -m poetry install
```

To check that the installation is correct, launch a Python console and run the following code (which will retrieve the
Wikidata element for [Human](https://www.wikidata.org/entity/Q5)):

```python
from wikibaseintegrator import WikibaseIntegrator

wbi = WikibaseIntegrator()
my_first_wikidata_item = wbi.item.get(entity_id='Q5')

# to check successful installation and retrieval of the data, you can print the json representation of the item
print(my_first_wikidata_item.get_json())
```

# Using a Wikibase instance #

WikibaseIntegrator uses Wikidata as its default endpoint. To use another Wikibase instance instead, you can override the
settings in the wbi_config module.

For example, for a Wikibase instance installed
with [wikibase-docker](https://github.com/wmde/wikibase-release-pipeline/tree/main/example), add this to the top of your
script:

```python
from wikibaseintegrator.wbi_config import config as wbi_config

wbi_config['MEDIAWIKI_API_URL'] = 'http://localhost/api.php'
wbi_config['SPARQL_ENDPOINT_URL'] = 'http://localhost:8834/proxy/wdqs/bigdata/namespace/wdq/sparql'
wbi_config['WIKIBASE_URL'] = 'http://wikibase.svc'
```

You can find more default settings in the file wbi_config.py.

## Wikimedia Foundation User-Agent policy ##

If you interact with a Wikibase instance hosted by the Wikimedia Foundation (like Wikidata, Wikimedia Commons, etc.),
it's highly advised to follow the User-Agent policy that you can find on the
page [User-Agent policy](https://meta.wikimedia.org/wiki/User-Agent_policy)
of the Wikimedia Meta-Wiki.

You can set a complementary User-Agent by modifying the variable `wbi_config['USER_AGENT']` in wbi_config.

For example, with your library name and contact information:

```python
from wikibaseintegrator.wbi_config import config as wbi_config

wbi_config['USER_AGENT'] = 'MyWikibaseBot/1.0 (https://www.wikidata.org/wiki/User:MyUsername)'
```
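
If several scripts share the same contact details, the string can be assembled in one place. The following is only a sketch: `build_user_agent` is a hypothetical helper, not part of WikibaseIntegrator, and it simply reproduces the `Name/version (contact)` layout used in the example above.

```python
def build_user_agent(tool_name: str, tool_version: str, contact: str) -> str:
    """Return a User-Agent string of the form 'Tool/1.0 (contact)'."""
    if not contact.strip():
        # The policy asks for a way to reach the operator, so refuse to build without one.
        raise ValueError('Contact information is required')
    return f'{tool_name}/{tool_version} ({contact})'


user_agent = build_user_agent('MyWikibaseBot', '1.0', 'https://www.wikidata.org/wiki/User:MyUsername')
print(user_agent)
```

The resulting string can then be assigned to `wbi_config['USER_AGENT']` as shown above.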

# The Core Parts #

WikibaseIntegrator can be used in two modes: a normal mode, which updates one item at a time, and a fast
run mode, which preloads some data locally and then only updates items if the new data provided differs from Wikidata.
The latter mode allows for great speedups when tens of thousands of Wikidata elements need to be checked for updates
but only a small number will eventually be updated, a situation typically encountered when synchronising Wikidata with
an external resource.
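
The idea behind the fast run mode described above can be sketched in plain Python. This is a conceptual illustration only, not the library's implementation: the `Snapshot` type, the `needs_write` function and the sample data are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# Hypothetical local snapshot: entity ID -> set of (property, value) pairs already on the instance.
Snapshot = Dict[str, Set[Tuple[str, str]]]


def needs_write(snapshot: Snapshot, entity_id: str, new_pairs: Set[Tuple[str, str]]) -> bool:
    """Return True when at least one new pair is missing from the preloaded data."""
    existing = snapshot.get(entity_id, set())
    return not new_pairs.issubset(existing)


# Data preloaded once, instead of fetching every entity individually.
snapshot = {'Q1': {('P351', '1017'), ('P703', 'Q15978631')}}

unchanged = needs_write(snapshot, 'Q1', {('P351', '1017')})           # already present
changed = needs_write(snapshot, 'Q1', {('P704', 'ENST00000376197')})  # new statement
print(unchanged, changed)
```

Only the entities for which the check returns `True` would need an API write, which is where the speedup comes from when most of the data is already up to date.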

## Entity manipulation ##

WikibaseIntegrator supports the manipulation of Item, Property, Lexeme and MediaInfo entities through these classes:

* wikibaseintegrator.entities.item.Item
* wikibaseintegrator.entities.property.Property
* wikibaseintegrator.entities.lexeme.Lexeme
* wikibaseintegrator.entities.mediainfo.MediaInfo

Features:

* Loading a Wikibase entity based on its Wikibase entity ID.
* All Wikibase data types are implemented (and some data types implemented by extensions).
* Full access to the entire Wikibase entity in the form of a JSON dict representation.

## wbi_login ##

`wbi_login` provides the login functionality and also stores the required cookies and edit tokens (for security reasons,
every MediaWiki edit requires an edit token). There are multiple methods to log in:

* `wbi_login.OAuth2(consumer_token, consumer_secret)` (recommended)
* `wbi_login.OAuth1(consumer_token, consumer_secret, access_token, access_secret)`
* `wbi_login.Clientlogin(user, password)`
* `wbi_login.Login(user, password)`

More parameters are available. If you want to authenticate on an instance other than Wikidata, you can set
mediawiki_api_url, mediawiki_rest_url or mediawiki_index_url. Read the documentation for more information.

### Login using OAuth1 or OAuth2 ###

OAuth is the authentication method recommended by the MediaWiki developers. It can be used to authenticate a bot or to
use WBI as a backend for an application.

#### As a bot ####

If you want to use WBI with a bot account, you should use OAuth as
an [Owner-only consumer](https://www.mediawiki.org/wiki/OAuth/Owner-only_consumers). This lets you authenticate
without the "continue oauth" step.

The first step is to request a new OAuth consumer on your MediaWiki instance on the page
"Special:OAuthConsumerRegistration". The "Owner-only" (or "This consumer is for use only by ...") box has to be checked and
the correct version of the OAuth protocol must be set (OAuth 2.0). You will get a consumer token and consumer secret
(and an access token and access secret if you chose OAuth 1.0a). For a Wikimedia instance (like Wikidata), you need to
use the [Meta-Wiki website](https://meta.wikimedia.org/wiki/Special:OAuthConsumerRegistration).

Example if you use OAuth 2.0:

```python
from wikibaseintegrator import wbi_login

login_instance = wbi_login.OAuth2(consumer_token='<your_client_app_key>', consumer_secret='<your_client_app_secret>')
```

Example if you use OAuth 1.0a:

```python
from wikibaseintegrator import wbi_login

login_instance = wbi_login.OAuth1(consumer_token='<your_consumer_key>', consumer_secret='<your_consumer_secret>',
                                  access_token='<your_access_token>', access_secret='<your_access_secret>')
```

#### To impersonate a user (OAuth 1.0a) ####

If WBI is used as a backend for a web application, the script must use OAuth for authentication. WBI supports
this: you just need to specify the consumer key and consumer secret when instantiating `wbi_login.OAuth1`. Unlike login by
username and password, OAuth is a two-step process, because the user must manually confirm the OAuth login.
This means that the `wbi_login.OAuth1.continue_oauth()` method must be called after creating the `wbi_login.OAuth1`
instance.

Example:

```python
from wikibaseintegrator import wbi_login

login_instance = wbi_login.OAuth1(consumer_token='<your_consumer_key>', consumer_secret='<your_consumer_secret>')
login_instance.continue_oauth(oauth_callback_data='<the_callback_url_returned>')
```

The `wbi_login.OAuth1.continue_oauth()` method will either ask the user for a callback URL (normal bot execution) or
take a parameter. Thus, in the case where WBI is used as a backend for a web application for example, the callback will
provide the authentication information directly to the backend and thus no copy and paste of the callback URL is needed.

### Login with a bot password ###

It's good practice to use a [bot password](https://www.mediawiki.org/wiki/Manual:Bot_passwords) instead of a plain
username and password, because this allows limiting the permissions given to the bot.

```python
from wikibaseintegrator import wbi_login

login_instance = wbi_login.Login(user='<bot user name>', password='<bot password>')
```

### Login with a username and a password ###

If you want to log in with your user account, you can use the "clientlogin" authentication method. This method is not
recommended.

```python
from wikibaseintegrator import wbi_login

login_instance = wbi_login.Clientlogin(user='<user name>', password='<password>')
```

## Wikibase Data Types ##

Currently, Wikibase supports 17 different data types. Each data type is represented by its own class and has its own
peculiarities, which means that some of them require special parameters (e.g. Globe Coordinates). They are available
under the namespace `wikibaseintegrator.datatypes`.

The data types currently implemented:

* CommonsMedia
* ExternalID
* Form
* GeoShape
* GlobeCoordinate
* Item
* Lexeme
* Math
* MonolingualText
* MusicalNotation
* Property
* Quantity
* Sense
* String
* TabularData
* Time
* URL

Two additional data types are also implemented but require the installation of the MediaWiki extension to work properly:

* extra.EDTF ([Wikibase EDTF](https://www.mediawiki.org/wiki/Extension:Wikibase_EDTF))
* extra.LocalMedia ([Wikibase Local Media](https://www.mediawiki.org/wiki/Extension:Wikibase_Local_Media))

For details of how to create values (=instances) with these data types, please (for now) consult the docstrings in the
source code or the documentation website. Of note, these data type instances hold the values and, if specified, data
type instances for references and qualifiers.

## Structured Data on Commons ##

WikibaseIntegrator supports SDC (Structured Data on Commons) to update a media file hosted on Wikimedia Commons.

### Retrieve data ###

```python
from wikibaseintegrator import WikibaseIntegrator

wbi = WikibaseIntegrator()
media = wbi.mediainfo.get('M16431477')

# Retrieve the first "depicts" (P180) claim
print(media.claims.get('P180')[0].mainsnak.datavalue['value']['id'])
```

### Write data ###

```python
from wikibaseintegrator import WikibaseIntegrator
from wikibaseintegrator.datatypes import Item

wbi = WikibaseIntegrator()
media = wbi.mediainfo.get('M16431477')

# Add the "depicts" (P180) claim
media.claims.add(Item(prop_nr='P180', value='Q3146211'))

media.write()
```

# More than Wikibase #

WikibaseIntegrator natively supports some extensions:

* MediaInfo entity - [WikibaseMediaInfo](https://www.mediawiki.org/wiki/Extension:WikibaseMediaInfo)
* EDTF datatype - [Wikibase EDTF](https://www.mediawiki.org/wiki/Extension:Wikibase_EDTF)
* LocalMedia datatype - [Wikibase Local Media](https://www.mediawiki.org/wiki/Extension:Wikibase_Local_Media)
* Lexeme entity and datatype - [WikibaseLexeme](https://www.mediawiki.org/wiki/Extension:WikibaseLexeme)

# Helper Methods #

## Use MediaWiki API ##

The method `wbi_helpers.mediawiki_api_call_helper()` allows you to execute a MediaWiki API POST call. It takes a mandatory
data dictionary (data) and several optional parameters: a login object of type wbi_login.Login, a mediawiki_api_url
string if the MediaWiki instance is not Wikidata, a user_agent string to set a custom HTTP User-Agent header, and an
allow_anonymous boolean that controls whether the call may be made without authentication.

Example:

Retrieve last 10 revisions from Wikidata element Q2 (Earth):

```python
from wikibaseintegrator import wbi_helpers

data = {
    'action': 'query',
    'prop': 'revisions',
    'titles': 'Q2',
    'rvlimit': 10
}

print(wbi_helpers.mediawiki_api_call_helper(data=data, allow_anonymous=True))
```
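
The helper returns the decoded API response, so the revisions still have to be pulled out of it. The sketch below shows one way to do that. `extract_revisions` is a hypothetical helper, and `sample_response` is hand-written in the shape of a `prop=revisions` response (page ID, users and timestamps are invented), so that the example runs without network access.

```python
from typing import Any, Dict, List


def extract_revisions(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the revision dictionaries of every page in a 'query' response."""
    revisions: List[Dict[str, Any]] = []
    for page in response.get('query', {}).get('pages', {}).values():
        revisions.extend(page.get('revisions', []))
    return revisions


# Hand-written sample in the shape of a prop=revisions response (values are invented).
sample_response = {
    'query': {
        'pages': {
            '100': {
                'pageid': 100,
                'title': 'Q2',
                'revisions': [
                    {'revid': 3, 'parentid': 2, 'user': 'ExampleUser', 'timestamp': '2024-01-02T00:00:00Z'},
                    {'revid': 2, 'parentid': 1, 'user': 'OtherUser', 'timestamp': '2024-01-01T00:00:00Z'},
                ],
            }
        }
    }
}

revision_ids = [revision['revid'] for revision in extract_revisions(sample_response)]
print(revision_ids)
```

Using `.get()` with empty defaults keeps the helper from raising when the response contains no pages, for example after an API error.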

## Execute SPARQL queries ##

The method `wbi_helpers.execute_sparql_query()` allows you to execute SPARQL queries without hassle. It takes the
actual query string (query), optional prefixes (prefix) if you do not want to use the standard prefixes of Wikidata, the
actual endpoint URL (endpoint), and a user agent for the HTTP header sent to the SPARQL server
(user_agent). The latter is very useful to let the operators of the endpoint know who you are, especially if you execute
many queries on the endpoint. It allows the operators of the endpoint to contact you (e.g. specify an email address
or the URL to your bot code repository).
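
A minimal sketch of such a call and of reading its result follows. It assumes that the method returns the decoded SPARQL JSON results. `run_query` and `binding_values` are hypothetical helpers, and `sample_results` is hand-written in the standard SPARQL JSON results format so that the parsing part can be tried without querying the endpoint.

```python
from typing import Any, Dict, List

QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q5 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def run_query(query: str) -> Dict[str, Any]:
    """Send the query with a descriptive User-Agent (needs network access)."""
    # Imported here so the parsing part of the sketch runs without the package installed.
    from wikibaseintegrator import wbi_helpers

    return wbi_helpers.execute_sparql_query(
        query=query,
        user_agent='MyWikibaseBot/1.0 (https://www.wikidata.org/wiki/User:MyUsername)',
    )


def binding_values(results: Dict[str, Any], variable: str) -> List[str]:
    """Extract the values of one variable from SPARQL JSON results."""
    bindings = results.get('results', {}).get('bindings', [])
    return [row[variable]['value'] for row in bindings if variable in row]


# Hand-written results in the SPARQL JSON results format, used instead of a live call.
sample_results = {
    'head': {'vars': ['item', 'itemLabel']},
    'results': {'bindings': [
        {'item': {'type': 'uri', 'value': 'http://www.wikidata.org/entity/Q42'},
         'itemLabel': {'type': 'literal', 'value': 'Douglas Adams'}},
    ]},
}
print(binding_values(sample_results, 'itemLabel'))
```

In a real script, `binding_values(run_query(QUERY), 'itemLabel')` would replace the sample data.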

## Wikibase search entities ##

The method `wbi_helpers.search_entities()` allows for string search in a Wikibase instance. This means that labels,
descriptions and aliases can be searched for a string of interest. The method takes the following arguments: the actual
search string (search_string), an optional server (mediawiki_api_url, in case the Wikibase instance used is not Wikidata), an
optional user_agent, an optional max_results (default 500), an optional language (default 'en'), and an option
dict_id_label to return a dict of item ID and label as a result.

## Merge Wikibase items ##

Sometimes, Wikibase items need to be merged. An API call exists for that, and wbi_helpers implements a corresponding method.
`wbi_helpers.merge_items()` takes five arguments:

* the QID of the item which should be merged into another item (from_id)
* the QID of the item the first item should be merged into (to_id)
* a login object of type wbi_login.Login to provide the API call with the required authentication information
* a boolean if the changes need to be marked as made by a bot (is_bot)
* a flag for ignoring merge conflicts (ignore_conflicts), will do a partial merge for all statements which do not
  conflict. This should generally be avoided because it leaves a crippled item in Wikibase. Before a merge, any
  potential conflicts should be resolved first.
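
Since a merge is hard to undo, it can be worth validating the IDs before calling the API. The wrapper below is a sketch under assumptions: `is_qid` and `merge_checked` are hypothetical helpers, and the keyword name `login` for the login object is assumed rather than taken from the description above.

```python
import re

# Item IDs are "Q" followed by a positive integer without leading zeros.
QID_PATTERN = re.compile(r'^Q[1-9]\d*$')


def is_qid(value: str) -> bool:
    """Return True if the value looks like an item ID such as 'Q42'."""
    return bool(QID_PATTERN.match(value))


def merge_checked(from_id: str, to_id: str, login):
    """Validate both IDs, then merge from_id into to_id."""
    if not (is_qid(from_id) and is_qid(to_id)):
        raise ValueError('Both IDs must look like item QIDs, e.g. "Q42"')
    if from_id == to_id:
        raise ValueError('An item cannot be merged into itself')

    # Imported late so the validation can be used without the package installed.
    from wikibaseintegrator import wbi_helpers

    # Assumption: the login object is passed with the keyword "login".
    return wbi_helpers.merge_items(from_id=from_id, to_id=to_id, login=login)
```

The checks run before any network request, so an obviously wrong call such as merging a property ID fails locally instead of reaching the instance.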

# Examples (in "normal" mode) #

In order to create a minimal bot based on WikibaseIntegrator, two things are required:

* A datatype object containing a value.
* An entity object (Item/Property/Lexeme/...) which takes the data, does the checks and performs write.

An optional Login object can be used to be authenticated on the Wikibase instance.

## Create a new Item ##

```python
from wikibaseintegrator import wbi_login, WikibaseIntegrator
from wikibaseintegrator.datatypes import ExternalID
from wikibaseintegrator.wbi_config import config as wbi_config

wbi_config['USER_AGENT'] = 'MyWikibaseBot/1.0 (https://www.wikidata.org/wiki/User:MyUsername)'

# login object
login_instance = wbi_login.OAuth2(consumer_token='<consumer_token>', consumer_secret='<consumer_secret>')

wbi = WikibaseIntegrator(login=login_instance)

# data type object, e.g. for a NCBI gene entrez ID
entrez_gene_id = ExternalID(value='<some_entrez_id>', prop_nr='P351')

# data goes into a list, because several data objects can be provided at once
data = [entrez_gene_id]

# Create a new item
item = wbi.item.new()

# Set an English label
item.labels.set(language='en', value='Newly created item')

# Set a French description
item.descriptions.set(language='fr', value='Une description un peu longue')

item.claims.add(data)
item.write()
```

## Modify an existing item ##

```python
from wikibaseintegrator import wbi_login, WikibaseIntegrator
from wikibaseintegrator.datatypes import ExternalID
from wikibaseintegrator.wbi_enums import ActionIfExists
from wikibaseintegrator.wbi_config import config as wbi_config

wbi_config['USER_AGENT'] = 'MyWikibaseBot/1.0 (https://www.wikidata.org/wiki/User:MyUsername)'

# login object
login_instance = wbi_login.OAuth2(consumer_token='<consumer_token>', consumer_secret='<consumer_secret>')

wbi = WikibaseIntegrator(login=login_instance)

# data type object, e.g. for a NCBI gene entrez ID
entrez_gene_id = ExternalID(value='<some_entrez_id>', prop_nr='P351')

# data goes into a list, because several data objects can be provided at once
data = [entrez_gene_id]

# Search and then edit an Item
item = wbi.item.get(entity_id='Q141806')

# Set an English label but don't modify it if there is already an entry
item.labels.set(language='en', value='An updated item', action_if_exists=ActionIfExists.KEEP)

# Set a French description and replace the existing one
item.descriptions.set(language='fr', value='Une description un peu longue', action_if_exists=ActionIfExists.REPLACE_ALL)

item.claims.add(data)
item.write()
```

## A bot for Mass Import ##

An enhanced version of the previous bot puts the creation of the datatype objects and of the item into a `for` loop,
which allows mass creation or modification of items.

```python
from wikibaseintegrator import WikibaseIntegrator, wbi_login
from wikibaseintegrator.datatypes import ExternalID, Item, String, Time
from wikibaseintegrator.wbi_config import config as wbi_config
from wikibaseintegrator.wbi_enums import WikibaseDatePrecision

wbi_config['USER_AGENT'] = 'MyWikibaseBot/1.0 (https://www.wikidata.org/wiki/User:MyUsername)'

# login object
login_instance = wbi_login.OAuth2(consumer_token='<consumer_token>', consumer_secret='<consumer_secret>')

# We have raw data, which should be written to Wikidata, namely two human NCBI Entrez Gene IDs mapped to two Ensembl transcript IDs
raw_data = {
    '50943': 'ENST00000376197',
    '1029': 'ENST00000498124'
}

wbi = WikibaseIntegrator(login=login_instance)

for entrez_id, ensembl in raw_data.items():
    # add some references
    references = [
        [
            Item(value='Q20641742', prop_nr='P248'),
            Time(time='+2020-02-08T00:00:00Z', prop_nr='P813', precision=WikibaseDatePrecision.DAY),
            ExternalID(value='1017', prop_nr='P351')
        ]
    ]

    # data type object
    entrez_gene_id = String(value=entrez_id, prop_nr='P351', references=references)
    ensembl_transcript_id = String(value=ensembl, prop_nr='P704', references=references)

    # data goes into a list, because several data objects can be provided at once
    data = [entrez_gene_id, ensembl_transcript_id]

    # Search for and then edit/create new item
    item = wbi.item.new()
    item.claims.add(data)
    item.write()
```
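
The `Time` reference in the example above uses a fixed timestamp string. When the retrieval date is computed at run time, it has to be formatted the same way. The helper below is a hypothetical sketch (`to_wikibase_time` is not part of the library) that produces the `+YYYY-MM-DDT00:00:00Z` form used above for day precision.

```python
from datetime import date


def to_wikibase_time(day: date) -> str:
    """Format a date as a day-precision timestamp, e.g. '+2020-02-08T00:00:00Z'."""
    # The leading "+" marks a date in the common era; the time part stays at midnight UTC.
    return f'+{day.year:04d}-{day.month:02d}-{day.day:02d}T00:00:00Z'


retrieved = to_wikibase_time(date(2020, 2, 8))
print(retrieved)
```

The result can be passed as the `time` argument of `Time`, together with `precision=WikibaseDatePrecision.DAY` as in the example above.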

# Examples (in "fast run" mode) #

In order to use the fast run mode, you need to know the property/value combination which determines the data corpus you
would like to operate on. E.g. for operating on human genes, you need to know
that [P351](https://www.wikidata.org/entity/P351) is the NCBI Entrez Gene ID and you also need to know that you are
dealing with humans, best represented by the [found in taxon property (P703)](https://www.wikidata.org/entity/P703) with
the value [Q15978631](https://www.wikidata.org/entity/Q15978631) for Homo sapiens.

IMPORTANT: In order for the fast run mode to work, the data you provide must contain at least one
unique value/ID that is present on only one Wikidata element, e.g. an NCBI Entrez Gene ID, UniProt ID, etc. Usually, these
are the same unique core properties that define your domain, e.g. genes, proteins, drugs or your custom
domains.

Below is the normal mode example from above, slightly modified to meet the requirements of the fast run mode. To
enable it, pass a base filter to `init_fastrun()`: a list of datatype objects holding the properties to filter on,
with the item QID as value where one applies. If the value is not a QID but a literal,
leave the value unset. For the above example, the filter looks like this:

```python
from wikibaseintegrator.datatypes import ExternalID, Item

fast_run_base_filter = [ExternalID(prop_nr='P351'), Item(prop_nr='P703', value='Q15978631')]
```

The full example:

```python
from wikibaseintegrator import WikibaseIntegrator, wbi_login
from wikibaseintegrator.datatypes import ExternalID, Item, String, Time
from wikibaseintegrator.wbi_enums import WikibaseDatePrecision

# login object
login = wbi_login.OAuth2(consumer_token='<consumer_token>', consumer_secret='<consumer_secret>')

fast_run_base_filter = [ExternalID(prop_nr='P351'), Item(prop_nr='P703', value='Q15978631')]
fast_run = True

# We have raw data, which should be written to Wikidata, namely two human NCBI Entrez Gene IDs mapped to two Ensembl transcript IDs
# You can iterate over any data source as long as you can map the values to Wikidata properties.
raw_data = {
    '50943': 'ENST00000376197',
    '1029': 'ENST00000498124'
}

for entrez_id, ensembl in raw_data.items():
    # add some references
    references = [
        [
            Item(value='Q20641742', prop_nr='P248')
        ],
        [
            Time(time='+2020-02-08T00:00:00Z', prop_nr='P813', precision=WikibaseDatePrecision.DAY),
            ExternalID(value='1017', prop_nr='P351')
        ]
    ]

    # data type object
    entrez_gene_id = String(value=entrez_id, prop_nr='P351', references=references)
    ensembl_transcript_id = String(value=ensembl, prop_nr='P704', references=references)

    # data goes into a list, because several data objects can be provided at once
    data = [entrez_gene_id, ensembl_transcript_id]

    # Search for and then edit/create new item
    wb_item = WikibaseIntegrator(login=login).item.new()
    wb_item.add_claims(claims=data)
    wb_item.init_fastrun(base_filter=fast_run_base_filter)
    wb_item.write()
```

Note: Fastrun mode checks for equality of property/value pairs, qualifiers (not including qualifier attributes), labels,
aliases and description, but it ignores references by default!
References can be checked in fast run mode by setting `use_refs` to `True`.

# Debugging #

You can enable debugging by adding this piece of code to the top of your project:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
```
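
Setting the root logger to `DEBUG` also enables debug output from every other library in the process. A narrower setup, sketched below, keeps the root logger at `INFO` and raises the level only for this package. It assumes that the package's loggers are named after its modules (`wikibaseintegrator.*`), which is the usual `logging.getLogger(__name__)` convention but is not confirmed here.

```python
import logging

# Keep the root logger at INFO so third-party libraries stay quiet.
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(name)s %(levelname)s %(message)s')

# Assumption: the package's loggers are named after its modules ("wikibaseintegrator.*").
logging.getLogger('wikibaseintegrator').setLevel(logging.DEBUG)
```

Child loggers such as `wikibaseintegrator.wbi_helpers` inherit the `DEBUG` level from their parent, so one call covers the whole package.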


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/LeMyst/WikibaseIntegrator",
    "name": "WikibaseIntegrator",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "wikibase,wikidata,mediawiki,sparql",
    "author": "Myst",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/b0/68/b808b346ffa7c902b0165f00e408dcd03ca5141cb9610df91ab42cc9485d/wikibaseintegrator-0.12.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python package for reading from and writing to a Wikibase instance",
    "version": "0.12.5",
    "project_urls": {
        "Bug Tracker": "https://github.com/LeMyst/WikibaseIntegrator/issues",
        "Changelog": "https://github.com/LeMyst/WikibaseIntegrator/releases",
        "Documentation": "https://wikibaseintegrator.readthedocs.io",
        "Homepage": "https://github.com/LeMyst/WikibaseIntegrator",
        "Repository": "https://github.com/LeMyst/WikibaseIntegrator"
    },
    "split_keywords": [
        "wikibase",
        "wikidata",
        "mediawiki",
        "sparql"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "97e44a48b49e9ae2c9c4ac09ca3c60311fe50031040dc9883c115f0a93c79eb9",
                "md5": "399f4ff787775d494327cd6c7010d706",
                "sha256": "ef0c05bd21e65bcf79d6b3dc158e8c671370f5f54f37285f417a7cdf120e94cc"
            },
            "downloads": -1,
            "filename": "wikibaseintegrator-0.12.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "399f4ff787775d494327cd6c7010d706",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 81818,
            "upload_time": "2024-01-07T15:38:51",
            "upload_time_iso_8601": "2024-01-07T15:38:51.036953Z",
            "url": "https://files.pythonhosted.org/packages/97/e4/4a48b49e9ae2c9c4ac09ca3c60311fe50031040dc9883c115f0a93c79eb9/wikibaseintegrator-0.12.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b068b808b346ffa7c902b0165f00e408dcd03ca5141cb9610df91ab42cc9485d",
                "md5": "01cb90e0f030057e54121fb9f03b5357",
                "sha256": "2f218224ffcd7a3b574d9c6a83657b249118cc0a60d70befb54b8b9536073dcf"
            },
            "downloads": -1,
            "filename": "wikibaseintegrator-0.12.5.tar.gz",
            "has_sig": false,
            "md5_digest": "01cb90e0f030057e54121fb9f03b5357",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 64629,
            "upload_time": "2024-01-07T15:38:53",
            "upload_time_iso_8601": "2024-01-07T15:38:53.142066Z",
            "url": "https://files.pythonhosted.org/packages/b0/68/b808b346ffa7c902b0165f00e408dcd03ca5141cb9610df91ab42cc9485d/wikibaseintegrator-0.12.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-07 15:38:53",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "LeMyst",
    "github_project": "WikibaseIntegrator",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "wikibaseintegrator"
}
        