python-epo-ops-client


- Name: python-epo-ops-client
- Version: 4.1.0
- Home page: https://github.com/ip-tools/python-epo-ops-client
- Summary: Python client for EPO OPS, the European Patent Office's Open Patent Services API.
- Upload time: 2024-01-25 03:07:51
- Maintainer: Andreas Motl
- Author: George Song
- Keywords: ops, epo, epo-ops, patent-data, patent-office, patent-data-api, european patent office, open patent services
# python-epo-ops-client

[![PyPI](https://img.shields.io/pypi/v/python-epo-ops-client)](https://pypi.org/project/python-epo-ops-client/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/python-epo-ops-client)](https://pypi.org/project/python-epo-ops-client/)
[![GHA](https://github.com/ip-tools/python-epo-ops-client/actions/workflows/main.yml/badge.svg)](https://github.com/ip-tools/python-epo-ops-client/actions/workflows/main.yml)
[![Codecov](https://codecov.io/gh/ip-tools/python-epo-ops-client/branch/main/graph/badge.svg)](https://codecov.io/gh/ip-tools/python-epo-ops-client)

python-epo-ops-client is an [Apache2 licensed][apache license] client library
for accessing the [European Patent Office][epo]'s ("EPO") [Open Patent
Services][ops] ("OPS") v.3.2 (based on [v 1.3.16 of the reference guide][ops guide]).

```python
import epo_ops

client = epo_ops.Client(key='abc', secret='xyz')  # Instantiate client
response = client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, priority
    input=epo_ops.models.Docdb('1000000', 'EP', 'A1'),  # original, docdb, epodoc
    endpoint='biblio',  # optional, defaults to biblio in case of published_data
    constituents=[],  # optional, list of constituents
)
```

---

## Features

`python-epo-ops-client` abstracts away the complexities of accessing EPO OPS:

- Format the requests properly
- Bubble up quota problems as proper HTTP errors
- Handle token authentication and renewals automatically
- Handle throttling properly
- Add optional caching to minimize impact on the OPS servers

There are two main layers to `python-epo-ops-client`: Client and Middleware.

### Client

The Client contains all the formatting and token handling logic and is what
you'll interact with mostly.

When you issue a request, the response is a [requests.Response][] object. If
`response.status_code != 200`, a `requests.HTTPError` exception is raised, and
it is your responsibility to handle it. The one case the client handles itself
is an expired access token: it intercepts the resulting HTTP 400 status and
renews the token automatically.

Note that the Client does not attempt to interpret the data supplied by OPS, so
it's your responsibility to parse the XML or JSON payload for your own purpose.

The following custom exceptions are raised when OPS quotas are exceeded. They
all live in the `epo_ops.exceptions` module and are subclasses of
`requests.HTTPError`, so they offer the same behaviors:

- IndividualQuotaPerHourExceeded
- RegisteredQuotaPerWeekExceeded

Again, it's up to you to parse the response and decide what to do.
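
Because the quota exceptions are subclasses of the generic HTTP error, the order of `except` clauses matters: the specific handlers have to come before the generic one. The sketch below illustrates that ordering only. To stay runnable without any dependency it defines local stand-in classes that mirror the hierarchy described above; in real code you would catch the classes from `epo_ops.exceptions` and `requests.HTTPError` instead. The `classify` helper and its return labels are hypothetical.

```python
class HTTPError(Exception):
    """Local stand-in for requests.HTTPError (assumption, for illustration)."""


class IndividualQuotaPerHourExceeded(HTTPError):
    """Local stand-in mirroring the class name in epo_ops.exceptions."""


class RegisteredQuotaPerWeekExceeded(HTTPError):
    """Local stand-in mirroring the class name in epo_ops.exceptions."""


def classify(call):
    """Run `call` and map its outcome to a (label, detail) pair."""
    try:
        return ("ok", call())
    except IndividualQuotaPerHourExceeded:
        # Specific subclasses first, otherwise the generic clause below
        # would swallow them.
        return ("hourly-quota-exceeded", None)
    except RegisteredQuotaPerWeekExceeded:
        return ("weekly-quota-exceeded", None)
    except HTTPError as error:
        # Any other non-200 response ends up here.
        return ("http-error", str(error))


def raiser(error):
    """Build a zero-argument callable that raises `error` when called."""
    def call():
        raise error
    return call


outcome_ok = classify(lambda: "<xml/>")
outcome_quota = classify(raiser(IndividualQuotaPerHourExceeded("quota")))
outcome_other = classify(raiser(HTTPError("404 Not Found")))
```

What to do in each branch (wait, back off, give up) is left to the application, as noted above.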

Currently the Client knows how to issue requests for the following services:

| Client method                                                                 | API end point         | throttle  |
| ----------------------------------------------------------------------------- | --------------------- | --------- |
| `family(reference_type, input, endpoint=None, constituents=None)`             | family                | inpadoc   |
| `image(path, range=1, extension='tiff')`                                      | published-data/images | images    |
| `number(reference_type, input, output_format)`                                | number-service        | other     |
| `published_data(reference_type, input, endpoint='biblio', constituents=None)` | published-data        | retrieval |
| `published_data_search(cql, range_begin=1, range_end=25, constituents=None)`  | published-data/search | search    |
| `register(reference_type, input, constituents=['biblio'])`                    | register              | other     |
| `register_search(cql, range_begin=1, range_end=25)`                           | register/search       | other     |

Bulk operations can be achieved by passing a list of valid models to the
`input` parameter of `published_data`.

See the [OPS guide][] or use the [Developer's Area][] for more information on
how to use each service.

Please submit pull requests for the following services by enhancing the
`epo_ops.api.Client` class:

- Legal service

### Middleware

All requests and responses are passed through each middleware object listed in
`client.middlewares`. Requests are processed in the order listed, and responses
are processed in the _reverse_ order.

Each middleware should subclass `middlewares.Middleware` and implement the
`process_request` and `process_response` methods.
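
The ordering rule can be sketched with a small, dependency-free model. Everything here is a simplified stand-in: the `Middleware` base class, the single-argument method signatures, the `Tagger` class, and the `send` function are hypothetical and are not the library's own definitions, so check `epo_ops.middlewares` for the real signatures before subclassing.

```python
class Middleware:
    """Simplified stand-in for a middleware base class (not the library's)."""

    def process_request(self, request):
        return request

    def process_response(self, response):
        return response


class Tagger(Middleware):
    """Records the order in which it sees requests and responses."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request):
        self.log.append(f"request:{self.name}")
        return request

    def process_response(self, response):
        self.log.append(f"response:{self.name}")
        return response


def send(request, middlewares, transport):
    """Pass a request down the chain, call the transport, then unwind."""
    for middleware in middlewares:  # listed order
        request = middleware.process_request(request)
    response = transport(request)
    for middleware in reversed(middlewares):  # reverse order
        response = middleware.process_response(response)
    return response


log = []
chain = [Tagger("first", log), Tagger("second", log)]
result = send("GET biblio", chain, transport=lambda req: f"200 for {req}")
```

After this runs, `log` holds the two request entries in listed order followed by the two response entries in reverse order, so the first middleware in the list is the last one to see the response.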

There are two middleware classes out of the box: Throttler and Dogpile.
Throttler is in charge of the OPS throttling rules and will delay requests
accordingly. Dogpile is an optional cache which will cache all HTTP status 200,
404, 405, and 413 responses.

By default, only the Throttler middleware is enabled. To enable caching, list
Dogpile as well:

```python
import epo_ops

middlewares = [
    epo_ops.middlewares.Dogpile(),
    epo_ops.middlewares.Throttler(),
]
client = epo_ops.Client(
    key='key',
    secret='secret',
    middlewares=middlewares,
)
```

You'll also need to install the caching dependency in your project, for example with `pip install dogpile.cache`.

_Note that caching middleware should be first in most cases._

#### Dogpile

Dogpile is based on (surprise) [dogpile.cache][]. By default it is instantiated
with a DBMBackend region with a timeout of 2 weeks.

Dogpile takes three optional instantiation parameters:

- `region`: You can pass whatever valid [dogpile.cache Region][] you want to
  backend the cache
- `kwargs_handlers`: A list of keyword argument handlers, which it will use to
  process the kwargs passed to the request object in order to extract elements
  for generating the cache key. Currently one handler is implemented (and
  instantiated by default) to make sure that the range request header is part of
  the cache key.
- `http_status_codes`: A list of HTTP status codes that you would like to have
  cached. By default 200, 404, 405, and 413 responses are cached.
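
To see why the range header has to be part of the cache key, consider two search requests that differ only in the requested range: without the header in the key, the second page would be served from the first page's cache entry. The following is a minimal sketch of that idea, not the library's implementation. The function name `make_cache_key`, its parameters, and the header name `Range` are assumptions made for illustration.

```python
import hashlib


def make_cache_key(url, data=None, headers=None, header_names=("Range",)):
    """Derive a cache key from the URL, the payload, and selected headers.

    Hypothetical helper: only headers named in `header_names` take part in
    the key, so unrelated headers (for example auth tokens) do not fragment
    the cache.
    """
    headers = headers or {}
    parts = [url, repr(data)]
    for name in header_names:
        parts.append(f"{name}={headers.get(name, '')}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


first_page = make_cache_key("published-data/search", "ti=plastic", {"Range": "1-25"})
second_page = make_cache_key("published-data/search", "ti=plastic", {"Range": "26-50"})
same_again = make_cache_key(
    "published-data/search",
    "ti=plastic",
    {"Range": "1-25", "Authorization": "Bearer abc"},
)
```

Here `first_page` and `second_page` differ, while `same_again` equals `first_page` because the extra header is ignored.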

**Note**: dogpile.cache is not installed by default. To use it, run `pip install dogpile.cache` in your project.

#### Throttler

Throttler contains all the logic for handling different throttling scenarios.
Since OPS throttling is based on a one-minute rolling window, we must persist
historical throttling data (at least for the past minute) in order to know what
the proper request frequency is. Each Throttler must be instantiated with a
Storage object.

##### Storage

The Storage object is responsible for:

1.  Knowing how to update the historical record with each request
    (`Storage.update()`), making sure to observe the one minute rolling window
    rule.
2.  Calculating how long to wait before issuing the next request
    (`Storage.delay_for()`).

Currently the only Storage backend provided is SQLite, but you can easily write
your own Storage backend (such as file, Redis, etc.). To use a custom Storage
type, just pass the Storage object when you're instantiating a Throttler object.
See `epo_ops.middlewares.throttle.storages.Storage` for more implementation
details.
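
As a rough illustration of the two responsibilities listed above, here is an in-memory rolling-window sketch. It is deliberately simplified and makes several assumptions: it uses a fixed limit of requests per minute rather than limits reported by OPS, it takes the current time as an explicit argument, and its class name and method signatures are hypothetical, so they will not match `epo_ops.middlewares.throttle.storages.Storage` as written.

```python
from collections import deque


class InMemoryWindow:
    """Hypothetical rolling-window record of request timestamps."""

    def __init__(self, limit_per_minute, window=60.0):
        self.limit = limit_per_minute
        self.window = window
        self.timestamps = deque()

    def _prune(self, now):
        # Drop entries that have left the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()

    def update(self, now):
        """Record that a request was issued at time `now` (seconds)."""
        self._prune(now)
        self.timestamps.append(now)

    def delay_for(self, now):
        """Return how many seconds to wait before the next request."""
        self._prune(now)
        if len(self.timestamps) < self.limit:
            return 0.0
        # A slot frees up once enough old entries expire to get back
        # under the limit; this entry is the one we must wait for.
        blocking = self.timestamps[len(self.timestamps) - self.limit]
        return max(0.0, blocking + self.window - now)


storage = InMemoryWindow(limit_per_minute=2)
storage.update(now=0.0)
storage.update(now=10.0)
wait_at_20 = storage.delay_for(now=20.0)  # window is full
wait_at_61 = storage.delay_for(now=61.0)  # first entry has expired
```

A persistent backend such as the provided SQLite one keeps this history across processes, which an in-memory structure like this cannot do.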

[apache license]: http://www.apache.org/licenses/LICENSE-2.0
[developer's area]: https://developers.epo.org/ops-v3-2/apis
[dogpile.cache region]: http://dogpilecache.readthedocs.org/en/latest/api.html#module-dogpile.cache.region
[dogpile.cache]: https://bitbucket.org/zzzeek/dogpile.cache
[epo]: http://epo.org
[ops guide]: https://link.epo.org/web/ops_v3.2_documentation_-_version_1.3.19_en.pdf
[ops]: https://www.epo.org/searching-for-patents/data/web-services/ops.html
[requests.response]: http://requests.readthedocs.org/en/latest/user/advanced/#request-and-response-objects

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ip-tools/python-epo-ops-client",
    "name": "python-epo-ops-client",
    "maintainer": "Andreas Motl",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "andreas.motl@ip-tools.org",
    "keywords": "ops,epo,epo-ops,patent-data,patent-office,patent-data-api,european patent office,open patent services",
    "author": "George Song",
    "author_email": "george@monozuku.com",
    "download_url": "https://files.pythonhosted.org/packages/ce/cf/2cc1f43d7d32e3c616178088ca4f7a6a724033f509f84545a913f4614011/python-epo-ops-client-4.1.0.tar.gz",
    "platform": null,
    "description": "# python-epo-ops-client\n\n[![PyPI](https://img.shields.io/pypi/v/python-epo-ops-client)](https://pypi.org/project/python-epo-ops-client/)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/python-epo-ops-client)](https://pypi.org/project/python-epo-ops-client/)\n[![GHA](https://github.com/ip-tools/python-epo-ops-client/actions/workflows/main.yml/badge.svg)](https://github.com/ip-tools/python-epo-ops-client/actions/workflows/main.yml)\n[![Codecov](https://codecov.io/gh/ip-tools/python-epo-ops-client/branch/main/graph/badge.svg)](https://codecov.io/gh/ip-tools/python-epo-ops-client)\n\npython-epo-ops-client is an [Apache2 licensed][apache license] client library\nfor accessing the [European Patent Office][epo]'s (\"EPO\") [Open Patent\nServices][ops] (\"OPS\") v.3.2 (based on [v 1.3.16 of the reference guide][ops guide]).\n\n```python\nimport epo_ops\n\nclient = epo_ops.Client(key='abc', secret='xyz')  # Instantiate client\nresponse = client.published_data(  # Retrieve bibliography data\n  reference_type = 'publication',  # publication, application, priority\n  input = epo_ops.models.Docdb('1000000', 'EP', 'A1'),  # original, docdb, epodoc\n  endpoint = 'biblio',  # optional, defaults to biblio in case of published_data\n  constituents = []  # optional, list of constituents\n)\n```\n\n---\n\n## Features\n\n`python-epo-ops-client` abstracts away the complexities of accessing EPO OPS:\n\n- Format the requests properly\n- Bubble up quota problems as proper HTTP errors\n- Handle token authentication and renewals automatically\n- Handle throttling properly\n- Add optional caching to minimize impact on the OPS servers\n\nThere are two main layers to `python-epo-ops-client`: Client and Middleware.\n\n### Client\n\nThe Client contains all the formatting and token handling logic and is what\nyou'll interact with mostly.\n\nWhen you issue a request, the response is a [requests.Response][] object. 
If\n`response.status_code != 200` then a `requests.HTTPError` exception will be\nraised \u2014 it's your responsibility to handle those exceptions if you want to. The\none case that's handled is when the access token has expired: in this case, the\nclient will automatically handle the HTTP 400 status and renew the token.\n\nNote that the Client does not attempt to interpret the data supplied by OPS, so\nit's your responsibility to parse the XML or JSON payload for your own purpose.\n\nThe following custom exceptions are raised for cases when OPS quotas are\nexceeded, they are all in the `epo_ops.exceptions` module and are subclasses of\n`requests.HTTPError`, and therefore offer the same behaviors:\n\n- IndividualQuotaPerHourExceeded\n- RegisteredQuotaPerWeekExceeded\n\nAgain, it's up to you to parse the response and decide what to do.\n\nCurrently the Client knows how to issue request for the following services:\n\n| Client method                                                                 | API end point         | throttle  |\n| ----------------------------------------------------------------------------- | --------------------- | --------- |\n| `family(reference_type, input, endpoint=None, constituents=None)`             | family                | inpadoc   |\n| `image(path, range=1, extension='tiff')`                                      | published-data/images | images    |\n| `number(reference_type, input, output_format)`                                | number-service        | other     |\n| `published_data(reference_type, input, endpoint='biblio', constituents=None)` | published-data        | retrieval |\n| `published_data_search(cql, range_begin=1, range_end=25, constituents=None)`  | published-data/search | search    |\n| `register(reference_type, input, constituents=['biblio'])`                    | register              | other     |\n| `register_search(cql, range_begin=1, range_end=25)`                           | register/search       | other     
|\n| `register_search(cql, range_begin=1, range_end=25)`                           | register/search       | other     |\n\nBulk operations can be achieved by passing a list of valid models to the\npublished_data input field.\n\nSee the [OPS guide][] or use the [Developer's Area][] for more information on\nhow to use each service.\n\nPlease submit pull requests for the following services by enhancing the\n`epo_ops.api.Client` class:\n\n- Legal service\n\n### Middleware\n\nAll requests and responses are passed through each middleware object listed in\n`client.middlewares`. Requests are processed in the order listed, and responses\nare processed in the _reverse_ order.\n\nEach middleware should subclass `middlewares.Middleware` and implement the\n`process_request` and `process_response` methods.\n\nThere are two middleware classes out of the box: Throttler and Dogpile.\nThrottler is in charge of the OPS throttling rules and will delay requests\naccordingly. Dogpile is an optional cache which will cache all HTTP status 200,\n404, 405, and 413 responses.\n\nBy default, only the Throttler middleware is enabled, if you want to enable\ncaching:\n\n```python\nimport epo_ops\n\nmiddlewares = [\n    epo_ops.middlewares.Dogpile(),\n    epo_ops.middlewares.Throttler(),\n]\nclient = epo_ops.Client(\n    key='key',\n    secret='secret',\n    middlewares=middlewares,\n)\n```\n\nYou'll also need to install caching dependencies in your projects, such as `pip install dogpile.cache`.\n\n_Note that caching middleware should be first in most cases._\n\n#### Dogpile\n\nDogpile is based on (surprise) [dogpile.cache][]. 
By default it is instantiated\nwith a DBMBackend region with timeout of 2 weeks.\n\nDogpile takes three optional instantiation parameters:\n\n- `region`: You can pass whatever valid [dogpile.cache Region][] you want to\n  backend the cache\n- `kwargs_handlers`: A list of keyword argument handlers, which it will use to\n  process the kwargs passed to the request object in order to extract elements\n  for generating the cache key. Currently one handler is implemented (and\n  instantiated by default) to make sure that the range request header is part of\n  the cache key.\n- `http_status_codes`: A list of HTTP status codes that you would like to have\n  cached. By default 200, 404, 405, and 413 responses are cached.\n\n**Note**: dogpile.cache is not installed by default, if you want to use it, `pip install dogpile.cache` in your project.\n\n#### Throttler\n\nThrottler contains all the logic for handling different throttling scenarios.\nSince OPS throttling is based on a one minute rolling window, we must persist\nhistorical (at least for the past minute) throtting data in order to know what\nthe proper request frequency is. Each Throttler must be instantiated with a\nStorage object.\n\n##### Storage\n\nThe Storage object is responsible for:\n\n1.  Knowing how to update the historical record with each request\n    (`Storage.update()`), making sure to observe the one minute rolling window\n    rule.\n2.  Calculating how long to wait before issuing the next request\n    (`Storage.delay_for()`).\n\nCurrently the only Storage backend provided is SQLite, but you can easily write\nyour own Storage backend (such as file, Redis, etc.). 
To use a custom Storage\ntype, just pass the Storage object when you're instantiating a Throttler object.\nSee `epo_ops.middlewares.throttle.storages.Storage` for more implementation\ndetails.\n\n[apache license]: http://www.apache.org/licenses/LICENSE-2.0\n[developer's area]: https://developers.epo.org/ops-v3-2/apis\n[dogpile.cache region]: http://dogpilecache.readthedocs.org/en/latest/api.html#module-dogpile.cache.region\n[dogpile.cache]: https://bitbucket.org/zzzeek/dogpile.cache\n[epo]: http://epo.org\n[ops guide]: https://link.epo.org/web/ops_v3.2_documentation_-_version_1.3.19_en.pdf\n[ops]: https://www.epo.org/searching-for-patents/data/web-services/ops.html\n[requests.response]: http://requests.readthedocs.org/en/latest/user/advanced/#request-and-response-objects\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "Python client for EPO OPS, the European Patent Office's Open Patent Services API.",
    "version": "4.1.0",
    "project_urls": {
        "Download": "https://pypi.org/project/python-epo-ops-client/#files",
        "Homepage": "https://github.com/ip-tools/python-epo-ops-client"
    },
    "split_keywords": [
        "ops",
        "epo",
        "epo-ops",
        "patent-data",
        "patent-office",
        "patent-data-api",
        "european patent office",
        "open patent services"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3eeea28416ed4f1a0bea0c655700d9ead92c8d9d491513914a7624805a4a5cd5",
                "md5": "a2d8c203e73947542edbb427e3b180fb",
                "sha256": "dcb436f1131bb09cb928100ca35ed40e896e7cdc0f11ac7824fb2e1f7d2b151f"
            },
            "downloads": -1,
            "filename": "python_epo_ops_client-4.1.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a2d8c203e73947542edbb427e3b180fb",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 20719,
            "upload_time": "2024-01-25T03:07:49",
            "upload_time_iso_8601": "2024-01-25T03:07:49.205984Z",
            "url": "https://files.pythonhosted.org/packages/3e/ee/a28416ed4f1a0bea0c655700d9ead92c8d9d491513914a7624805a4a5cd5/python_epo_ops_client-4.1.0-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cecf2cc1f43d7d32e3c616178088ca4f7a6a724033f509f84545a913f4614011",
                "md5": "93562b5baf1def68f96964421c273513",
                "sha256": "41f7e2fe950275922f0f5f7185627cc2fa796267b2b4da515a4e67b2eec53038"
            },
            "downloads": -1,
            "filename": "python-epo-ops-client-4.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "93562b5baf1def68f96964421c273513",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 28747,
            "upload_time": "2024-01-25T03:07:51",
            "upload_time_iso_8601": "2024-01-25T03:07:51.310681Z",
            "url": "https://files.pythonhosted.org/packages/ce/cf/2cc1f43d7d32e3c616178088ca4f7a6a724033f509f84545a913f4614011/python-epo-ops-client-4.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-25 03:07:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ip-tools",
    "github_project": "python-epo-ops-client",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "python-epo-ops-client"
}
        