nlp-kafka-rest-api

Name	nlp-kafka-rest-api
Version	1.0.0
home_page
Summary	This package is a wrapper for REST API requests to Kafka Proxy.
upload_time	2023-03-22 15:42:23
maintainer
docs_url	None
author
requires_python	>=3.6
license	Copyright (c) 2023 Merck KGaA, Darmstadt, Germany Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords	nlp kafka rest proxy
VCS
bugtrack_url
requirements	No requirements were recorded.

# Kafka REST API

## Installation
If you are installing from a remote or public repository, run: `pip install nlp-kafka-rest-api`.
If you are installing from a *wheel* file in the local directory, run: `pip install {filename}.whl`, 
and replace `{filename}` with the name of the *.whl* file.

## Getting Started
Interactions with a Kafka cluster follow a Producer/Consumer paradigm. Accordingly, the package provides two classes
for publishing and subscribing to Kafka topics: **Producer** and **Consumer**.

## Configuration
When using this package to access the Merck API Gateway, you can define the following environment variables:
- **KAFKA_REST_API_URL**: Target Kafka REST API URL. Alternatively, you can pass the argument `kafka_rest_api_url`
to the Producer and Consumer constructors.
- **X_API_KEY**: The authorization token used to validate API requests to the API Gateway. Alternatively, you can pass a dictionary
with the format `{"x-api-key": "your-api-key", "other-header-key": "other-header-value", ...}` to the keyword argument `auth_headers`
in both the Producer and the Consumer constructors.
- **TOPIC_ID**: MSK Topic ID assigned to the user. Alternatively, you can pass a string with the topic ID
to the keyword argument `topic_id` in both the Producer and the Consumer constructors.
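
The two configuration routes described above can be combined in a small helper. The sketch below is not part of the package: `resolve_settings` is a hypothetical function, the precedence it applies (explicit arguments first, environment variables second) is an assumption, and the URL, key, and topic values are placeholders.

```python
import os


def resolve_settings(kafka_rest_api_url=None, auth_headers=None, topic_id=None):
    """Hypothetical helper: prefer explicit arguments, else read the environment."""
    if auth_headers is None and "X_API_KEY" in os.environ:
        auth_headers = {"x-api-key": os.environ["X_API_KEY"]}
    return {
        "kafka_rest_api_url": kafka_rest_api_url or os.environ.get("KAFKA_REST_API_URL"),
        "auth_headers": auth_headers,
        "topic_id": topic_id or os.environ.get("TOPIC_ID"),
    }


# Placeholder values for illustration only.
os.environ["KAFKA_REST_API_URL"] = "https://example.invalid/kafka"
os.environ["X_API_KEY"] = "your-api-key"
os.environ["TOPIC_ID"] = "your-topic-id"

# An explicit argument overrides the corresponding environment variable.
settings = resolve_settings(topic_id="another-topic-id")
print(settings)
```

With the package installed, the resulting dictionary could presumably be unpacked into the constructors, for example `Producer(**settings)` or `Consumer(**settings)`, since its keys match the argument names listed above.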

### Producer
#### Produce JSON data
The snippet below uses the topic *pke* as an example. The producer pattern is as follows:
```python
from kafka_rest import Producer

# messages_to_pke_endpoint: a list of dicts, one payload per message.
producer = Producer()
new_keys = producer.produce(messages_to_pke_endpoint, "pke")
```
Note that each message in the list must match the payload expected by the target endpoint,
i.e. the JSON object you would otherwise send to that endpoint.

For example, a valid message to the *pke* endpoint is:
```python
{
  "text": "Genome engineering is a powerful tool.", 
  "algorithm": "TopicRank", 
  "n_best": 10
}
```
To find out which message format each endpoint expects, consult the NLP API documentation.
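
Because each message stands in for a JSON object, it can help to confirm that a batch is JSON-serializable before producing it. The following is a minimal sketch using only the standard library; `check_messages` is a hypothetical helper, not a function of this package, and it does not validate endpoint-specific fields.

```python
import json


def check_messages(messages):
    """Hypothetical pre-flight check: each message must be a JSON-serializable dict."""
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            raise TypeError(f"message {index} is not a dict")
        json.dumps(message)  # raises TypeError if a value cannot be encoded
    return messages


messages_to_pke_endpoint = check_messages([
    {"text": "Genome engineering is a powerful tool.", "algorithm": "TopicRank", "n_best": 10},
])
```

The checked list can then be passed to `producer.produce` as in the pattern above.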

The `Producer.produce` method automatically generates a unique key (UUID) for each message.
You can also provide your own unique keys by passing a list of strings to the `keys` argument.
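
If you prefer to generate the keys yourself, one option is to build them with the standard `uuid` module, as sketched below. The choice of `uuid4` is an assumption: the package does not state which UUID variant it uses internally, and any list of unique strings should serve the same purpose.

```python
import uuid

messages = [
    {"text": "Genome engineering is a powerful tool.", "algorithm": "TopicRank", "n_best": 10},
    {"text": "Kafka decouples producers from consumers.", "algorithm": "TopicRank", "n_best": 5},
]

# One random UUID string per message.
keys = [str(uuid.uuid4()) for _ in messages]

# With the package installed and configured, the keys would be passed as:
# new_keys = producer.produce(messages, "pke", keys=keys)
```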

#### Produce files
To produce files as inputs to a given endpoint, use the method `produce_files`. The required arguments
for this method are:
- `files`: a list of absolute or relative paths to the input files;
- `endpoint`: the target endpoint (*pdf2text*, for example).
```python
from kafka_rest import Producer
producer = Producer()
new_keys = producer.produce_files(files=list_of_files, endpoint="pdf2text")
```
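
The snippet above leaves `list_of_files` undefined. One way to build it, sketched here with the standard `pathlib` module, is to collect every matching file in a directory. The helper name `list_input_files` and the directory name `files` are illustrative assumptions.

```python
from pathlib import Path


def list_input_files(directory, pattern="*.pdf"):
    """Hypothetical helper: return the sorted paths (as strings) of matching files."""
    return sorted(str(path) for path in Path(directory).glob(pattern) if path.is_file())


# Yields an empty list if the directory does not exist or holds no PDF files.
list_of_files = list_input_files("files")
print(list_of_files)
```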

### Consumer
#### Pattern 1 - Iterator
Arguably, the most useful way of consuming messages with the Consumer class is as follows:
```python
from kafka_rest import Consumer
with Consumer() as consumer:
    for data, remaining_keys in consumer.consume(keys):
        print((data, remaining_keys))
```
#### Pattern 2 - Step-by-step instantiation (Chain)
You can also instantiate the Consumer step by step to get finer control over each request it sends to the NLP API:
```python
from kafka_rest import Consumer
consumer = Consumer()
consumer.create()
consumer.subscribe()
# or consumer.create().subscribe().consume(keys)

for data, remaining_keys in consumer.consume(keys):
    print((data, remaining_keys))

consumer.delete()
```
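
In the step-by-step pattern, `consumer.delete()` is skipped if an exception is raised while consuming. A `try`/`finally` block is one way to guard against that. The sketch below is an illustration under assumptions: `consume_with_cleanup` is a hypothetical helper, and `StubConsumer` is a stand-in with made-up data that only mimics the method names shown above so that the example runs without the package.

```python
def consume_with_cleanup(consumer, keys):
    """Run the step-by-step pattern and always delete the consumer afterwards."""
    consumer.create()
    try:
        consumer.subscribe()
        return list(consumer.consume(keys))
    finally:
        consumer.delete()


class StubConsumer:
    """Stand-in for illustration only; it is not the package's Consumer."""

    def __init__(self):
        self.deleted = False

    def create(self):
        return self

    def subscribe(self):
        return self

    def consume(self, keys):
        remaining = list(keys)
        while remaining:
            key = remaining.pop(0)
            yield {key: "result"}, list(remaining)  # made-up data shape

    def delete(self):
        self.deleted = True


stub = StubConsumer()
results = consume_with_cleanup(stub, ["k1", "k2"])
print(results, stub.deleted)
```

With the real package, a `Consumer()` instance would be passed in place of the stub.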
#### Pattern 3 - Consume all
Alternatively, the Consumer can return only once all keys are exhausted, i.e. when all messages have been consumed.
To do so, use the `consume_all` method.
```python
from kafka_rest import Consumer
with Consumer() as consumer:
    data = consumer.consume_all(keys)
```

#### Full example
Produce files as inputs and consume outputs.

```python
from kafka_rest import Producer, Consumer

producer = Producer()
new_keys = producer.produce_files(["files/file1.pdf", "files/file2.pdf"], "pdf2text")

with Consumer() as consumer:
    for data, remaining_keys in consumer.consume(new_keys):
        if data:
            print(f"Data: {data} | Remaining Keys: {remaining_keys}")
```
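
The `if data:` check in the example suggests that some iterations may carry no data yet. Under that assumption, the loop can be wrapped in a helper that keeps only the non-empty results. `collect_results` is a hypothetical function, and the hand-written pairs below merely stand in for what `consumer.consume(new_keys)` might yield; the actual shape of `data` depends on the endpoint.

```python
def collect_results(pairs):
    """Gather non-empty data items from an iterable of (data, remaining_keys) pairs."""
    collected = []
    remaining = None
    for data, remaining in pairs:
        if data:  # skip iterations that returned nothing yet
            collected.append(data)
    return collected, remaining


# Hand-written pairs standing in for consumer.consume(new_keys).
fake_pairs = [
    (None, ["key-1", "key-2"]),
    ({"key-1": "text of file1"}, ["key-2"]),
    ({"key-2": "text of file2"}, []),
]

collected, remaining = collect_results(fake_pairs)
print(len(collected), remaining)
```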

For more snippets, please check the example in the file `kafka_rest/snippets` in this repo.

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "nlp-kafka-rest-api",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "nlp,kafka,rest,proxy",
    "author": "",
    "author_email": "",
    "download_url": "",
    "platform": null,
    "bugtrack_url": null,
    "license": "Copyright (c) 2023 Merck KGaA, Darmstadt, Germany  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "This package is a wrapper for REST API requests to Kafka Proxy.",
    "version": "1.0.0",
    "split_keywords": [
        "nlp",
        "kafka",
        "rest",
        "proxy"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d56429860f76471739aaf1039a84b6e8dc6ec236513adfc7e508f9501a1da4a2",
                "md5": "7f4d54987214e2cdc4bcf7a42d533679",
                "sha256": "6f1db36b36a5bc103f83f4f48bc78f320405b3a35137a57baea43d228dd0bf8d"
            },
            "downloads": -1,
            "filename": "nlp_kafka_rest_api-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7f4d54987214e2cdc4bcf7a42d533679",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 9499,
            "upload_time": "2023-03-22T15:42:23",
            "upload_time_iso_8601": "2023-03-22T15:42:23.832754Z",
            "url": "https://files.pythonhosted.org/packages/d5/64/29860f76471739aaf1039a84b6e8dc6ec236513adfc7e508f9501a1da4a2/nlp_kafka_rest_api-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-22 15:42:23",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "nlp-kafka-rest-api"
}