aws-json-dataset


- Name: aws-json-dataset
- Version: 0.1.0
- Summary: Send JSON datasets to various AWS services.
- Upload time: 2024-02-03 22:56:45
- Requires Python: >=3.10
- License: MIT License, Copyright (c) 2023 Gregory Lindsey
- Keywords: aws, json, dataset, s3, kinesis, firehose, sqs, data, streaming, data engineering
- Requirements: No requirements were recorded.

# aws-json-dataset
[![Build](https://github.com/chrisammon3000/aws-json-dataset/actions/workflows/run_tests.yml/badge.svg?style=for-the-badge)](https://github.com/chrisammon3000/aws-json-dataset/actions/workflows/run_tests.yml) [![codecov](https://codecov.io/github/chrisammon3000/aws-json-dataset/branch/main/graph/badge.svg?token=QSZLP51RWJ)](https://codecov.io/github/chrisammon3000/aws-json-dataset?style=for-the-badge) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/release/python-3100/)

<!-- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT) [![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg?style=for-the-badge)](https://www.python.org/downloads/release/python-3100/) -->

Lightweight and simple Python package to quickly send batches of JSON data to various AWS services.

## Description
This library provides a quick, easy way to send JSON data to the following AWS services:
- SQS
- SNS
- Kinesis Firehose
- Kinesis Data Streams (coming soon)

JSON is an extremely common format, and each AWS service has its own API with different requirements for how to send data.

This library includes functionality for:
- Automatically batching API calls to SNS, SQS and Kinesis Firehose
- Determining which services are available based on record size
- Converting records to Base64 for Kinesis streams
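
The three features above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names `record_size`, `split_by_size`, `batches`, and `encode_record` are hypothetical, and the limits are parameters you would set from the current AWS quotas for the target service (for example, SQS `SendMessageBatch` accepts at most 10 entries per call).

```python
import base64
import json
from itertools import islice
from typing import Any, Iterable, Iterator


def record_size(record: dict[str, Any]) -> int:
    """Size in bytes of a record once serialized as UTF-8 JSON."""
    return len(json.dumps(record).encode("utf-8"))


def split_by_size(records: Iterable[dict], max_bytes: int) -> tuple[list, list]:
    """Separate records that fit within max_bytes from those that do not."""
    fits, oversized = [], []
    for record in records:
        (fits if record_size(record) <= max_bytes else oversized).append(record)
    return fits, oversized


def batches(records: Iterable[dict], batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of at most batch_size records."""
    iterator = iter(records)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def encode_record(record: dict[str, Any]) -> str:
    """Serialize a record to JSON and return it Base64-encoded as ASCII text."""
    return base64.b64encode(json.dumps(record).encode("utf-8")).decode("ascii")


# Hypothetical limits chosen for illustration only.
data = [{"id": idx, "data": "<data>"} for idx in range(25)]
fits, oversized = split_by_size(data, max_bytes=1024)
sizes = [len(batch) for batch in batches(fits, batch_size=10)]
print(sizes)  # [10, 10, 5]
```

Keeping the limits as arguments rather than constants means the same helpers can serve services with different quotas.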

### Roadmap
- [ ] Support for Kinesis Data Streams
- [ ] Support for DynamoDB inserts, updates and deletes
- [ ] Support for S3, including gzip compression and JSON lines format
- [ ] Support for FIFO SQS queues
- [ ] Support for SNS topics

## Quickstart
Set up your [AWS credentials](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html), then export the profile and region as environment variables.
```bash
export AWS_PROFILE=<profile>
export AWS_REGION=<region>
```


Install the library using pip.
```bash
pip install aws-json-dataset
```

Send JSON data to various AWS services.
```python
from awsjsondataset import AwsJsonDataset

# Create a list of JSON-serializable records
data = [ {"id": idx, "data": "<data>"} for idx in range(100) ]

# Wrap using AwsJsonDataset
dataset = AwsJsonDataset(data=data)

# Send to SQS queue
dataset.sqs("<sqs_queue_url>").send_messages()

# Send to SNS topic
dataset.sns("<sns_topic_arn>").publish_messages()

# Send to Kinesis Firehose stream
dataset.firehose("<delivery_stream_name>").put_records()
```
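
For comparison, sending the same records with boto3 directly means building each batch request by hand. The sketch below only constructs the `Entries` payload that boto3's SQS `send_message_batch` call expects (a list of dicts with `Id` and `MessageBody`). The helper name `to_sqs_entries` is hypothetical, and this is not necessarily how `AwsJsonDataset` works internally.

```python
import json
from typing import Any


def to_sqs_entries(records: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Build the Entries list for one SQS SendMessageBatch call.

    Ids only need to be unique within a single batch request,
    so the position of the record in the batch is used here.
    """
    return [
        {"Id": str(position), "MessageBody": json.dumps(record)}
        for position, record in enumerate(records)
    ]


# One batch of ten records, matching the Quickstart data shape.
batch = [{"id": idx, "data": "<data>"} for idx in range(10)]
entries = to_sqs_entries(batch)

# With boto3 installed and credentials configured, the batch could be sent as:
#   import boto3
#   boto3.client("sqs").send_message_batch(
#       QueueUrl="<sqs_queue_url>", Entries=entries
#   )
```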

## Local Development
Follow these steps to set up a local development environment.

### Prerequisites
* AWS credentials
* Python 3.10

### Creating a Python Virtual Environment
When developing locally, create a Python virtual environment to manage dependencies:
```bash
python3.10 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install ".[dev,test]"
```

### Environment Variables
Create a `.env` file in the project root.
```bash
AWS_REGION=<region>
```
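
The README does not say how the `.env` file gets loaded. One option, sketched here with only the standard library, is a small parser that reads `KEY=VALUE` lines into `os.environ`. The function name `load_env_file` is hypothetical, and a dedicated package such as python-dotenv handles more edge cases (quoting rules, interpolation).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env", override: bool = False) -> dict[str, str]:
    """Export KEY=VALUE lines from path to os.environ.

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    Existing environment variables win unless override is True.
    Returns the effective value of every key found in the file.
    """
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes, if any
        if not key:
            continue
        if override or key not in os.environ:
            os.environ[key] = value
        loaded[key] = os.environ[key]
    return loaded


if __name__ == "__main__":
    print(load_env_file())
```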

***Important:*** *Always use a `.env` file, AWS SSM Parameter Store, or Secrets Manager for sensitive values such as credentials and API keys. Never hard-code them, even during development. If credentials are accidentally exposed, AWS may quarantine them, which can disrupt access to the account.* &rarr; ***Make sure that `.env` is listed in `.gitignore`.***

### AWS Credentials
Valid AWS credentials must be available to the AWS CLI and SAM CLI. The easiest way to provide them is to run `aws configure`, or to add them to `~/.aws/credentials` and export the `AWS_PROFILE` variable to the environment.

For more information visit the documentation page:
[Configuration and credential file settings](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)

## Unit Tests
Follow the steps above to create a Python virtual environment. Run tests with the following command.
```bash
coverage run -m pytest
```

## Troubleshooting
* Check your AWS credentials in `~/.aws/credentials`
* Check that the environment variables are available to the services that need them
* Check that the correct environment or interpreter is being used for Python
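
The checks above can be run in one pass with a short diagnostic script. This is a sketch using only the standard library; the function name `environment_report` is hypothetical. A failed check is a hint, not necessarily an error: for instance, `AWS_PROFILE` may be unset if you rely on the default profile.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def environment_report(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[str] = None,
) -> dict[str, bool]:
    """Return a pass/fail flag for each troubleshooting check."""
    env = os.environ if env is None else env
    home_dir = Path.home() if home is None else Path(home)
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        "AWS_REGION set": bool(env.get("AWS_REGION")),
        "AWS_PROFILE set": bool(env.get("AWS_PROFILE")),
        "credentials file exists": (home_dir / ".aws" / "credentials").is_file(),
    }


if __name__ == "__main__":
    for check, passed in environment_report().items():
        print(f"{'ok  ' if passed else 'FAIL'}  {check}")
    # Shows which interpreter (and therefore which virtual environment) is active.
    print("interpreter:", sys.executable)
```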

<!-- ## References & Links -->

## Authors
**Primary Contact:** Gregory Christopher Lindsey (@chrisammon3000)

## License
This library is licensed under the MIT License. See the LICENSE file.

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "aws-json-dataset",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "aws,json,dataset,s3,kinesis,firehose,sqs,data,streaming,data engineering",
    "author": "",
    "author_email": "Gregory Lindsey <gclindsey@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/ee/9a/ec3b20e8429b5a52f0b54829318b040a27512f04aabf5203152843b79efd/aws-json-dataset-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 Gregory Lindsey  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Send JSON datasets to various AWS services.",
    "version": "0.1.0",
    "project_urls": null,
    "split_keywords": [
        "aws",
        "json",
        "dataset",
        "s3",
        "kinesis",
        "firehose",
        "sqs",
        "data",
        "streaming",
        "data engineering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "efcb8fcd2319790c01947f947d98fc6714af9faace2dc453844281c042c9c13b",
                "md5": "2ac8459e1ec35f9c8b55cd0543bee2f6",
                "sha256": "5c7eb20585b2122485b675e2a845130efbd843b4ad2114a192bc4110b59e0c9b"
            },
            "downloads": -1,
            "filename": "aws_json_dataset-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2ac8459e1ec35f9c8b55cd0543bee2f6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 12782,
            "upload_time": "2024-02-03T22:56:43",
            "upload_time_iso_8601": "2024-02-03T22:56:43.555319Z",
            "url": "https://files.pythonhosted.org/packages/ef/cb/8fcd2319790c01947f947d98fc6714af9faace2dc453844281c042c9c13b/aws_json_dataset-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ee9aec3b20e8429b5a52f0b54829318b040a27512f04aabf5203152843b79efd",
                "md5": "bbe8d7d8e04044e0aec2bec49e4eaeb2",
                "sha256": "17826833609c0b978a6101be1682c17115992400f36cebd2c6d0365a1918492b"
            },
            "downloads": -1,
            "filename": "aws-json-dataset-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bbe8d7d8e04044e0aec2bec49e4eaeb2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 50383,
            "upload_time": "2024-02-03T22:56:45",
            "upload_time_iso_8601": "2024-02-03T22:56:45.447667Z",
            "url": "https://files.pythonhosted.org/packages/ee/9a/ec3b20e8429b5a52f0b54829318b040a27512f04aabf5203152843b79efd/aws-json-dataset-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-03 22:56:45",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "aws-json-dataset"
}
        