| Field | Value |
| --- | --- |
| Name | aws-dynamodb-parallel-scan |
| Version | 1.1.0 |
| Summary | Amazon DynamoDB Parallel Scan Paginator for boto3. |
| upload_time | 2024-10-19 07:50:34 |
| requires_python | >=3.8 |
| requirements | No requirements were recorded. |
# aws-dynamodb-parallel-scan
Amazon DynamoDB parallel scan paginator for boto3.
## Installation
Install from PyPI with pip:
```
pip install aws-dynamodb-parallel-scan
```
or with the package manager of your choice.
## Usage
The library is a drop-in replacement for the [boto3 DynamoDB Scan Paginator](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Paginator.Scan). Example:
```python
import aws_dynamodb_parallel_scan
import boto3
# Create DynamoDB client to use for scan operations
client = boto3.resource("dynamodb").meta.client
# Create the parallel scan paginator with the client
paginator = aws_dynamodb_parallel_scan.get_paginator(client)
# Scan "mytable" in five segments. Each segment is scanned in parallel.
for page in paginator.paginate(TableName="mytable", TotalSegments=5):
    items = page.get("Items", [])
```
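The loop above overwrites `items` on every page. A minimal sketch of collecting the items from all pages into one list could look like the following. The helper names `collect_items` and `scan_table` are hypothetical and not part of the library; only `get_paginator()` and `paginate()` come from the example above. The boto3 and library imports are deferred into `scan_table()` so that the page-handling helper can be used and tested without AWS access.

```python
from typing import Any, Dict, Iterable, List


def collect_items(pages: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Concatenate the "Items" lists of Scan responses (hypothetical helper)."""
    items: List[Dict[str, Any]] = []
    for page in pages:
        # A page without matching items may omit "Items" or carry an empty list.
        items.extend(page.get("Items", []))
    return items


def scan_table(table_name: str, total_segments: int = 5, **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Scan a whole table in parallel and return every item (hypothetical helper)."""
    # Imported here so collect_items() stays usable without boto3 installed.
    import aws_dynamodb_parallel_scan
    import boto3

    client = boto3.resource("dynamodb").meta.client
    paginator = aws_dynamodb_parallel_scan.get_paginator(client)
    pages = paginator.paginate(
        TableName=table_name, TotalSegments=total_segments, **scan_kwargs
    )
    return collect_items(pages)
```

Calling `scan_table("mytable", total_segments=5)` would then return a single list of items. Note that this holds the entire table in memory, so it is only suitable for tables that fit comfortably in RAM.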
Notes:
* `paginate()` accepts the same arguments as the boto3 `DynamoDB.Client.scan()` method. Arguments
are passed to `DynamoDB.Client.scan()` as-is.
* `paginate()` uses the value of the `TotalSegments` argument as the parallelism level. Each segment
is scanned in parallel in a separate thread.
* `paginate()` yields DynamoDB Scan API responses in the same format as the boto3
`DynamoDB.Client.scan()` method.
See boto3 [DynamoDB.Client.scan() documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.scan)
for details on supported arguments and the response format.
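To illustrate what "each segment is scanned in a separate thread" means, here is a simplified sketch of a thread-based parallel scan. It is not the library's implementation. It only uses the documented Scan parameters `Segment`, `TotalSegments` and `ExclusiveStartKey` and the `LastEvaluatedKey` response field. The function names `scan_segment` and `parallel_scan` are hypothetical, and unlike a real paginator this sketch buffers all pages of a segment before yielding them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Iterator, List


def scan_segment(client: Any, segment: int, total_segments: int, **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Scan one segment to completion and return its pages (hypothetical helper)."""
    kwargs = dict(scan_kwargs, Segment=segment, TotalSegments=total_segments)
    pages: List[Dict[str, Any]] = []
    while True:
        page = client.scan(**kwargs)
        pages.append(page)
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            # No LastEvaluatedKey means this segment has been read completely.
            return pages
        # Continue this segment from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(client: Any, total_segments: int, **scan_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, segment, total_segments, **scan_kwargs)
            for segment in range(total_segments)
        ]
        # Yield pages segment by segment once each worker has finished.
        for future in futures:
            yield from future.result()
```

With a real boto3 client this would be called as `parallel_scan(client, 5, TableName="mytable")`. One worker per segment keeps the sketch simple, but it also means a large `TotalSegments` value starts that many threads and consumes read capacity accordingly.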
## CLI
This package also provides a CLI tool (`aws-dynamodb-parallel-scan`) to scan a DynamoDB table
with parallel scan. The tool supports all non-deprecated arguments of the DynamoDB Scan API. Execute
`aws-dynamodb-parallel-scan -h` for details.
Here are some examples:
```bash
# Scan "mytable" sequentially
$ aws-dynamodb-parallel-scan --table-name mytable
{"Items": [...], "Count": 10256, "ScannedCount": 10256, "ResponseMetadata": {}}
{"Items": [...], "Count": 12, "ScannedCount": 12, "ResponseMetadata": {}}
# Scan "mytable" in parallel (5 parallel segments)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5
{"Items": [...], "Count":32, "ScannedCount":32, "ResponseMetadata": {}}
{"Items": [...], "Count":47, "ScannedCount":47, "ResponseMetadata": {}}
{"Items": [...], "Count":52, "ScannedCount":52, "ResponseMetadata": {}}
{"Items": [...], "Count":34, "ScannedCount":34, "ResponseMetadata": {}}
{"Items": [...], "Count":40, "ScannedCount":40, "ResponseMetadata": {}}
# Scan "mytable" in parallel and return items, not Scan API responses (--output-items flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items
{"pk": {"S": "item1"}, "quantity": {"N": "99"}}
{"pk": {"S": "item24"}, "quantity": {"N": "25"}}
...
# Scan "mytable" in parallel, return items with native types, not DynamoDB types (--use-document-client flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items --use-document-client
{"pk": "item1", "quantity": 99}
{"pk": "item24", "quantity": 25}
...
# Scan "mytable" with a filter expression, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": {"N": "5"}}' \
--output-items
{"pk": {"S": "item142"}, "quantity": {"N": "4"}}
{"pk": {"S": "item874"}, "quantity": {"N": "1"}}
# Scan "mytable" with a filter expression using native types, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": 5}' \
--use-document-client --output-items
{"pk": "item142", "quantity": 4}
{"pk": "item874", "quantity": 1}
```
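The examples above print one JSON document per line, so the output can be piped into another program and parsed line by line. The following sketch assumes that format holds for the `--output-items` output. The helper name `parse_items` and the script name `consume.py` are hypothetical.

```python
import json
import sys
from typing import Any, Dict, Iterable, Iterator


def parse_items(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse newline-delimited JSON, skipping blank lines (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


if __name__ == "__main__":
    # Example: sum the "quantity" attribute of items printed with native types.
    total = sum(item.get("quantity", 0) for item in parse_items(sys.stdin))
    print(total)
```

It could be combined with the CLI like this: `aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 --output-items --use-document-client | python consume.py`. Without `--use-document-client` the attributes arrive in DynamoDB format (for example `{"N": "99"}`) and would need to be converted before summing.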
## Development
Requires Python 3 and uv. Useful commands:
```bash
# Run tests (integration test requires rights to create, delete and use DynamoDB tables)
make test
# Run linters
make -k lint
# Format code
make format
```
## License
MIT
## Credits
* Alex Chan, [Getting every item from a DynamoDB table with Python](https://alexwlchan.net/2020/05/getting-every-item-from-a-dynamodb-table-with-python/)
Raw data
{
"_id": null,
"home_page": null,
"name": "aws-dynamodb-parallel-scan",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Sami Jaktholm <sjakthol@outlook.com>",
"download_url": "https://files.pythonhosted.org/packages/d7/08/d6c59e0811afe382b6c0d4ec6c139d9ce909cbbc694ac3d8f438da8270bf/aws_dynamodb_parallel_scan-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Amazon DynamoDB Parallel Scan Paginator for boto3.",
"version": "1.1.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "78523baa49cfe1fbb603fcf5e3d3ff69a48d33b5ab81c45dfa37c57a835e1458",
"md5": "b5df03f51a363346383b9e0f80ea660d",
"sha256": "ae5e08c84b76ab7822bbc05beabbdb0f56f5269fc9cd182863299c4b4b49919a"
},
"downloads": -1,
"filename": "aws_dynamodb_parallel_scan-1.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b5df03f51a363346383b9e0f80ea660d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 6144,
"upload_time": "2024-10-19T07:50:32",
"upload_time_iso_8601": "2024-10-19T07:50:32.929496Z",
"url": "https://files.pythonhosted.org/packages/78/52/3baa49cfe1fbb603fcf5e3d3ff69a48d33b5ab81c45dfa37c57a835e1458/aws_dynamodb_parallel_scan-1.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d708d6c59e0811afe382b6c0d4ec6c139d9ce909cbbc694ac3d8f438da8270bf",
"md5": "c3699e9c6a6a3dcb94c2fdd70b00abe0",
"sha256": "5113179aeb2bb476864ec98cd716f32950085d6d00c29a0608f8d3318106102e"
},
"downloads": -1,
"filename": "aws_dynamodb_parallel_scan-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "c3699e9c6a6a3dcb94c2fdd70b00abe0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 69586,
"upload_time": "2024-10-19T07:50:34",
"upload_time_iso_8601": "2024-10-19T07:50:34.806341Z",
"url": "https://files.pythonhosted.org/packages/d7/08/d6c59e0811afe382b6c0d4ec6c139d9ce909cbbc694ac3d8f438da8270bf/aws_dynamodb_parallel_scan-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-19 07:50:34",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "aws-dynamodb-parallel-scan"
}