# object-store-python
[![CI][ci-img]][ci-link]
[![code style: black][black-img]][black-link]
![PyPI](https://img.shields.io/pypi/v/object-store-python)
[![PyPI - Downloads][pypi-img]][pypi-link]
Python bindings and integrations for the excellent [`object_store`][object-store] crate.
The main idea is to provide a common interface to various storage backends, including the
object stores from most major cloud providers. The APIs are very focused and tailored
towards modern cloud-native applications by hiding away many features (and complexities)
encountered in full-fledged file systems.
Among the included backends are:
- Amazon S3 and S3 compliant APIs
- Google Cloud Storage Buckets
- Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
- local storage
- in-memory store
## Installation
The `object-store-python` package is available on PyPI and can be installed via
```sh
poetry add object-store-python
```
or using pip
```sh
pip install object-store-python
```
## Usage
The main [`ObjectStore`](#object-store-python) API mirrors the native [`object_store`][object-store]
implementation, with some slight adjustments for ease of use in Python programs.
### `ObjectStore` API
```py
from object_store import ObjectStore, ObjectMeta
# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")
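# write an object and read it back
store.put("data", b"some data")
data = store.get("data")
assert data == b"some data"
# list the stored objects and fetch the metadata of a single one
blobs = store.list()
meta: ObjectMeta = store.head("data")
# read only the first four bytes (named so the built-in `range` is not shadowed)
head_bytes = store.get_range("data", start=0, length=4)
assert head_bytes == b"some"
# copy the object to a new location and verify the content
store.copy("data", "copied")
copied = store.get("copied")
assert copied == data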
```
### Configuration
As much as possible we aim to make access to various storage backends dependent
only on runtime configuration. The kind of service is always derived from the
URL used to specify the storage location. Some basic configuration can also be
derived from the URL string, depending on the chosen URL format.
```py
from object_store import ObjectStore
storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
}
store = ObjectStore("az://<container-name>", storage_options)
```
We can provide the same configuration via the environment.
```py
import os
from object_store import ObjectStore
os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"
store = ObjectStore("az://<container-name>")
```
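In the two examples above, the option keys and the environment variable names differ only in case. The sketch below illustrates that correspondence by collecting prefixed variables into a `storage_options`-style dictionary. `options_from_env` is a hypothetical helper written for this README, not part of the package, and it makes no claim about how the library itself reads the environment.

```python
import os
from typing import Dict, Mapping, Optional


def options_from_env(
    prefix: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect variables such as AZURE_CLIENT_ID into {"azure_client_id": ...}."""
    # Fall back to the real process environment when no mapping is given.
    source = os.environ if environ is None else environ
    wanted = prefix.upper()
    return {
        name.lower(): value
        for name, value in source.items()
        if name.upper().startswith(wanted)
    }


# Use an explicit mapping here so the example does not depend on the
# variables that happen to be set on the current machine.
fake_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>",
    "AZURE_TENANT_ID": "<my-tenant-id>",
    "HOME": "/home/user",
}
options = options_from_env("AZURE_", fake_env)
print(options)
```

The resulting dictionary could then be passed explicitly as the second argument to `ObjectStore`, which can be handy for logging or checking the effective configuration.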
#### Azure
The recommended URL format is `az://<container>/<path>`, and Azure always requires
`azure_storage_account_name` to be configured.
- [shared key][azure-key]
  - `azure_storage_account_key`
- [service principal][azure-ad]
  - `azure_client_id`
  - `azure_client_secret`
  - `azure_tenant_id`
- [shared access signature][azure-sas]
  - `azure_storage_sas_key` (as provided by StorageExplorer)
- bearer token
  - `azure_storage_token`
- [managed identity][azure-managed]
  - if using a user-assigned identity, one of `azure_client_id`, `azure_object_id`, `azure_msi_resource_id`
  - if no other credential can be created, managed identity will be tried
- [workload identity][azure-workload]
  - `azure_client_id`
  - `azure_tenant_id`
  - `azure_federated_token_file`
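
To check a configuration before creating a store, the key groups listed above can be encoded in a small lookup table. The following is only a sketch: `matching_azure_credentials` is a hypothetical helper, and the order in which the library actually resolves credentials when several groups are complete is not stated here, so the helper simply reports every complete group.

```python
from typing import Dict, List

# Required option keys per credential type, copied from the list above.
AZURE_CREDENTIAL_KEYS = {
    "shared key": ["azure_storage_account_key"],
    "service principal": [
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ],
    "shared access signature": ["azure_storage_sas_key"],
    "bearer token": ["azure_storage_token"],
    "workload identity": [
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ],
}


def matching_azure_credentials(options: Dict[str, str]) -> List[str]:
    """Return the credential types whose required keys are all present."""
    if not options.get("azure_storage_account_name"):
        raise ValueError("azure_storage_account_name is always required")
    matches = [
        kind
        for kind, keys in AZURE_CREDENTIAL_KEYS.items()
        if all(options.get(key) for key in keys)
    ]
    # Per the list above, managed identity is tried when nothing else applies.
    return matches or ["managed identity"]


options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
}
print(matching_azure_credentials(options))
```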
#### S3
The recommended URL format is `s3://<bucket>/<path>`. S3 storage always requires a
region to be specified via one of `aws_region` or `aws_default_region`.
- [access key][aws-key]
  - `aws_access_key_id`
  - `aws_secret_access_key`
- [session token][aws-sts]
  - `aws_session_token`
- [imds instance metadata][aws-imds]
  - `aws_metadata_endpoint`
- [profile][aws-profile]
  - `aws_profile`
AWS supports [virtual hosting of buckets][aws-virtual], which can be configured by setting
`aws_virtual_hosted_style_request` to "true".
When an alternative implementation or a mocked service like LocalStack is used, the service
endpoint needs to be explicitly specified via `aws_endpoint`.
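
As a rough sketch of how the S3 options above fit together, the helper below assembles an options dictionary for access-key authentication, optionally with a custom endpoint or virtual-hosted-style requests. `s3_options` and `open_bucket` are hypothetical names introduced for this example. The endpoint `http://localhost:4566` (LocalStack's default port) and the `test` credentials are placeholder assumptions, and whether a plain-HTTP endpoint needs further options is not covered here.

```python
from typing import Dict, Optional


def s3_options(
    region: str,
    access_key_id: str,
    secret_access_key: str,
    endpoint: Optional[str] = None,
    virtual_hosted_style: bool = False,
) -> Dict[str, str]:
    """Build a storage_options dictionary from the keys documented above."""
    if not region:
        raise ValueError("a region is required (aws_region or aws_default_region)")
    options = {
        "aws_region": region,
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if endpoint is not None:
        # Needed for S3-compatible services and mocks such as LocalStack.
        options["aws_endpoint"] = endpoint
    if virtual_hosted_style:
        # Option values are strings, hence "true" rather than True.
        options["aws_virtual_hosted_style_request"] = "true"
    return options


def open_bucket(bucket: str, options: Dict[str, str]):
    """Create a store for the bucket; requires the package and a reachable service."""
    from object_store import ObjectStore  # imported lazily on purpose

    return ObjectStore(f"s3://{bucket}", options)


# Placeholder values for a local mock service (assumptions, see above).
localstack = s3_options("us-east-1", "test", "test", endpoint="http://localhost:4566")
print(localstack)
```

With a running service, `open_bucket("<bucket>", localstack)` would then create the store.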
#### GCS
The recommended URL format is `gs://<bucket>/<path>`.
- service account
  - `google_service_account`
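
The recommended URL formats for Azure, S3 and GCS share the shape `<scheme>://<container-or-bucket>/<path>`. The sketch below splits such a URL with the standard library to make the three parts explicit. `split_location` is a hypothetical helper for illustration only. It does not reproduce how the library parses URLs internally.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Location(NamedTuple):
    scheme: str
    container: str  # container (Azure) or bucket (S3, GCS)
    path: str


def split_location(url: str) -> Location:
    """Split a URL of the form <scheme>://<container-or-bucket>/<path>."""
    parsed = urlparse(url)
    if parsed.scheme not in {"az", "s3", "gs"}:
        raise ValueError(f"unsupported scheme in this sketch: {parsed.scheme!r}")
    # The leading slash separates the bucket from the path and is dropped.
    return Location(parsed.scheme, parsed.netloc, parsed.path.lstrip("/"))


location = split_location("s3://my-bucket/raw/2023")
print(location)
```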
### with `pyarrow`
```py
from pathlib import Path
import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq
from object_store import ArrowFileSystemHandler
table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})
base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))
# Table.slice takes (offset, length): write the first and the last five rows
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)
dataset = ds.dataset("data", format="parquet", filesystem=store)
```
## Development
### Prerequisites
- [poetry](https://python-poetry.org/docs/)
- [Rust toolchain](https://www.rust-lang.org/tools/install)
- [just](https://github.com/casey/just#readme)
### Running tests
If you do not have [`just`](https://github.com/casey/just#readme) installed and do not wish to install it,
have a look at the [`justfile`](https://github.com/roeap/object-store-python/blob/main/justfile) to see the raw commands.
To set up the development environment and install a dev version of the native package, just run:
```sh
just init
```
This will also configure [`pre-commit`](https://pre-commit.com/) hooks in the repository.
To run the Rust as well as the Python tests:
```sh
just test
```
[object-store]: https://crates.io/crates/object_store
[pypi-img]: https://img.shields.io/pypi/dm/object-store-python
[pypi-link]: https://pypi.org/project/object-store-python/
[ci-img]: https://github.com/roeap/object-store-python/actions/workflows/ci.yaml/badge.svg
[ci-link]: https://github.com/roeap/object-store-python/actions/workflows/ci.yaml
[black-img]: https://img.shields.io/badge/code%20style-black-000000.svg
[black-link]: https://github.com/psf/black
[aws-virtual]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html
[azure-managed]: https://learn.microsoft.com/en-gb/azure/app-service/overview-managed-identity
[azure-sas]: https://learn.microsoft.com/en-us/azure/storage/common/storage-sas-overview
[azure-ad]: https://learn.microsoft.com/en-us/azure/storage/blobs/authorize-access-azure-active-directory
[azure-key]: https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key
[azure-workload]: https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview
[aws-imds]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
[aws-profile]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html
[aws-sts]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html
[aws-key]: https://docs.aws.amazon.com/accounts/latest/reference/credentials-access-keys-best-practices.html