cloud-mappings 2.1.0

- Home page: https://github.com/JJ11teen/cloud-mappings
- Summary: MutableMapping interfaces for common cloud storage providers
- Author: Lucas Sargent
- Requires Python: >=3.7
- Keywords: mutable, dict, aws, s3, azure, gcp
- Uploaded: 2023-06-02 01:15:02

# cloud-mappings
MutableMapping implementations for common cloud storage providers - easily store things in the cloud through a simple dictionary interface!

[![Build](https://github.com/JJ11teen/cloud-mappings/actions/workflows/build.yaml/badge.svg)](https://github.com/JJ11teen/cloud-mappings/actions/workflows/build.yaml)
[![PyPI version](https://badge.fury.io/py/cloud-mappings.svg)](https://pypi.org/project/cloud-mappings/)

For now [Azure Blob Storage](https://azure.microsoft.com/en-au/services/storage/blobs), [Azure Table Storage](https://azure.microsoft.com/en-au/services/storage/tables), [Google Cloud Storage](https://cloud.google.com/storage/), and [AWS S3](https://aws.amazon.com/s3/) are implemented. Contributions of new providers are welcome.

## Use Cases

* Easily read/write to a bucket without needing to learn boto3 (same for Azure, GCP)
* Ensure consistent serialisation/deserialisation with a simple interface
* Safely use cloud storage concurrently, for example multiple data scientists working in notebooks or multiple nodes in a distributed compute cluster
* Build a robust multi-layer data caching solution by combining with [zict](https://zict.readthedocs.io/en/latest/index.html)

## Installation

with pip:
```
pip install cloud-mappings
```

By default, `cloud-mappings` doesn't install any of the storage providers' dependencies, allowing you to version them as you see fit. If you would like to install them alongside `cloud-mappings`, you may run any combination of:
```
pip install cloud-mappings[azureblob,azuretable,gcpstorage,awss3]
```

## Usage

Use it just like a standard Python dictionary (one you would create with `dict()` or `{}`). Under the hood, each write to the dictionary is serialised and saved to the cloud, and each read is downloaded and deserialised. Iterating keys, values and items works as expected.
```python
cm["key"] = 1000
cm["key"] # returns 1000
del cm["key"]
"key" in cm # returns false
for k, v in cm.items():
    print(k, v) # prints: key 1000
```

## Initialisation

### AzureBlobStorage:
```python
from azure.identity import DefaultAzureCredential
from cloudmappings import AzureBlobStorage

cm = AzureBlobStorage(
    account_url="BLOB_ACCOUNT_URL",
    container_name="CONTAINER_NAME",
    credential=DefaultAzureCredential(),
).create_mapping()
```

### AzureTableStorage:
```python
from azure.identity import DefaultAzureCredential
from cloudmappings import AzureTableStorage

cm = AzureTableStorage(
    table_name="TABLE_NAME",
    endpoint="AZURE_TABLE_ENDPOINT",
    credential=DefaultAzureCredential(),
).create_mapping()
```
Note that Azure Table Storage has a 1MB size limit per entity.

### GoogleCloudStorage:
```python
from cloudmappings import GoogleCloudStorage

cm = GoogleCloudStorage(
    bucket_name="BUCKET_NAME",
    project="GCP_PROJECT",
).create_mapping()
```

### AWSS3Storage:
```python
from cloudmappings import AWSS3Storage

cm = AWSS3Storage(
    bucket_name="AWS_BUCKET_NAME",
    silence_warning=False,
).create_mapping()
```
Note that AWS S3 does not support server-side atomic requests, so it is not recommended for concurrent use. A warning is printed out by default but may be silenced by passing `silence_warning=True`.

# API Docs

## CloudStorage class

A `CloudStorage` object is the entrypoint for this library. You create one by instantiating the class for the cloud storage provider you wish to use: currently `AWSS3Storage`, `AzureBlobStorage`, `AzureTableStorage`, or `GoogleCloudStorage`. The parameters vary for each and map to the details required for locating and authenticating the cloud resource it represents. A simple example for each is provided above. From a `CloudStorage` instance, (multiple) `CloudMapping[T]`s may be created by calling `.create_mapping()`:

```python
CloudStorage.create_mapping(
    sync_initially: bool = True,
    read_blindly: bool = False,
    read_blindly_error: bool = False,
    read_blindly_default: Any = None,
    serialisation: CloudMappingSerialisation[T] = pickle(),
    key_prefix: Optional[str] = None,
) -> CloudMapping[T]:
```
Parameters:
* `sync_initially: bool = True`
  * Whether to call `sync_with_cloud` initially
* `read_blindly: bool = False`
  * Whether the `CloudMapping` will read from the cloud without synchronising.
  * When `read_blindly=False`, a `CloudMapping` will raise a `KeyError` unless a key has been previously written using the same `CloudMapping` instance, or `.sync_with_cloud` has been called and the key was in the cloud. If the value in the cloud has changed since being written or synchronised, a `cloudmappings.errors.KeySyncError` will be raised.
  * When `read_blindly=True`, a `CloudMapping` will directly query the cloud for any key accessed, regardless of if it has previously written a value to that key. It will always get the latest value from the cloud, and never raise a `cloudmappings.errors.KeySyncError` for read operations. If there is no value for a key in the cloud, and `read_blindly_error=True`, a `KeyError` will be raised. If there is no value for a key in the cloud and `read_blindly_error=False`, `read_blindly_default` will be returned.
* `read_blindly_error : bool = False`
  * Whether to raise a `KeyError` when `read_blindly=True` and a key does not have a value in the cloud. If `True`, this takes precedence over `read_blindly_default`.
* `read_blindly_default : Any = None`
  * The value to return when `read_blindly=True`, a key does not have a value in the cloud, and `read_blindly_error=False`.
* `serialisation: CloudMappingSerialisation[T] = pickle()`
  * The `CloudMappingSerialisation` to use; defaults to `pickle`. Also determines the type hint `T` of the `CloudMapping[T]`.
* `key_prefix: Optional[str] = None`
  * Prefix to apply to keys in cloud storage. Enables `CloudMapping`s to map to a subdirectory within a cloud storage service, as opposed to the whole resource.

When no arguments are passed, the created `CloudMapping[T]` will:
* Have a type of `CloudMapping[Any]`, equivalent to `dict[str, Any]`
* Sync initially, meaning it will query the cloud and fetch a list of keys that exist
* Raise `KeyError`s if a key is read before being written (unless explicitly synchronised)
* Use `pickle` for serialisation (both reads and writes)
* Apply no prefix to keys, meaning the keys used in python map 1:1 with keys in the cloud
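
For illustration, here is a minimal sketch of creating a mapping with non-default options, reusing the `GoogleCloudStorage` setup from the initialisation examples above. The `json()` serialiser import path follows the `cloudmappings.serialisers.core` module documented further below; treat the exact import form as an assumption.

```python
from cloudmappings import GoogleCloudStorage
from cloudmappings.serialisers.core import json  # import path assumed from the serialisers section below

cm = GoogleCloudStorage(
    bucket_name="BUCKET_NAME",
    project="GCP_PROJECT",
).create_mapping(
    sync_initially=False,       # don't list existing keys up front
    read_blindly=True,          # always fetch the latest value from the cloud
    read_blindly_default=None,  # missing keys return None instead of raising
    serialisation=json(),       # store values as JSON strings rather than pickles
    key_prefix="experiments/",  # map only to keys under this prefix
)

cm["run-1"] = {"accuracy": 0.93}  # serialised to JSON and uploaded under the "experiments/" prefix
print(cm["does-not-exist"])       # prints None because read_blindly is enabled
```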


## CloudMapping class

The `CloudMapping[T]` object is the primary construct of this library and is returned from `.create_mapping()`. It implements the `MutableMapping[str, T]` interface (meaning you can use it as a dictionary), but additionally it provides a few extra cloud-specific options and functions.

### Mutable Properties:
See the parameters of `CloudStorage.create_mapping()` above for their descriptions.
* `read_blindly: bool`
* `read_blindly_error: bool`
* `read_blindly_default: Any`
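
These can be changed on an existing mapping at runtime. A small sketch, assuming `cm` was created as in the examples above:

```python
# Switch to blind reads: query the cloud directly for any key accessed
cm.read_blindly = True
cm.read_blindly_error = False
cm.read_blindly_default = "not-found"  # returned for keys with no value in the cloud

value = cm["maybe-missing-key"]  # returns "not-found" instead of raising a KeyError

cm.read_blindly = False  # revert to synchronised reads
```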

### Immutable Properties:
* `storage_provider: StorageProvider`
  * An object that provides a consistent interface to the underlying storage provider (e.g. methods to read and write bytes to specific paths).
* `etags: dict[str, str]`
  * An internal dictionary of etags used to ensure the `CloudMapping` is in sync with the cloud storage resource. The dict maps keys to their last synchronised etags.
  * This dictionary is used as the `CloudMapping`'s expected view of the cloud. It is used to determine whether a key exists, and to ensure that the value of each key is as expected.
  * See: https://en.wikipedia.org/wiki/HTTP_ETag
* `serialisation: CloudMappingSerialisation[T]`
  * Gets the serialiser configured to use for serialising and deserialising values.
* `key_prefix: Optional[str]`
  * Gets the key prefix configured to prepend to keys in the cloud. It is also used to filter what is synchronised, resulting in the `CloudMapping` mapping to a subset of the cloud resource.

### Methods:
* `sync_with_cloud(self, key_prefix: str = None) -> None`
  * Synchronise this `CloudMapping` with the cloud.
  * This allows a `CloudMapping` to reflect the most recent updates to the cloud resource, including those made by other instances or users. Note that this can enable destructive operations: after synchronising to get the latest updates, a user may then overwrite or delete values.
  * Consider calling this if you are encountering a `cloudmappings.errors.KeySyncError`, and you are sure you would like to force the operation anyway.
  * This is called by default on instantiation of a `CloudMapping`.
  * Parameters:
    * `key_prefix : str, optional`
      * Only sync keys beginning with the specified prefix. The `key_prefix` configured on the mapping is prepended in combination with this parameter.
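
As a minimal sketch of the force-overwrite pattern described above (the key and value are illustrative only):

```python
from cloudmappings.errors import KeySyncError

try:
    cm["shared-key"] = "my new value"
except KeySyncError:
    # Another instance has changed this key since we last wrote or synchronised it.
    # Synchronise to accept the cloud's current state, then knowingly overwrite it.
    cm.sync_with_cloud("shared-key")
    cm["shared-key"] = "my new value"
```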

## CloudMappingSerialisation class

The `CloudMappingSerialisation` class is a simple dataclass that combines serialisation and deserialisation. It has two properties: a dumps function and a loads function. Values are passed through the dumps function when being written to the `CloudMapping` and saved to the cloud, and through the loads function when being loaded from the cloud and read from the `CloudMapping`.

A `CloudMappingSerialisation` may be created directly with singular dumps and loads functions. A `CloudMappingSerialisation.from_chain()` helper method exists for when you would like multiple functions to be chained during serialisation and deserialisation.

Some common `CloudMappingSerialisation`s are also provided out of the box.

### Immutable Properties
* `dumps: Callable`
  * Function to dump values through when writing to the cloud.
  * Must return a bytes-like object.
* `loads: Callable`
  * Function to load values through when reading from the cloud.
  * Must accept a bytes-like object as its input.
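
As a sketch of direct construction with singular dumps and loads functions (the top-level import path and keyword-argument constructor are assumptions based on the dataclass description above; the provided serialisers below are usually more convenient):

```python
import zlib

from cloudmappings import CloudMappingSerialisation  # import path assumed

# A hypothetical serialisation that stores UTF-8 text compressed with zlib
compressed_text = CloudMappingSerialisation(
    dumps=lambda value: zlib.compress(value.encode("utf-8")),  # must return a bytes-like object
    loads=lambda raw: zlib.decompress(raw).decode("utf-8"),    # must accept a bytes-like object
)
```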

### Static Methods
* `from_chain(ordered_dumps_funcs: List[Callable], ordered_loads_funcs: List[Callable]) -> CloudMappingSerialisation[T]`
  * Creates a CloudMappingSerialisation by chaining consecutive dumps and loads functions together
  * Parameters:
    * `ordered_dumps_funcs: List[Callable]`
      * An ordered list of functions to pass values through before saving bytes to the cloud.
      * The last function must return a bytes-like object.
    * `ordered_loads_funcs: List[Callable]`
      * An ordered list of functions to pass values through when loading bytes from the cloud.
      * The first function must accept a bytes-like object as its input.
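
For example, a sketch of chaining `json` and `zlib` by hand, roughly what the `json_zlib` serialiser listed below provides out of the box (the top-level import path for `CloudMappingSerialisation` is an assumption):

```python
import json
import zlib

from cloudmappings import CloudMappingSerialisation  # import path assumed

serialisation = CloudMappingSerialisation.from_chain(
    # Applied in order when writing: value -> JSON string -> UTF-8 bytes -> compressed bytes
    ordered_dumps_funcs=[json.dumps, str.encode, zlib.compress],
    # Applied in order when reading: compressed bytes -> UTF-8 bytes -> JSON string -> value
    ordered_loads_funcs=[zlib.decompress, bytes.decode, json.loads],
)
```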

### Common Serialisations Provided
* `cloudmappings.serialisers.core`
  * Provides functions for serialisers that have no additional dependencies
  * `none() -> CloudMappingSerialisation[bytes]`
    * This serialiser performs no serialisation, and just passes raw bytes
    * It is implemented as `None`, and `None` can be used directly; however, using this serialiser enables type hints to correctly determine that the mapping should be `CloudMapping[bytes]`.
  * `pickle(protocol: int = None) -> CloudMappingSerialisation[Any]`
    * Serialiser that pickles values using Python's `pickle`
    * Parameters:
      * `protocol: int = None`
        * The pickle protocol to use; defaults to `None`, which internally defaults to `pickle.DEFAULT_PROTOCOL`
  * `raw_string(encoding: str = "utf-8") -> CloudMappingSerialisation[str]`
    * Serialiser that only encodes raw string values
    * Parameters:
      * `encoding: str = "utf-8"`
        * The string encoding to use, passed to bytes() and str() for dumps and loads respectively
  * `json(encoding: str = "utf-8") -> CloudMappingSerialisation[Any]`
    * Serialiser that saves objects as JSON strings
    * Parameters:
      * `encoding: str = "utf-8"`
        * The string encoding to use, passed to bytes() and str() for dumps and loads respectively
  * `json_zlib(encoding: str = "utf-8") -> CloudMappingSerialisation[Any]`
    * Serialiser that saves values as compressed JSON strings; it serialises values as JSON strings and then compresses them with zlib.
    * Parameters:
      * `encoding: str = "utf-8"`
        * The string encoding to use, passed to bytes() and str() for dumps and loads respectively
* `cloudmappings.serialisers.pandas`
  * Provides functions for serialisers that use [pandas](https://pandas.pydata.org/) as an additional dependency
  * `csv() -> CloudMappingSerialisation[DataFrame]`
    * Serialiser that uses pandas to serialise DataFrames as CSVs
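
A short sketch of picking one of these when creating mappings, reusing the `AzureBlobStorage` setup from the initialisation examples (the serialiser import paths mirror the module names above and are assumptions):

```python
from azure.identity import DefaultAzureCredential
from cloudmappings import AzureBlobStorage
from cloudmappings.serialisers.core import raw_string  # import paths assumed
from cloudmappings.serialisers.pandas import csv

storage = AzureBlobStorage(
    account_url="BLOB_ACCOUNT_URL",
    container_name="CONTAINER_NAME",
    credential=DefaultAzureCredential(),
)

# Two mappings over the same container, with different serialisation and key prefixes
text_cm = storage.create_mapping(serialisation=raw_string(), key_prefix="text/")  # CloudMapping[str]
frames_cm = storage.create_mapping(serialisation=csv(), key_prefix="frames/")     # CloudMapping[DataFrame]
```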

## Concurrent Use

Being able to upload/download easily without learning the various cloud SDKs is only one benefit of `cloud-mappings`! It is also designed to support concurrent use, providing safety and functionality beyond what the cloud SDKs offer on their own.

| | Session 1 | Session 2 |
| --- | --- | --- |
| Both sessions create their cloud mappings, referencing the same cloud storage | `cm = ....` | `cm = ....` |
| Session 1 writes some data to a key | `cm["key"] = "Session 1 data"` | |
| Session 2 attempts to write over it | | `cm["key"] = "Session 2 data"` |
| Session 2 gets an error | | `KeySyncError` |

# Development

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

This project uses `.devcontainer` to describe the environment to use for development. You may use the environment described in this directory (it integrates automatically with VS Code's Remote Containers extension), or you may create your own environment with the same dependencies.

## Dependencies
Install development dependencies with:

`pip install -e .[azureblob,azuretable,gcpstorage,awss3,tests]`

## Tests
Set environment variables for each provider:
* Azure Blob: `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_BLOB_STORAGE_ACCOUNT_URL`, `AZURE_BLOB_STORAGE_HIERARCHICAL_ACCOUNT_URL` (the tests assume the same secret is used for both)
* Azure Table: `AZURE_TABLE_STORAGE_CONNECTION_STRING`
* GCP Storage: `GOOGLE_APPLICATION_CREDENTIALS` (path to credentials file), `GOOGLE_CLOUD_STORAGE_PROJECT`
* AWS S3: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`

Run tests with:
```bash
pytest --test-container-id <container-suffix-to-use-for-tests>
```
The testing container name will be prefixed with "pytest" (the build & release workflows use the commit SHA as the suffix). Note that if the specified container already exists, one test will fail.

            
