urlblob


- Name: urlblob
- Version: 0.1.1
- Summary: Agnostic access for presigned URLs at different cloud providers
- Upload time: 2025-07-11 03:34:05
- Requires Python: >=3.10
- License: Apache-2.0
- Keywords: azure, blob storage, cloud storage, gcs, object storage, presigned urls, s3, signed urls
- Requirements: none recorded

# URL Blob

A library for provider-agnostic access to presigned URLs hosted by different cloud providers.

It currently supports three operations on presigned URLs: stat, get, and put. Support for multi-part upload and delete is planned.

All the major cloud providers offer a way to hand out URLs to objects in a bucket (or bucket-equivalent), so that whoever holds the URL can work with the object without authenticating. This would be ideal for cloud-agnostic processing applications, except that providers differ in the details. This library papers over those differences so that you can work with the blobs behind these URLs in the same way regardless of provider.
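
To make "differ in the details" concrete, here is a minimal sketch of two such differences: the query parameter that carries the signature, and the extra header Azure expects on a plain HTTP PUT. The helper names `guess_url_type` and `extra_put_headers` are hypothetical and this is not urlblob's actual detection logic; it only illustrates the kind of variation a wrapper has to absorb.

```python
from urllib.parse import parse_qs, urlsplit


def guess_url_type(url: str) -> str:
    """Guess which provider signed a URL from its query parameters.

    Illustrative heuristic only; not urlblob's actual detection logic.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Compare parameter names case-insensitively.
    params = {name.lower() for name in query}
    if "x-amz-signature" in params:
        return "s3"  # AWS Signature Version 4 presigned URL
    if "x-goog-signature" in params:
        return "gcp"  # Google Cloud Storage V4 signed URL
    if "sig" in params:
        return "azure"  # Azure shared access signature (SAS)
    return "generic"


def extra_put_headers(url_type: str) -> dict[str, str]:
    """Return headers a plain HTTP PUT needs beyond Content-Type."""
    # Azure's Put Blob operation requires the blob type to be stated.
    if url_type == "azure":
        return {"x-ms-blob-type": "BlockBlob"}
    return {}
```

The returned names reuse the URL type names listed in the CLI section, purely for readability.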

## Library Usage

The UrlBlob library provides a consistent interface for working with files stored with various cloud providers and exposed through presigned URLs.

### Basic Usage (Async API)

```python
import asyncio
from urlblob.manager import UrlBlobManager

async def main():
    # Create a manager
    manager = UrlBlobManager()
    
    # Get a blob from a URL
    blob = manager.from_url("https://example.com/path/to/file")
    
    # Get file stats
    stats = await blob.stat()
    print(f"File size: {stats.size()} bytes")
    print(f"Content type: {stats.content_type()}")
    
    # Download content
    content = await blob.get()
    
    # Upload content
    await blob.put("Hello, world!", content_type="text/plain")

asyncio.run(main())
```
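
One reason to prefer the async API is that several requests can be in flight at once. The sketch below sums the sizes of several blobs by issuing their `stat()` calls concurrently. `total_size` is a hypothetical helper, and it assumes that blobs created from one manager can safely be used concurrently, which the README does not state.

```python
import asyncio


async def total_size(manager, urls):
    """Sum the sizes of several blobs, issuing the stat calls concurrently.

    `manager` is any object with a `from_url(url)` method whose blobs
    offer an awaitable `stat()`, such as a UrlBlobManager.
    """
    blobs = [manager.from_url(url) for url in urls]
    # gather() runs the stat() coroutines concurrently and returns
    # their results in the same order as the inputs.
    all_stats = await asyncio.gather(*(blob.stat() for blob in blobs))
    return sum(stats.size() for stats in all_stats)
```

Inside a coroutine it would be called as `await total_size(manager, urls)`, with `manager` created as in the example above.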

### Synchronous API

For applications that don't use async/await, UrlBlob provides a synchronous API with identical functionality:

```python
from urlblob import SyncUrlBlobManager

# Create a manager
manager = SyncUrlBlobManager()

# Get a blob from a URL
blob = manager.from_url("https://example.com/path/to/file")

# Get file stats
stats = blob.stat()
print(f"File size: {stats.size()} bytes")
print(f"Content type: {stats.content_type()}")

# Download content
content = blob.get()

# Upload content
blob.put("Hello, world!", content_type="text/plain")

# Use context manager for automatic cleanup
with SyncUrlBlobManager() as manager:
    blob = manager.from_url("https://example.com/path/to/file")
    # Work with blob...
```
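
Because every blob exposes the same calls regardless of provider, copying between two presigned URLs needs no provider-specific code. The following is a minimal sketch using only the synchronous calls shown in this README. `copy_blob` and its `default_type` fallback are hypothetical, and the whole blob is held in memory, so it suits small objects only.

```python
def copy_blob(manager, src_url, dst_url, default_type="application/octet-stream"):
    """Copy one blob to another URL, keeping the content type when known.

    `manager` is any object with a `from_url(url)` method, such as a
    SyncUrlBlobManager. Returns the content type used for the upload.
    """
    src = manager.from_url(src_url)
    dst = manager.from_url(dst_url)
    # content_type_or_none() yields None when the source reports no type,
    # in which case the fallback is used instead.
    content_type = src.stat().content_type_or_none() or default_type
    data = src.get()  # downloads the entire source blob
    dst.put(data, content_type=content_type)
    return content_type
```

The source URL must be signed for reading and the destination URL for writing.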

### API Reference

#### UrlBlobManager

```python
# Initialize a manager
manager = UrlBlobManager()

# Create a blob from a URL with optional explicit URL type
blob = manager.from_url(url, url_type=None)
```

#### UrlBlob

```python
# Get file metadata
stats = await blob.stat()

# Download entire file
content = await blob.get()

# Download a byte range
# Note: byte_range is end-exclusive (like Python's range)
content = await blob.get(byte_range=range(0, 1024))  # Gets bytes 0-1023 (1024 bytes)

# While start/end parameters are end-inclusive
content = await blob.get(start=0, end=1023)  # Also gets bytes 0-1023 (1024 bytes)

# Stream file content
async for chunk in blob.stream():
    process(chunk)

# Stream file as lines of text
async for line in blob.stream_lines():
    process(line)

# Upload content (supports str, bytes, file objects, iterators)
await blob.put(content, content_type="text/plain")

# Upload content as lines
await blob.put_lines(["line1", "line2", "line3"], content_type="text/plain")
```
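
The two range conventions above are easy to mix up, so here is a small sketch showing that both describe the same HTTP `Range` header, which is itself end-inclusive. `range_header` is a hypothetical helper written for illustration; it is not part of urlblob and says nothing about how the library builds its requests.

```python
def range_header(byte_range=None, start=None, end=None):
    """Build an HTTP Range header value from either range convention.

    Hypothetical helper for illustration; not part of urlblob.
    """
    if byte_range is not None:
        if byte_range.step != 1 or len(byte_range) == 0:
            raise ValueError("byte_range must be a non-empty range with step 1")
        # range() excludes its stop value; HTTP Range includes the last byte.
        start, end = byte_range.start, byte_range.stop - 1
    if start is None:
        raise ValueError("a start position is required")
    # Leaving `end` unset requests everything from `start` onward.
    return f"bytes={start}-" if end is None else f"bytes={start}-{end}"
```

Both `range_header(byte_range=range(0, 1024))` and `range_header(start=0, end=1023)` produce `bytes=0-1023`.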

#### UrlBlobStats

```python
# Get file size in bytes
size = stats.size()
# or with None for missing size
size = stats.size_or_none()

# Get content type
content_type = stats.content_type()
# or with None for missing content type
content_type = stats.content_type_or_none()

# Get last modified timestamp
last_modified = stats.last_modified()
# or with None for missing timestamp
last_modified = stats.last_modified_or_none()

# Get all stats as a dictionary
stats_dict = stats.to_dict()
```
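
Servers do not always report every header, which is what the `_or_none` variants are for. The README does not say what the plain accessors do when a value is missing, so the sketch below sticks to the `_or_none` variants. `describe` is a hypothetical helper that formats a one-line summary and tolerates any missing field.

```python
def describe(stats):
    """Format a one-line summary of a stats object, tolerating missing metadata."""
    size = stats.size_or_none()
    content_type = stats.content_type_or_none()
    last_modified = stats.last_modified_or_none()
    parts = [
        f"{size} bytes" if size is not None else "size unknown",
        content_type if content_type is not None else "type unknown",
        (
            f"modified {last_modified}"
            if last_modified is not None
            else "modification time unknown"
        ),
    ]
    return ", ".join(parts)
```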

## Command Line Interface

The library also includes a CLI for convenient access to its functionality.

### Global Options

```bash
# Override URL type detection
urlblob --url-type s3 [command] [args]
```

Available URL types: `s3` (aliases: `aws`, `aws_s3`), `gcp` (alias: `google`), `azure` (alias: `az`), `generic`
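
If your own tooling accepts the same names, the alias list above can be written as a lookup table. This is a sketch under assumptions: `URL_TYPE_ALIASES` and `canonical_url_type` are hypothetical names, and the case-insensitive matching is a choice made here, not documented behaviour of the CLI.

```python
# Every accepted spelling mapped to its canonical URL type.
URL_TYPE_ALIASES = {
    "s3": "s3",
    "aws": "s3",
    "aws_s3": "s3",
    "gcp": "gcp",
    "google": "gcp",
    "azure": "azure",
    "az": "azure",
    "generic": "generic",
}


def canonical_url_type(name: str) -> str:
    """Resolve a URL type name or alias to its canonical form."""
    try:
        # Case-insensitive matching is an assumption of this sketch.
        return URL_TYPE_ALIASES[name.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown URL type: {name!r}") from None
```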

### Upload content to a URL

```bash
# Upload content directly from command line
urlblob put https://example.com/path "Hello, world!"

# Upload content from stdin
echo "Hello, world!" | urlblob put https://example.com/path

# Specify content type
urlblob put https://example.com/path '{"key": "value"}' --content-type application/json

# Process input as lines of text
urlblob put https://example.com/path "line1
line2
line3" --lines
```

Options:

- `--content-type TEXT`, `-t TEXT`: Content type of the data (default: text/plain)
- `--lines`, `-l`: Process content as lines of text

### Download a file

```bash
# Download entire file
urlblob get https://example.com/path/to/file

# Download with byte range (ranges are inclusive for CLI, e.g. 0-1024 gets 1025 bytes)
urlblob get https://example.com/path/to/file 0-1024
urlblob get https://example.com/path/to/file 1024-
urlblob get https://example.com/path/to/file -1024

# Save to file instead of stdout
urlblob get https://example.com/path/to/file -o output.txt

# Process output as lines of text
urlblob get https://example.com/path/to/file --lines

# Download entire file at once (no streaming)
urlblob get https://example.com/path/to/file --no-stream
```

Options:

- `--output`, `-o`: Output file (default: stdout)
- `--lines`, `-l`: Process content as lines of text
- `--no-stream`: Download entire file at once instead of streaming
- `--start`: Start byte position
- `--end`: End byte position

### Get file information

```bash
# Get file stats in table format
urlblob stat https://example.com/path/to/file

# Get file stats in JSON format
urlblob stat https://example.com/path/to/file --json
```

Options:

- `--json`, `-j`: Output in JSON format

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "urlblob",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Scidonia Limited <team@scidonia.com>",
    "keywords": "azure, blob storage, cloud storage, gcs, object storage, presigned urls, s3, signed urls",
    "author": null,
    "author_email": "Maren van Otterdijk <maren@scidonia.ai>",
    "download_url": "https://files.pythonhosted.org/packages/2b/f9/b9e2041cf69a031ea93cadfb2751a945fcf87a6afd5c8d8013dce26435c1/urlblob-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Agnostic access for presigned URLs at different cloud providers",
    "version": "0.1.1",
    "project_urls": null,
    "split_keywords": [
        "azure",
        " blob storage",
        " cloud storage",
        " gcs",
        " object storage",
        " presigned urls",
        " s3",
        " signed urls"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d23782bf4ec67a69ff9c25483d9135257b1d30025f1b8a6458bd10bcc774f132",
                "md5": "4e52cdfaf2baa1e3922c7b4d9f69482d",
                "sha256": "7de6719c3b0700c0e3c4bb61eb42e11b7611565ed3db4cc14603e7cf7ae3d16b"
            },
            "downloads": -1,
            "filename": "urlblob-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4e52cdfaf2baa1e3922c7b4d9f69482d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 29016,
            "upload_time": "2025-07-11T03:34:04",
            "upload_time_iso_8601": "2025-07-11T03:34:04.828927Z",
            "url": "https://files.pythonhosted.org/packages/d2/37/82bf4ec67a69ff9c25483d9135257b1d30025f1b8a6458bd10bcc774f132/urlblob-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2bf9b9e2041cf69a031ea93cadfb2751a945fcf87a6afd5c8d8013dce26435c1",
                "md5": "1fa41f2f5b1e98589ec997e5fac01367",
                "sha256": "6e9ade7e683ee010fea23fcef6968788624fbe87268e4bda97624119145dcaf6"
            },
            "downloads": -1,
            "filename": "urlblob-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "1fa41f2f5b1e98589ec997e5fac01367",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 17736,
            "upload_time": "2025-07-11T03:34:05",
            "upload_time_iso_8601": "2025-07-11T03:34:05.908466Z",
            "url": "https://files.pythonhosted.org/packages/2b/f9/b9e2041cf69a031ea93cadfb2751a945fcf87a6afd5c8d8013dce26435c1/urlblob-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-11 03:34:05",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "urlblob"
}
        