# URL Blob
A library for providing agnostic access to presigned URLs living at different cloud providers.
This library implements a provider-agnostic way of working with pre-signed URLs, supporting stat, get, and put operations. Support for multi-part upload and delete is also planned.
All the major cloud providers offer a way to hand out URLs to objects in a bucket (or bucket-equivalent), allowing URL users to work with these objects without having to authenticate themselves. This would be great for cloud-agnostic processing applications, except that different cloud providers sometimes do things slightly differently. This library papers over those differences to provide you with a truly cloud-agnostic way of working with blobs behind these URLs.
## Library Usage
The UrlBlob library provides a consistent interface for working with files stored in various cloud providers through presigned URLs.
### Basic Usage (Async API)
```python
import asyncio

from urlblob.manager import UrlBlobManager


async def main():
    # Create a manager
    manager = UrlBlobManager()

    # Get a blob from a URL
    blob = manager.from_url("https://example.com/path/to/file")

    # Get file stats
    stats = await blob.stat()
    print(f"File size: {stats.size()} bytes")
    print(f"Content type: {stats.content_type()}")

    # Download content
    content = await blob.get()

    # Upload content
    await blob.put("Hello, world!", content_type="text/plain")


asyncio.run(main())
```
### Synchronous API
For applications that don't use async/await, UrlBlob provides a synchronous API with identical functionality:
```python
from urlblob import SyncUrlBlobManager
# Create a manager
manager = SyncUrlBlobManager()

# Get a blob from a URL
blob = manager.from_url("https://example.com/path/to/file")

# Get file stats
stats = blob.stat()
print(f"File size: {stats.size()} bytes")
print(f"Content type: {stats.content_type()}")

# Download content
content = blob.get()

# Upload content
blob.put("Hello, world!", content_type="text/plain")

# Use context manager for automatic cleanup
with SyncUrlBlobManager() as manager:
    blob = manager.from_url("https://example.com/path/to/file")
    # Work with blob...
```
### API Reference
#### UrlBlobManager
```python
# Initialize a manager
manager = UrlBlobManager()

# Create a blob from a URL with optional explicit URL type
blob = manager.from_url(url, url_type=None)
```
#### UrlBlob
```python
# Get file metadata
stats = await blob.stat()

# Download entire file
content = await blob.get()

# Download a byte range
# Note: byte_range is end-exclusive (like Python's range)
content = await blob.get(byte_range=range(0, 1024))  # Gets bytes 0-1023 (1024 bytes)

# While start/end parameters are end-inclusive
content = await blob.get(start=0, end=1023)  # Also gets bytes 0-1023 (1024 bytes)

# Stream file content
async for chunk in blob.stream():
    process(chunk)

# Stream file as lines of text
async for line in blob.stream_lines():
    process(line)

# Upload content (supports str, bytes, file objects, iterators)
await blob.put(content, content_type="text/plain")

# Upload content as lines
await blob.put_lines(["line1", "line2", "line3"], content_type="text/plain")
```
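The two range conventions above differ only in how the end position is expressed. Since HTTP `Range` headers are end-inclusive while Python's `range` is end-exclusive, mapping one onto the other is a one-line translation. This is a sketch of the convention itself, not the library's internals:

```python
def to_http_range(byte_range: range) -> str:
    """Translate an end-exclusive Python range into an end-inclusive
    HTTP Range header value (as used by RFC 7233 byte-range requests)."""
    return f"bytes={byte_range.start}-{byte_range.stop - 1}"
```

For example, `to_http_range(range(0, 1024))` yields `"bytes=0-1023"`, the same 1024 bytes as `start=0, end=1023`.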
#### UrlBlobStats
```python
# Get file size in bytes
size = stats.size()
# or with None for missing size
size = stats.size_or_none()

# Get content type
content_type = stats.content_type()
# or with None for missing content type
content_type = stats.content_type_or_none()

# Get last modified timestamp
last_modified = stats.last_modified()
# or with None for missing timestamp
last_modified = stats.last_modified_or_none()

# Get all stats as a dictionary
stats_dict = stats.to_dict()
```
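The paired accessors follow a simple convention: the plain form raises when a provider omitted the corresponding header, while the `_or_none` form returns `None` instead. A minimal sketch of that convention — the `Stats` class and `_size` field here are hypothetical stand-ins, not the library's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stats:
    """Hypothetical stand-in illustrating the *_or_none accessor pattern."""

    _size: Optional[int]

    def size_or_none(self) -> Optional[int]:
        # Returns None when the provider omitted the Content-Length header
        return self._size

    def size(self) -> int:
        # Raises instead of returning None when the value is missing
        if self._size is None:
            raise ValueError("size not available")
        return self._size
```

This lets callers choose between fail-fast access and explicit `None` handling.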
## Command Line Interface
The library also includes a CLI for convenient access to its functionality:
### Global Options
```bash
# Override URL type detection
urlblob --url-type s3 [command] [args]
```
Available URL types: `s3` (aliases: `aws`, `aws_s3`), `gcp` (aliases: `google`), `azure` (alias: `az`), `generic`
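When `--url-type` is not given, the URL type is detected automatically. The exact detection rules are internal to the library, but hostname-based detection along these lines illustrates the idea — the patterns below are assumptions for illustration, not the library's actual logic:

```python
from urllib.parse import urlparse


def guess_url_type(url: str) -> str:
    """Hypothetical hostname-based URL type detection."""
    host = urlparse(url).hostname or ""
    if host.endswith("amazonaws.com"):
        return "s3"
    if host.endswith("storage.googleapis.com"):
        return "gcp"
    if host.endswith("blob.core.windows.net"):
        return "azure"
    return "generic"
```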
### Upload content to a URL
```bash
# Upload content directly from command line
urlblob put https://example.com/path "Hello, world!"
# Upload content from stdin
echo "Hello, world!" | urlblob put https://example.com/path
# Specify content type
urlblob put https://example.com/path '{"key": "value"}' --content-type application/json
# Process input as lines of text
urlblob put https://example.com/path "line1
line2
line3" --lines
```
Options:
- `--content-type TEXT`, `-t TEXT`: Content type of the data (default: text/plain)
- `--lines`, `-l`: Process content as lines of text
### Download a file
```bash
# Download entire file
urlblob get https://example.com/path/to/file
# Download with byte range (ranges are inclusive for CLI, e.g. 0-1024 gets 1025 bytes)
urlblob get https://example.com/path/to/file 0-1024
urlblob get https://example.com/path/to/file 1024-
urlblob get https://example.com/path/to/file -1024
# Save to file instead of stdout
urlblob get https://example.com/path/to/file -o output.txt
# Process output as lines of text
urlblob get https://example.com/path/to/file --lines
# Download entire file at once (no streaming)
urlblob get https://example.com/path/to/file --no-stream
```
Options:
- `--output`, `-o`: Output file (default: stdout)
- `--lines`, `-l`: Process content as lines of text
- `--no-stream`: Download entire file at once instead of streaming
- `--start`: Start byte position
- `--end`: End byte position
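The three range forms shown above (`0-1024`, `1024-`, `-1024`) each resolve to an optional start and end position. A hypothetical sketch of that parsing, not the CLI's actual implementation:

```python
from typing import Optional, Tuple


def parse_range(spec: str) -> Tuple[Optional[int], Optional[int]]:
    """Parse a CLI-style byte-range argument into (start, end).

    "0-1024" -> (0, 1024); "1024-" -> (1024, None); "-1024" -> (None, 1024)
    """
    start_s, _, end_s = spec.partition("-")
    start = int(start_s) if start_s else None
    end = int(end_s) if end_s else None
    return start, end
```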
### Get file information
```bash
# Get file stats in table format
urlblob stat https://example.com/path/to/file
# Get file stats in JSON format
urlblob stat https://example.com/path/to/file --json
```
Options:
- `--json`, `-j`: Output in JSON format