zipstream-ng
============

A modern, easy-to-use streamable zip file generator. It can package and stream many files and
folders into a zip on the fly without needing temporary files or excessive memory. It can also
calculate the final size of the zip file before streaming it.
### Features:
- Generates zip data on the fly as it's requested.
- Can calculate the total size of the resulting zip file before generation even begins.
- Low memory usage: Since the zip is generated as it's requested, very little has to be kept in
memory (peak usage of less than 20MB is typical, even for TBs of files).
- Performant: On par with or faster than using the standard library to create non-streamed zip files.
- Flexible API: Typical use cases are simple, complicated ones are possible.
- Supports zipping data from files, bytes, strings, and any other iterable objects.
- Keeps track of the date of the most recently modified file added to the zip file.
- Threadsafe: Won't mangle data if multiple threads concurrently add data to the same stream.
- Includes a clone of Python's `http.server` module with zip support added. Try `python -m zipstream.server`.
- Automatically uses Zip64 extensions, but only if they are required.
- No external dependencies.
### Ideal for web backends:
- Generating zip data on the fly requires very little memory, no disk usage, and starts producing
data with less latency than creating the entire zip up-front. This means faster responses, no
temporary files, and very low memory usage.
- The ability to calculate the total size of the stream before any data is actually generated
(provided no compression is used) means web backends can provide a `Content-Length` header in
their responses. This allows clients to show a progress bar as the stream is transferred.
- By keeping track of the date of the most recently modified file added to the zip, web
backends can provide a `Last-Modified` header. This allows clients to check if they have the most
up-to-date version of the zip with just a HEAD request instead of having to download the entire
thing.
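
To illustrate why the size is only predictable when no compression is used, here is a minimal sketch that uses only the standard library's `zipfile` module (not zipstream-ng itself). For stored (uncompressed) entries, the length of every record in the archive follows from the file names and data sizes alone. The record sizes below apply to the simple non-streamed, non-Zip64 layout that `zipfile` writes to a seekable buffer. The helper names `predicted_size` and `build_zip` are made up for this example, and the layout zipstream-ng emits while streaming may contain additional records.

```python
import io
from zipfile import ZipFile, ZIP_STORED

# Fixed record sizes defined by the zip format (excluding variable-length fields)
LOCAL_HEADER = 30    # local file header, written before each entry's data
CENTRAL_HEADER = 46  # central directory entry, one per file
END_RECORD = 22      # end of central directory record, one per archive

def predicted_size(files):
    """Predict the size of an uncompressed, non-Zip64 archive of {name: bytes}."""
    total = END_RECORD
    for name, data in files.items():
        name_len = len(name.encode("utf-8"))
        # The file name is stored twice: in the local header and the central directory
        total += LOCAL_HEADER + name_len + len(data) + CENTRAL_HEADER + name_len
    return total

def build_zip(files):
    """Actually build the archive in memory so the prediction can be checked."""
    buf = io.BytesIO()
    with ZipFile(buf, mode="w", compression=ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

files = {"a.txt": b"hello", "dir/b.bin": bytes(1000)}
print(predicted_size(files), len(build_zip(files)))
```

With Deflate or any other compression method, the length of each entry's data is unknown until the compressor has run, which is why the total can't be computed up front in that case.
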

Installation
------------
```
pip install zipstream-ng
```
Examples
--------
### Create a local zip file (simple example)
Make an archive named `files.zip` in the current directory that contains all files under
`/path/to/files`.
```python
from zipstream import ZipStream

zs = ZipStream.from_path("/path/to/files/")

with open("files.zip", "wb") as f:
    f.writelines(zs)
```
### Create a local zip file (demos more of the API)
```python
from zipstream import ZipStream, ZIP_DEFLATED

# Create a ZipStream that uses the maximum level of Deflate compression.
zs = ZipStream(compress_type=ZIP_DEFLATED, compress_level=9)

# Set the zip file's comment.
zs.comment = "Contains compressed important files"

# Add all the files under a path.
# Will add all files under a top-level folder called "files" in the zip.
zs.add_path("/path/to/files/")

# Add another file (will be added as "data.txt" in the zip file).
zs.add_path("/path/to/file.txt", "data.txt")

# Add some random data from an iterable.
# This generator will only be run when the stream is generated.
# Note: random.randbytes requires Python 3.9 or newer.
def random_data():
    import random
    for _ in range(10):
        yield random.randbytes(1024)

zs.add(random_data(), "random.bin")

# Add a file containing some static text.
# Will automatically be encoded to bytes before being added (uses utf-8).
zs.add("This is some text", "README.txt")

# Write out the zip file as it's being generated.
# At this point the data in the files will be read in and the generator
# will be iterated over.
with open("files.zip", "wb") as f:
    f.writelines(zs)
```
### zipserver (included)
A fully-functional and useful example can be found in the included
[`zipstream.server`](zipstream/server.py) module. It's a clone of Python's built-in `http.server`
with the added ability to serve multiple files and folders as a single zip file. Try it out by
installing the package and running `zipserver --help` or `python -m zipstream.server --help`.

### Integration with a Flask webapp
A very basic [Flask](https://flask.palletsprojects.com/)-based file server that streams all the
files under the requested path to the client as a zip file. It provides the total size of the stream
in the `Content-Length` header so the client can show a progress bar as the stream is downloaded. It
also provides a `Last-Modified` header so the client can check if it already has the most recent
copy of the zipped data with a `HEAD` request instead of having to download the file and check.

Note that while this example works, it's not a good idea to deploy it as-is due to the lack of input
validation and other checks.
```python
import os.path
from flask import Flask, Response
from zipstream import ZipStream

app = Flask(__name__)

@app.route("/", defaults={"path": "."})
@app.route("/<path:path>")
def stream_zip(path):
    name = os.path.basename(os.path.abspath(path))
    zs = ZipStream.from_path(path)
    return Response(
        zs,
        mimetype="application/zip",
        headers={
            "Content-Disposition": f"attachment; filename={name}.zip",
            "Content-Length": len(zs),
            "Last-Modified": zs.last_modified,
        }
    )

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```
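
On the client side, the check described above amounts to comparing the `Last-Modified` value from a `HEAD` response with the timestamp of the local copy. Below is a minimal sketch of that comparison using only the standard library. `needs_download` is a hypothetical helper (it is not part of zipstream-ng or Flask), and it assumes the local timestamp is a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

def needs_download(last_modified_header, local_copy_time):
    """Return True if the server's zip is newer than the local copy.

    last_modified_header: raw `Last-Modified` header value (or None)
    local_copy_time: timezone-aware datetime of the copy already on disk
    """
    if not last_modified_header:
        return True  # Nothing to compare against: download to be safe
    try:
        remote = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        return True  # Unparseable header: download to be safe
    if remote.tzinfo is None:
        # Treat a date without a usable timezone as UTC
        remote = remote.replace(tzinfo=timezone.utc)
    return remote > local_copy_time

# Build a header value the way a server would (HTTP dates are always in GMT)
modified = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(modified, usegmt=True)

print(needs_download(header, modified))                      # local copy is current
print(needs_download(header, modified - timedelta(days=1)))  # local copy is older
```

Falling back to a download whenever the header is missing or malformed keeps the helper on the safe side: the worst case is an unnecessary transfer, never a stale file.
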
### Partial generation and last-minute file additions
It's possible to generate the zip stream, but stop before finalizing it. This enables adding
something like a file manifest or compression log after all the files have been added.

`ZipStream` provides an `info_list` method that returns information on all the files added to the
stream. In this example, all that information will be added to the zip in a file named
"manifest.json" before finalizing it.
```python
from zipstream import ZipStream
import json

def gen_zipfile():
    zs = ZipStream.from_path("/path/to/files")
    yield from zs.all_files()
    zs.add(
        json.dumps(
            zs.info_list(),
            indent=2
        ),
        "manifest.json"
    )
    yield from zs.finalize()
```
Comparison to stdlib
--------------------
Since Python 3.6, it has been possible to generate zip files as a stream using just the
standard library; it just hasn't been very ergonomic or efficient. Consider the typical use case of
zipping up a directory of files while streaming it over a network connection.

Note that the size of the stream is not pre-calculated in these examples, as doing so would make the
stdlib example far too long.

Using ZipStream:
```python
from zipstream import ZipStream

send_stream(
    ZipStream.from_path("/path/to/files/")
)
```
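
`send_stream` is not defined in this README; it stands for whatever sends the generated data over the network. As a rough stand-in, assuming a connected socket and any iterable of `bytes` chunks, it could look like the sketch below. The extra `sock` parameter is an assumption of this sketch and does not appear in the calls shown in this section.

```python
import socket

def send_stream(chunks, sock):
    """Send each chunk of an iterable of bytes over a connected socket."""
    sent = 0
    for chunk in chunks:
        sock.sendall(chunk)  # Returns once the whole chunk has been sent
        sent += len(chunk)
    return sent

# Demo using a local socket pair instead of a real network connection
writer, reader = socket.socketpair()
with writer, reader:
    sent = send_stream([b"first chunk, ", b"second chunk"], writer)
    writer.shutdown(socket.SHUT_WR)  # Signal end-of-stream to the reader
    with reader.makefile("rb") as fp:
        received = fp.read()

print(sent, len(received))
```

Because the function only iterates over its input, the same code works whether it is handed a `ZipStream` or the stdlib-based generator: in both cases only one chunk needs to be held in memory at a time.
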
<details>
<summary>The same(ish) functionality using just the stdlib:</summary>

```python
import os
import io
from zipfile import ZipFile, ZipInfo

class Stream(io.RawIOBase):
    """An unseekable stream for the ZipFile to write to"""

    def __init__(self):
        self._buffer = bytearray()
        self._closed = False

    def close(self):
        self._closed = True

    def write(self, b):
        if self._closed:
            raise ValueError("Can't write to a closed stream")
        self._buffer += b
        return len(b)

    def readall(self):
        chunk = bytes(self._buffer)
        self._buffer.clear()
        return chunk

def iter_files(path):
    for dirpath, _, files in os.walk(path, followlinks=True):
        if not files:
            yield dirpath  # Preserve empty directories
        for f in files:
            yield os.path.join(dirpath, f)

def read_file(path):
    with open(path, "rb") as fp:
        while True:
            buf = fp.read(1024 * 64)
            if not buf:
                break
            yield buf

def generate_zipstream(path):
    stream = Stream()
    with ZipFile(stream, mode="w") as zf:
        toplevel = os.path.basename(os.path.normpath(path))
        for f in iter_files(path):
            # Use the basename of the path to set the arcname
            arcname = os.path.join(toplevel, os.path.relpath(f, path))
            zinfo = ZipInfo.from_file(f, arcname)

            # Write data to the zip file then yield the stream content
            with zf.open(zinfo, mode="w") as fp:
                if zinfo.is_dir():
                    continue
                for buf in read_file(f):
                    fp.write(buf)
                    yield stream.readall()
    yield stream.readall()

send_stream(
    generate_zipstream("/path/to/files/")
)
```
</details>

Tests
-----
This package contains extensive tests. To run them, install `pytest` (`pip install pytest`) and run
`pytest` in the project directory.

License
-------
Licensed under the [GNU LGPLv3](https://www.gnu.org/licenses/lgpl-3.0.html).

Raw data
{
"_id": null,
"home_page": "https://github.com/pR0Ps/zipstream-ng",
"name": "zipstream-ng",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.5.0",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/11/f2/690a35762cf8366ce6f3b644805de970bd6a897ca44ce74184c7b2bc94e7/zipstream_ng-1.9.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "LGPL-3.0-only",
"summary": "A modern and easy to use streamable zip file generator",
"version": "1.9.0",
"project_urls": {
"Changelog": "https://github.com/pR0Ps/zipstream-ng/blob/master/CHANGELOG.md",
"Homepage": "https://github.com/pR0Ps/zipstream-ng",
"Source": "https://github.com/pR0Ps/zipstream-ng"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "de62c2da1c495291a52e561257d017585e08906d288035d025ccf636f6b9a266",
"md5": "173f5cd3de38fd504fe8161218d5b941",
"sha256": "31dc2cf617abdbf28d44f2e08c0d14c8eee2ea0ec26507a7e4d5d5f97c564b7a"
},
"downloads": -1,
"filename": "zipstream_ng-1.9.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "173f5cd3de38fd504fe8161218d5b941",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.5.0",
"size": 24852,
"upload_time": "2025-08-29T01:03:35",
"upload_time_iso_8601": "2025-08-29T01:03:35.046253Z",
"url": "https://files.pythonhosted.org/packages/de/62/c2da1c495291a52e561257d017585e08906d288035d025ccf636f6b9a266/zipstream_ng-1.9.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "11f2690a35762cf8366ce6f3b644805de970bd6a897ca44ce74184c7b2bc94e7",
"md5": "eb08f32b64b28005d5d3846cf49aabd3",
"sha256": "a0d94030822d137efbf80dfdc680603c42f804696f41147bb3db895df667daea"
},
"downloads": -1,
"filename": "zipstream_ng-1.9.0.tar.gz",
"has_sig": false,
"md5_digest": "eb08f32b64b28005d5d3846cf49aabd3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.5.0",
"size": 37963,
"upload_time": "2025-08-29T01:03:36",
"upload_time_iso_8601": "2025-08-29T01:03:36.323501Z",
"url": "https://files.pythonhosted.org/packages/11/f2/690a35762cf8366ce6f3b644805de970bd6a897ca44ce74184c7b2bc94e7/zipstream_ng-1.9.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-29 01:03:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pR0Ps",
"github_project": "zipstream-ng",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "zipstream-ng"
}