
-   Name: rwkit
-   Version: 2.0.0
-   Summary: Simplified reading & writing files with support for compression
-   Upload time: 2024-08-31 16:59:01
-   Author: David Adametz
-   Keywords: io, compression, json, jsonl, yaml
# rwkit

`rwkit` is a Python package that simplifies reading and writing various file formats, including text, json, jsonl and yaml. It supports transparent handling of compression, and allows for processing large files in chunks.

## Features

-   Easy-to-use functions for reading and writing text, json, jsonl and yaml files.
-   Transparent compression support: bz2, gzip, tar, tar.bz2, tar.gz, tar.xz, xz, zip, zstd.
-   Generator functions for processing large files in chunks.

## Installation

Install `rwkit` using pip:

```bash
pip install rwkit
```

### Optional Dependencies

`rwkit` comes with optional features that you can install based on your needs:

```bash
pip install rwkit[zstd]  # For Zstandard compression support
pip install rwkit[yaml]  # For YAML file handling
pip install rwkit[all]   # For all optional features
```

## Quick Start

Here are some examples to get you started:

### Reading and Writing Text Files

Using a single string:

```python
import rwkit as rw

# Sample text
text = "Hello, rwkit!"

# Write a string
rw.write_text("file.txt", text)

# Append another string
rw.write_text("file.txt", "\nNice to meet you.", mode="a")

# Read file
loaded_text = rw.read_text("file.txt")

print(repr(loaded_text))
# Output: 'Hello, rwkit!\nNice to meet you.'
```

Using lines (a list of strings):

```python
import rwkit as rw

# Sample lines
lines = ["Hello, rwkit!", "Nice to meet you."]

# Write lines, each element on its own line (separated by '\n')
rw.write_lines("file.txt", lines)

# Append one more line
rw.write_lines("file.txt", "What a beautiful day.", mode="a")

# Read file (transparently splits on '\n')
loaded_lines = rw.read_lines("file.txt")

print(loaded_lines)
# Output: ['Hello, rwkit!', 'Nice to meet you.', 'What a beautiful day.']
```

### Reading and Writing JSON Files

Using a single object:

```python
import rwkit as rw

# Sample data
data = {"name": "Alice", "age": 25}

# Write data to a JSON file
rw.write_json("file.json", data)

# Read data
loaded_data = rw.read_json("file.json")

print(loaded_data)
# Output: {'name': 'Alice', 'age': 25}
```

### Reading and Writing JSONL (JSON Lines) Files

Using multiple objects, each on its own line. This format is especially useful for large files that are processed in chunks (see "Reading Large Files in Chunks" below).

```python
import rwkit as rw

# Sample data
data = [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 30},
]

# Write data to a JSONL file
rw.write_jsonl("file.jsonl", data)

# Read data
loaded_data = rw.read_jsonl("file.jsonl")

print(loaded_data)
# Output: [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}]
```
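To make the on-disk format concrete, here is a minimal sketch that uses only the standard library `json` module. It is not how `rwkit` implements `write_jsonl` or `read_jsonl`; the helper names `dump_jsonl` and `load_jsonl` are hypothetical and exist only to show that a JSONL file holds one JSON document per line.

```python
import json
import os
import tempfile


def dump_jsonl(path, records):
    """Write each record as one JSON document per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def load_jsonl(path):
    """Parse every non-empty line as its own JSON document (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


records = [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]
path = os.path.join(tempfile.mkdtemp(), "file.jsonl")
dump_jsonl(path, records)

# Inspect the raw file: two lines, one JSON object on each.
with open(path, encoding="utf-8") as f:
    raw = f.read()
print(raw)
```

Because every line parses on its own, a reader can stop after any line, which is what makes the format convenient for chunked processing.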

### Reading and Writing YAML Files

Note: Requires the `pyyaml` package.

```python
import rwkit as rw

# Sample data
data = {"name": "Alice", "age": 25}

# Write to a YAML file
rw.write_yaml("file.yaml", data)

# Read a YAML file
loaded_data = rw.read_yaml("file.yaml")

print(loaded_data)
# Output: {'name': 'Alice', 'age': 25}
```

## Compression

`rwkit` supports various compression formats via the `compression` argument. The default, `compression='infer'`, infers the format from the filename extension:

```python
import rwkit as rw

# Sample text
text = "Hello, rwkit!"

# Write to a gzip compressed text file, inferred from the filename extension
rw.write_text("file.txt.gz", text)

# Read a gzip compressed text file
loaded_text = rw.read_text("file.txt.gz")

print(repr(loaded_text))
# Output: 'Hello, rwkit!'
```

Alternatively, specify `compression` explicitly (see the table below for all available options):

```python
import rwkit as rw

# Sample text
text = "Hello, rwkit!"

# Write to a gzip compressed text file, explicitly specified
rw.write_text("file.txt.gz", text, compression="gzip")

# Read a gzip compressed text file, explicitly specified
loaded_text = rw.read_text("file.txt.gz", compression="gzip")

print(repr(loaded_text))
# Output: 'Hello, rwkit!'
```
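For intuition about what an explicit `compression` value selects, the following sketch dispatches to standard library openers by name. It is an illustration under stated assumptions, not `rwkit`'s implementation: the `OPENERS` mapping and the `open_text` helper are hypothetical, and only the single-file formats bz2, gzip and xz are covered.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Hypothetical mapping from a compression name to a stdlib opener.
# tar archives, zip and zstd are deliberately left out of this sketch.
OPENERS = {
    None: open,
    "bz2": bz2.open,
    "gzip": gzip.open,
    "xz": lzma.open,
}


def open_text(path, mode="r", compression=None):
    """Open `path` in text mode through the opener chosen by `compression`."""
    try:
        opener = OPENERS[compression]
    except KeyError:
        raise ValueError(f"unsupported compression: {compression!r}") from None
    # The compressed openers default to binary, so request text mode explicitly.
    return opener(path, mode + "t", encoding="utf-8")


path = os.path.join(tempfile.mkdtemp(), "file.txt.gz")

with open_text(path, "w", compression="gzip") as f:
    f.write("Hello, rwkit!")

with open_text(path, compression="gzip") as f:
    loaded_text = f.read()

# A gzip stream always starts with the magic bytes 0x1f 0x8b.
with open(path, "rb") as f:
    magic = f.read(2)
```

Keeping the name-to-opener mapping in one dictionary means an unsupported name fails early with a clear error instead of silently writing uncompressed data.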

When `compression='infer'`, the following rules apply:

| File extension    | Inferred compression |
| ----------------- | -------------------- |
| `.tar`            | `tar`                |
| `.tar.bz2`        | `tar.bz2`            |
| `.tar.gz`         | `tar.gz`             |
| `.tar.xz`         | `tar.xz`             |
| `.bz2`            | `bz2`                |
| `.gz`             | `gzip`               |
| `.xz`             | `xz`                 |
| `.zip`            | `zip`                |
| `.zst`            | `zstd`               |
| [everything else] | None                 |
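
The rules in the table can be expressed as a simple suffix lookup. The sketch below is one possible way to write it, not the code `rwkit` ships: `EXTENSIONS` and `infer_compression` are hypothetical names, and matching is case-sensitive here by assumption. The one detail worth noting is ordering: the `.tar.*` suffixes must be checked before `.gz`, `.bz2` and `.xz`.

```python
# Extensions from the table above. The compound ".tar.*" suffixes come
# first so that "data.tar.gz" is not mistaken for a plain ".gz" file.
EXTENSIONS = [
    (".tar.bz2", "tar.bz2"),
    (".tar.gz", "tar.gz"),
    (".tar.xz", "tar.xz"),
    (".tar", "tar"),
    (".bz2", "bz2"),
    (".gz", "gzip"),
    (".xz", "xz"),
    (".zip", "zip"),
    (".zst", "zstd"),
]


def infer_compression(filename):
    """Return the compression name for `filename`, or None if nothing matches."""
    name = str(filename)
    for extension, compression in EXTENSIONS:
        if name.endswith(extension):
            return compression
    return None


for example in ["file.txt.gz", "data.tar.gz", "file.jsonl.zst", "file.txt"]:
    print(example, "->", infer_compression(example))
```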

## Reading Large Files in Chunks

Both text and jsonl files can be read in chunks using the `chunksize` argument. This also works in combination with `compression`.

```python
import rwkit as rw

# Assume a large text file, optionally compressed
for chunk in rw.read_lines("file.txt", chunksize=3):
    print(chunk)
    # Output: ['Hello, rwkit!', 'Nice to meet you.', 'What a beautiful day.']
    # ...

# The same works for jsonl files
for chunk in rw.read_jsonl("file.jsonl", chunksize=3):
    print(chunk)
    # Output: [{'name': 'Alice'}, {'name': 'Bob'}, {'name': 'Charlie'}]
    # ...
```
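
The idea behind chunked reading can be sketched with a plain generator and `itertools.islice` from the standard library. This is an illustration only and makes no claim about `rwkit`'s internals; `iter_line_chunks` is a hypothetical helper, and in this sketch the final chunk may be shorter than `chunksize`.

```python
import os
import tempfile
from itertools import islice


def iter_line_chunks(path, chunksize):
    """Yield lists of at most `chunksize` lines without loading the whole file."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    with open(path, encoding="utf-8") as f:
        # Lazily strip the trailing newline from each line as it is read.
        stripped = (line.rstrip("\n") for line in f)
        while True:
            chunk = list(islice(stripped, chunksize))
            if not chunk:
                return
            yield chunk


# Build a small seven-line sample file to iterate over.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(f"line {i}" for i in range(1, 8)))

chunks = list(iter_line_chunks(path, chunksize=3))
for chunk in chunks:
    print(chunk)
```

Only one chunk is held in memory at a time while iterating, which is the point of reading large files this way.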

## License

`rwkit` is released under the Apache License, Version 2.0. See the LICENSE file for details.

