py-jsonl


- Name: py-jsonl
- Version: 1.1.2
- Home page: https://github.com/rmoralespp/jsonl
- Summary: A simple Python library for handling jsonlines files.
- Upload time: 2024-08-19 09:29:18
- Author: rmoralespp
- Requires Python: >=3.8
- License: MIT
- Keywords: jsonlines, ndjson, jsonl
- Requirements: none recorded

# jsonl

[![CI](https://github.com/rmoralespp/jsonl/workflows/CI/badge.svg)](https://github.com/rmoralespp/jsonl/actions?query=event%3Arelease+workflow%3ACI)
[![pypi](https://img.shields.io/pypi/v/py-jsonl.svg)](https://pypi.python.org/pypi/py-jsonl)
[![versions](https://img.shields.io/pypi/pyversions/py-jsonl.svg)](https://github.com/rmoralespp/jsonl)
[![codecov](https://codecov.io/gh/rmoralespp/jsonl/branch/main/graph/badge.svg)](https://app.codecov.io/gh/rmoralespp/jsonl)
[![license](https://img.shields.io/github/license/rmoralespp/jsonl.svg)](https://github.com/rmoralespp/jsonl/blob/main/LICENSE)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Linter: ruff](https://img.shields.io/badge/linter-_ruff-orange)](https://github.com/charliermarsh/ruff)

### About

Utility functions for working with jsonlines data, as described at https://jsonlines.org/.

Features:

- Exposes an API similar to the `json` module from the standard library.
- Supports `orjson`, `ujson` libraries or standard `json` for serialization/deserialization, prioritizing `orjson`,
  then `ujson`, and defaulting to the standard `json` if none are installed.
- Supports `gzip` and `bzip2` compression formats.
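
The backend priority listed above can be pictured as a plain import cascade. The following is a minimal sketch of that idea, not the library's actual code; `pick_backend` is a hypothetical helper name introduced only for illustration.

```python
import json


def pick_backend():
    """Return (name, dumps, loads) for the first JSON library that imports."""
    try:
        import orjson

        # orjson.dumps returns bytes, so decode to keep a str-based interface.
        def dumps(obj):
            return orjson.dumps(obj).decode("utf-8")

        return "orjson", dumps, orjson.loads
    except ImportError:
        pass
    try:
        import ujson

        return "ujson", ujson.dumps, ujson.loads
    except ImportError:
        # The standard library is always available as the last resort.
        return "json", json.dumps, json.loads


name, dumps, loads = pick_backend()
print(name, loads(dumps({"foo": 1})))
```

Whichever backend is picked, a value should survive a round trip through `dumps` and `loads` unchanged.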

### Installation (via pip)

```pip install py-jsonl```

### Usage

##### dumps

Serialize an iterable into a jsonlines formatted string.

```
dumps(iterable, **kwargs)

:param Iterable[Any] iterable: Iterable of objects
:param kwargs: `json.dumps` kwargs
:rtype: str
```

Examples:

```python
import jsonl

data = ({'foo': 1}, {'bar': 2})
result = jsonl.dumps(data)
print(result)  # >> '{"foo": 1}\n{"bar": 2}\n'
```
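
For comparison, the same output can be produced with only the standard `json` module. This is a rough equivalent for illustration, assuming the standard-library backend; `to_jsonlines` is a hypothetical name, not part of the `jsonl` API.

```python
import json


def to_jsonlines(iterable, **kwargs):
    """Serialize each item with json.dumps and terminate it with a newline."""
    return "".join(json.dumps(obj, **kwargs) + "\n" for obj in iterable)


text = to_jsonlines(({"foo": 1}, {"bar": 2}))
print(repr(text))  # '{"foo": 1}\n{"bar": 2}\n'
```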

##### dump

Dump an iterable to a jsonlines file.

- Use a `.gz`, `.gzip`, or `.bz2` extension to write a compressed file.
- Serialization uses the first available of `orjson.dumps`, `ujson.dumps`, and `json.dumps`.

```
dump(iterable, file, **kwargs)

:param Iterable[Any] iterable: Iterable of objects
:param Union[str, bytes, os.PathLike, io.IOBase] file: File to dump
:param kwargs: `json.dumps` kwargs
```

Examples:

```python
import gzip

import jsonl

data = ({'foo': 1}, {'bar': 2})

# Dump the data into an uncompressed file at the given path.
jsonl.dump(data, "file1.jsonl")

# Dump the data into a gzipped file at the given path.
jsonl.dump(data, "file2.jsonl.gz")

# Dump the data into the already opened gzipped file.
with gzip.open("file3.jsonl.gz", mode="wb") as fp:
    jsonl.dump(data, fp)

# Append the data to the end of the existing gzipped file.
with gzip.open("file3.jsonl.gz", mode="ab") as fp:
    jsonl.dump(data, fp)
```
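
Choosing the compression from the file extension, and appending to an existing gzipped file, can both be reproduced with the standard library. The sketch below assumes a simple extension-to-opener table; `OPENERS` and `write_jsonlines` are hypothetical names and may not match how `jsonl.dump` is implemented.

```python
import bz2
import gzip
import json
import os
import tempfile

# Hypothetical extension-to-opener table; plain open() handles everything else.
OPENERS = {".gz": gzip.open, ".gzip": gzip.open, ".bz2": bz2.open}


def write_jsonlines(iterable, path, mode="wt"):
    """Write one JSON document per line, compressing according to the extension."""
    opener = OPENERS.get(os.path.splitext(path)[1], open)
    with opener(path, mode=mode, encoding="utf-8") as fp:
        for obj in iterable:
            fp.write(json.dumps(obj) + "\n")


data = ({"foo": 1}, {"bar": 2})
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.jsonl.gz")
    write_jsonlines(data, path)             # create the gzipped file
    write_jsonlines(data, path, mode="at")  # append a second gzip member
    with gzip.open(path, mode="rt", encoding="utf-8") as fp:
        lines = fp.read().splitlines()

print(len(lines))  # 4
```

Appending works because a gzip file may contain several concatenated members, and `gzip.open` reads them back as one continuous stream.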

##### dump_fork

Incrementally dump multiple iterables to the specified jsonlines file paths,
which reduces memory consumption.

- Use a `.gz`, `.gzip`, or `.bz2` extension to write a compressed file.
- Serialization uses the first available of `orjson.dumps`, `ujson.dumps`, and `json.dumps`.

```
dump_fork(path_iterables, dump_if_empty=True, **kwargs)

:param Iterable[Tuple[str, Iterable[Any]]] path_iterables: Iterable of (filepath, iterable) pairs
:param bool dump_if_empty: If False, don't create empty jsonlines files.
:param kwargs: `json.dumps` kwargs
```

Examples:

```python
import jsonl

path_iterables = (
    ("num.jsonl", ({"value": 1}, {"value": 2})),
    ("foo.jsonl", ({"a": "1"}, {"b": 2})),
    ("num.jsonl", ({"value": 3},)),
    ("foo.jsonl", ()),
)
jsonl.dump_fork(path_iterables)
```
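
One way to picture the incremental behaviour is a loop that keeps one open handle per path and writes items as they arrive. The sketch below is an assumption about the general technique, not the library's implementation; `fork_write` is a hypothetical name, and its handling of `dump_if_empty` is only one plausible reading of that parameter.

```python
import contextlib
import json
import os
import tempfile


def fork_write(path_iterables, dump_if_empty=True):
    """Stream (path, iterable) pairs to disk, opening each path only once."""
    handles = {}
    with contextlib.ExitStack() as stack:

        def handle_for(path):
            # Open lazily and keep the handle so repeated paths keep appending.
            if path not in handles:
                handles[path] = stack.enter_context(open(path, "w", encoding="utf-8"))
            return handles[path]

        for path, iterable in path_iterables:
            if dump_if_empty:
                handle_for(path)  # creates the file even if the iterable is empty
            for obj in iterable:
                handle_for(path).write(json.dumps(obj) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    num = os.path.join(tmp, "num.jsonl")
    foo = os.path.join(tmp, "foo.jsonl")
    empty = os.path.join(tmp, "empty.jsonl")
    fork_write(
        (
            (num, ({"value": 1}, {"value": 2})),
            (foo, ({"a": "1"}, {"b": 2})),
            (num, ({"value": 3},)),
            (empty, ()),
        ),
        dump_if_empty=False,
    )
    with open(num, encoding="utf-8") as fp:
        num_lines = fp.read().splitlines()
    empty_created = os.path.exists(empty)

print(num_lines, empty_created)
```

In this sketch, items for a repeated path end up in the same file in the order they were seen, and only one item is held in memory at a time.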

##### load

Deserialize a UTF-8-encoded jsonlines file into an iterable of Python objects.

- Recognizes the `.gz`, `.gzip`, and `.bz2` extensions to load compressed files.
- Deserialization uses the first available of `orjson.loads`, `ujson.loads`, and `json.loads`.

```
load(file, **kwargs)

:param Union[str, bytes, os.PathLike, io.IOBase] file: File to load
:param kwargs: `json.loads` kwargs
:rtype: Iterable[Any]
```

Examples:

```python
import gzip

import jsonl

# Load the uncompressed file from the given path.
iterable1 = jsonl.load("file1.jsonl")
print(tuple(iterable1))

# Load the gzipped file from the given path.
iterable2 = jsonl.load("file2.jsonl.gz")
print(tuple(iterable2))

# Load the gzipped file from the given open file.
with gzip.open("file3.jsonl.gz", mode="rb") as fp:
    iterable3 = jsonl.load(fp)
    print(tuple(iterable3))
```
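
If `load` reads lazily, as its `Iterable` return type suggests, the result has to be consumed while the underlying file is still open, which is why the last example above calls `tuple(...)` inside the `with` block. The standard-library sketch below illustrates lazy line-by-line reading; `iter_jsonlines` is a hypothetical name and not part of the `jsonl` API.

```python
import json
import os
import tempfile


def iter_jsonlines(path):
    """Yield one decoded object per non-empty line, reading lazily."""
    with open(path, encoding="utf-8") as fp:
        for line in fp:
            if line.strip():
                yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.jsonl")
    with open(path, "w", encoding="utf-8") as fp:
        fp.write('{"foo": 1}\n{"bar": 2}\n')
    items = iter_jsonlines(path)  # nothing has been read yet
    first = next(items)           # reads and parses only the first line
    rest = list(items)            # consumes the remainder and closes the file

print(first, rest)
```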

### Unit tests

```
(env)$ pip install -r requirements.txt   # Ignore this command if it has already been executed
(env)$ pytest tests/
(env)$ pytest --cov jsonl # Tests with coverage
```

            
