jsonlutils


- Name: jsonlutils
- Version: 0.1.1
- Summary: Utilities for working with JSONL and JSON formats
- Upload time: 2025-08-25 20:53:57
- Requires Python: >=3.8
- License: MIT
- Keywords: json, jsonl, converter, utilities

# jsonlutils

A lightweight Python library for working with **JSONL (JSON Lines)** and **JSON** files.  
Provides easy conversion, validation, and utility functions.

## ✨ Features

- Convert JSONL → JSON (with metadata)
- Convert JSON → JSONL
- Validate JSONL consistency
- Extract keys from JSONL
- Get statistics (average values, counts, etc.)
- Stream JSONL objects with a generator

## 📦 Installation

```bash
pip install jsonlutils
```

Then import the package in Python: `import jsonlutils as ju`

## Functions 🚀 Quick Start

### Convert JSONL → JSON

Reads a JSONL file, converts it to a single JSON document, writes that document to the specified output file, and returns it as a dictionary.

Parameters:

- `input_file` (str): the path to the input JSONL file
- `output_file` (str): the path to the output JSON file
- `json_indent` (int): the number of spaces to use for indentation in the output JSON file
- `jsonl_start_index` (int): the starting index of the JSONL lines to include in the output JSON file
- `jsonl_end_index` (int): the ending index of the JSONL lines to include in the output JSON file
- `print_error_logs` (bool): whether to print error logs for invalid JSON lines in the input file
- `print_conversion_summary` (bool): whether to print a summary of the conversion process
- `return_json_dict` (bool): whether to return the JSON dictionary object
- `sort_data` (bool): whether to sort the data array in the output JSON file
- `sort_data_key` (str): the key to sort the data array by if `sort_data` is True
- `sort_descending` (bool): whether to sort the data array in descending order if `sort_data` is True

Returns the JSON dictionary unless `return_json_dict` is set to False.

Expected output format, returned and/or written to file:

    {
        "config": {
            "name": "Converted Dataset",
            "Original_Data_set_filename": "input.jsonl",
            "number_of_objects": "100",
            "number_of_objects_in_original_file": "150"
        },
        "data": [
            {...},
            {...}
        ]
    }

Basic usage:

    ju.convert_jsonl_to_json("data.jsonl", "out_data.json")
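
The output shape described above can be reproduced with the standard library alone. The following is a minimal sketch, not the library's implementation: the function name `jsonl_to_json_sketch` is hypothetical, and it assumes that invalid lines are counted in `number_of_objects_in_original_file` but left out of `data`, and that blank lines are ignored.

```python
import json
import os
import tempfile


def jsonl_to_json_sketch(input_file, output_file, json_indent=4):
    """Hypothetical stand-in that builds the config/data structure shown above."""
    data = []
    total_lines = 0
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are skipped and not counted
            total_lines += 1
            try:
                data.append(json.loads(line))
            except json.JSONDecodeError:
                pass  # assumption: invalid lines are counted but not converted
    result = {
        "config": {
            "name": "Converted Dataset",
            "Original_Data_set_filename": os.path.basename(input_file),
            # The README example shows these counts as strings.
            "number_of_objects": str(len(data)),
            "number_of_objects_in_original_file": str(total_lines),
        },
        "data": data,
    }
    with open(output_file, "w", encoding="utf-8") as dst:
        json.dump(result, dst, indent=json_indent)
    return result


# Small demonstration on a temporary file that contains one invalid line.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "input.jsonl")
    out_path = os.path.join(tmp, "out_data.json")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write('{"id": 1}\n{"id": 2}\nnot json\n')
    demo_result = jsonl_to_json_sketch(src_path, out_path)
    with open(out_path, "r", encoding="utf-8") as f:
        demo_written = json.load(f)

print(demo_result["config"])
```

The dictionary returned and the file written hold the same content, which mirrors the "returned and/or written to file" wording above.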

### Validate consistency

Checks that every line in a JSONL file is valid JSON and that all JSON objects have the same keys.

Parameters:

- `input_file` (str): the path to the input JSONL file
- `print_error_messages` (bool): whether to print error messages for inconsistent keys or invalid JSON lines

Returns True if all JSON objects have the same keys.

Usage:

    is_consistent = ju.check_jsonl_is_consistent("data.jsonl")
    print("Consistent:", is_consistent)
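
To make the notion of "consistent" concrete, here is a small standard-library sketch of such a check. It is an illustration under stated assumptions, not the library's code: `is_consistent_sketch` is a hypothetical name, the first object is taken as the reference key set, key order is ignored, and blank lines are skipped.

```python
import json
import os
import tempfile


def is_consistent_sketch(path):
    """Return True if every non-blank line is a JSON object with the same keys."""
    expected_keys = None
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are ignored
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                return False  # an invalid line makes the file inconsistent
            if not isinstance(obj, dict):
                return False  # only JSON objects carry keys
            if expected_keys is None:
                expected_keys = set(obj)  # the first object defines the key set
            elif set(obj) != expected_keys:
                return False
    return True


# Demonstration: one file with matching keys, one with a missing key.
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.jsonl")
    bad_path = os.path.join(tmp, "bad.jsonl")
    with open(good_path, "w", encoding="utf-8") as f:
        f.write('{"a": 1, "b": 2}\n{"b": 3, "a": 4}\n')
    with open(bad_path, "w", encoding="utf-8") as f:
        f.write('{"a": 1, "b": 2}\n{"a": 5}\n')
    good_result = is_consistent_sketch(good_path)
    bad_result = is_consistent_sketch(bad_path)

print(good_result, bad_result)
```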

### Get keys

Goes through a JSONL file and finds all unique keys in the JSON objects.

Parameters:

- `input_file` (str): the path to the input JSONL file
- `print_error_message` (bool): whether to print error messages for invalid JSON lines

Returns: set - the set of all unique keys found in the JSON objects

Usage:

    keys = ju.find_all_jsonl_keys("data.jsonl")
    print("Keys found:", keys)
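
Collecting the keys amounts to taking the union of the top-level keys of every object. The sketch below shows that idea with the standard library only; `find_keys_sketch` is a hypothetical name, and skipping invalid lines silently is an assumption rather than documented behavior.

```python
import json
import os
import tempfile


def find_keys_sketch(path):
    """Return the union of top-level keys over all JSON objects in the file."""
    keys = set()
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # assumption: invalid lines are skipped
            if isinstance(obj, dict):
                keys.update(obj)  # adds the object's top-level keys
    return keys


# Demonstration with two objects that share only the "id" key.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"id": 1, "name": "a"}\n{"id": 2, "score": 0.5}\nnot json\n')
    demo_keys = find_keys_sketch(path)

print(sorted(demo_keys))
```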

### Get JSON objects through a generator

Reads a JSONL file and yields its JSON objects one at a time.

Parameters:

- `input_file` (str): the path to the input JSONL file

Yields: dict - the next JSON object in the file

Usage:

    for obj in ju.get_json_objects_from_jsonl_yield("data.jsonl"):
        print(obj)

### Get list of JSON objects

Reads a JSONL file and returns a list of JSON objects.

Parameters:

- `input_file` (str): the path to the input JSONL file

Returns: list - a list of the JSON objects in the file

Usage:

    json_list = ju.get_list_of_json_objects_from_jsonl("data.jsonl")
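
The generator and list readers differ mainly in memory use: a generator holds one object at a time, while a list holds the whole file. A minimal sketch of that relationship, using only the standard library, follows. The name `iter_jsonl_sketch` is hypothetical, and skipping blank or invalid lines is an assumption.

```python
import json
import os
import tempfile


def iter_jsonl_sketch(path):
    """Yield one parsed JSON value per valid line, reading the file lazily."""
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # assumption: invalid lines are skipped
            yield obj


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"id": 1}\n{"id": 2}\n{"id": 3}\n')
    # Streaming: only one object is held in memory per iteration.
    demo_ids = [obj["id"] for obj in iter_jsonl_sketch(path)]
    # Materializing: the list form is the generator consumed in full.
    demo_list = list(iter_jsonl_sketch(path))

print(demo_ids, len(demo_list))
```

For large files the streaming form is usually the safer default, since the list form grows with the size of the input.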

### Get average numeric value

Reads a JSONL file and returns the average value of a specified key.

Parameters:

- `input_file` (str): the path to the input JSONL file
- `key` (str): the key to calculate the average value for; its values need to be numeric
- `print_error_messages` (bool): whether to print error messages for invalid JSON lines or non-numeric values

Returns: float - the average value of the specified key

Usage:

    average_val = ju.get_average_value_of_jsonl_value("data.jsonl", "num_key")
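
The averaging step can be sketched as a single pass that keeps a running total and count. This is an illustration only: `average_sketch` is a hypothetical name, and the handling of edge cases (objects without the key and non-numeric values are skipped, `None` is returned when nothing numeric is found) is an assumption, not the library's documented behavior.

```python
import json
import os
import tempfile


def average_sketch(path, key):
    """Average the numeric values stored under `key`, skipping everything else."""
    total = 0.0
    count = 0
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # assumption: invalid lines are skipped
            if not isinstance(obj, dict):
                continue
            value = obj.get(key)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            total += value
            count += 1
    return total / count if count else None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"num_key": 2}\n{"num_key": 4}\n{"num_key": "n/a"}\n{"other": 1}\n')
    demo_average = average_sketch(path, "num_key")
    demo_missing = average_sketch(path, "absent_key")

print(demo_average, demo_missing)
```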

### Get number of valid JSON objects

Reads a JSONL file and returns the number of valid JSON objects and the total number of lines in the file.

Parameters:

- `input_file` (str): the path to the JSONL file to read

Returns: int, int - the number of valid JSON objects, then the total number of lines

Usage:

    valid_count, total_lines = ju.get_number_of_json_objects_in_jsonl("data.jsonl")
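
Counting valid objects against total lines gives a quick measure of how clean a file is. A minimal standard-library sketch is shown below; `count_sketch` is a hypothetical name, and it assumes that every physical line (blank ones included) counts toward the total and that any line that parses as JSON counts as valid.

```python
import json
import os
import tempfile


def count_sketch(path):
    """Return (valid_count, total_lines) for a JSONL file."""
    valid_count = 0
    total_lines = 0
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            total_lines += 1  # assumption: every physical line is counted
            line = line.strip()
            if not line:
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError:
                continue
            valid_count += 1
    return valid_count, total_lines


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"a": 1}\nnot json\n{"a": 2}\n')
    demo_valid, demo_total = count_sketch(path)

print(demo_valid, demo_total)
```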

### Convert JSON → JSONL

Takes a JSON file and converts it to a JSONL file: an array of keys selects the data in the JSON object, and each object in the data array is written to its own line in the JSONL file, with the same keys on every line.

Parameters:

- `input_file` (str): the path to the input JSON file
- `output_file` (str): the path to the output JSONL file
- `data_key_array` (list): the keys to extract from each object in the data array and write to the JSONL file

Returns: bool - True if the conversion was successful, False otherwise

Usage:

    ju.convert_json_to_jsonl("data.json", "data.jsonl", data_key="data")
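
The reverse conversion can be sketched as writing each element of a data array on its own line. The code below is a simplified illustration and makes several assumptions: `json_to_jsonl_sketch` is a hypothetical name, the parameter name `data_key` follows the usage line above rather than the parameter list, only a single key is handled, and each record is written whole without any per-key extraction.

```python
import json
import os
import tempfile


def json_to_jsonl_sketch(input_file, output_file, data_key="data"):
    """Write each element of document[data_key] as one JSON line."""
    try:
        with open(input_file, "r", encoding="utf-8") as src:
            document = json.load(src)
        records = document[data_key]
        if not isinstance(records, list):
            return False  # the selected value must be an array
        with open(output_file, "w", encoding="utf-8") as dst:
            for record in records:
                dst.write(json.dumps(record) + "\n")
    except (OSError, KeyError, TypeError, json.JSONDecodeError):
        return False  # unreadable file, missing key, or malformed JSON
    return True


with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "data.json")
    jsonl_path = os.path.join(tmp, "data.jsonl")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump({"config": {"name": "demo"}, "data": [{"id": 1}, {"id": 2}]}, f)
    demo_ok = json_to_jsonl_sketch(json_path, jsonl_path)
    with open(jsonl_path, "r", encoding="utf-8") as f:
        demo_lines = f.read().splitlines()
    demo_missing_key = json_to_jsonl_sketch(json_path, jsonl_path, data_key="rows")

print(demo_ok, demo_lines, demo_missing_key)
```

Returning a boolean instead of raising matches the "True if the conversion was successful, False otherwise" contract described above.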

## 📂 Project Structure

    jsonlutils/
    │
    ├── src/jsonlutils/        # Core library code
    │   ├── __init__.py
    │   └── converter.py
    │
    ├── tests/                 # Unit tests
    │   └── test_converter.py
    │
    ├── pyproject.toml         # Build metadata
    ├── README.md              # This file
    ├── LICENSE                # MIT license
    └── .gitignore

## 📄 License

MIT License. See LICENSE for details.

## Contributing

1. Fork the repo
2. Create a new branch (`feature/my-feature`)
3. Commit your changes
4. Push the branch and open a Pull Request

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "jsonlutils",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "json, jsonl, converter, utilities",
    "author": null,
    "author_email": "dja322 <dannyja0112@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/65/d4/6b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3/jsonlutils-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Utilities for working with JSONL and JSON formats",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/dja322/JsonlConverterUtils"
    },
    "split_keywords": [
        "json",
        " jsonl",
        " converter",
        " utilities"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3ae35605bed01099c4d75bb17e4fc3058ae5466154c8321eab1d270797a46766",
                "md5": "42432ebc7b2adb496953f539c483a5a2",
                "sha256": "6bc7c8e7f8982b651f70fb2a894115d2ccff0a741f4d4af7f9d9ec861eb98996"
            },
            "downloads": -1,
            "filename": "jsonlutils-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "42432ebc7b2adb496953f539c483a5a2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 7476,
            "upload_time": "2025-08-25T20:53:56",
            "upload_time_iso_8601": "2025-08-25T20:53:56.835410Z",
            "url": "https://files.pythonhosted.org/packages/3a/e3/5605bed01099c4d75bb17e4fc3058ae5466154c8321eab1d270797a46766/jsonlutils-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "65d46b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3",
                "md5": "665523dea349f554a66a93bbf9d2b68b",
                "sha256": "2cc01cddd5b5c98d7295baa63265d8f709a78c60f1378924de9b4e22b01e7dcb"
            },
            "downloads": -1,
            "filename": "jsonlutils-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "665523dea349f554a66a93bbf9d2b68b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 7932,
            "upload_time": "2025-08-25T20:53:57",
            "upload_time_iso_8601": "2025-08-25T20:53:57.682484Z",
            "url": "https://files.pythonhosted.org/packages/65/d4/6b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3/jsonlutils-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-25 20:53:57",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dja322",
    "github_project": "JsonlConverterUtils",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "jsonlutils"
}
        