Name | jsonlutils |
Version | 0.1.1 |
home_page | None |
Summary | Utilities for working with JSONL and JSON formats |
upload_time | 2025-08-25 20:53:57 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | MIT |
keywords | json, jsonl, converter, utilities |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# jsonlutils
A lightweight Python library for working with **JSONL (JSON Lines)** and **JSON** files.
Provides easy conversion, validation, and utility functions.
## ✨ Features
- Convert JSONL → JSON (with metadata)
- Convert JSON → JSONL
- Validate JSONL consistency
- Extract keys from JSONL
- Get statistics (average values, counts, etc.)
- Stream JSONL objects with a generator
## 📦 Installation
```bash
pip install jsonlutils
```
Then import the package:
```python
import jsonlutils as ju
```
## Functions 🚀 Quick Start
### Convert JSONL → JSON
Converts a JSONL file to a JSON file, writes it to the specified output file, and returns the resulting JSON dictionary.
Parameters:
- input_file: str - the path to the input JSONL file
- output_file: str - the path to the output JSON file
- json_indent: int - the number of spaces to use for indentation in the output JSON file
- jsonl_start_index: int - the starting index of the JSONL lines to include in the output JSON file
- jsonl_end_index: int - the ending index of the JSONL lines to include in the output JSON file
- print_error_logs: bool - whether to print error logs for invalid JSON lines in the input file
- print_conversion_summary: bool - whether to print a summary of the conversion process
- return_json_dict: bool - whether to return the JSON dictionary object
- sort_data: bool - whether to sort the data array in the output JSON file
- sort_data_key: str - the key to sort the data array by if sort_data is True
- sort_descending: bool - whether to sort the data array in descending order if sort_data is True

Returns the JSON dictionary unless return_json_dict is set to False.
Expected output format returned and/or written to file:

    {
      "config": {
        "name": "Converted Dataset",
        "Original_Data_set_filename": "input.jsonl",
        "number_of_objects": "100",
        "number_of_objects_in_original_file": "150"
      },
      "data": [
        {...},
        {...}
      ]
    }

Basic usage:

    ju.convert_jsonl_to_json("data.jsonl", "out_data.json")

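
To make the output format above concrete, here is a minimal standard-library sketch of the same kind of conversion. It is not the jsonlutils implementation: the function name `jsonl_to_json_sketch` is hypothetical, and it assumes that `number_of_objects` counts the successfully parsed objects while `number_of_objects_in_original_file` counts all non-blank lines.

```python
import json
import os


def jsonl_to_json_sketch(input_file, output_file, json_indent=2):
    """Hypothetical stand-in for illustration, not the library's code."""
    data = []
    total_lines = 0
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored entirely (assumption)
            total_lines += 1
            try:
                data.append(json.loads(line))
            except json.JSONDecodeError:
                pass  # invalid line: counted in the total, left out of "data"

    # Counts are stored as strings to mirror the example format in this README.
    result = {
        "config": {
            "name": "Converted Dataset",
            "Original_Data_set_filename": os.path.basename(input_file),
            "number_of_objects": str(len(data)),
            "number_of_objects_in_original_file": str(total_lines),
        },
        "data": data,
    }
    with open(output_file, "w", encoding="utf-8") as dst:
        json.dump(result, dst, indent=json_indent)
    return result
```

The sketch leaves out index ranges and sorting, which the real function exposes through the parameters listed above.
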
### Validate consistency
Checks that every line in a JSONL file is valid JSON and that all JSON objects have the same keys.
Parameters:
- input_file: str - the path to the input JSONL file
- print_error_messages: bool - whether to print error messages for inconsistent keys or invalid JSON lines

Returns True if all JSON objects have the same keys.

Usage:

    is_consistent = ju.check_jsonl_is_consistent("data.jsonl")
    print("Consistent:", is_consistent)

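
For readers who want to see what such a check involves, the following is a rough standard-library sketch. The name `keys_are_consistent` is hypothetical, and the handling of edge cases (blank lines skipped, an empty file treated as consistent, non-object lines treated as failures) is an assumption rather than documented jsonlutils behavior.

```python
import json


def keys_are_consistent(input_file):
    """Hypothetical illustration: True if every line parses to a JSON
    object and all objects share exactly the same set of keys."""
    expected = None
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines (assumption)
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                return False  # an invalid line fails the check
            if not isinstance(obj, dict):
                return False  # only JSON objects have keys to compare
            keys = set(obj)
            if expected is None:
                expected = keys  # first object defines the reference key set
            elif keys != expected:
                return False
    return True
```

Comparing sets rather than lists means key order within a line does not matter.
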
### Get keys
Goes through a JSONL file and finds all unique keys in the JSON objects.
Parameters:
- input_file: str - the path to the input JSONL file
- print_error_message: bool - whether to print error messages for invalid JSON lines

Returns: set - the set of all unique keys found in the JSON objects

Usage:

    keys = ju.find_all_jsonl_keys("data.jsonl")
    print("Keys found:", keys)

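
Collecting keys amounts to taking the union of the top-level keys of every object. A small sketch under that assumption, using only the standard library (`collect_keys` is a hypothetical name, and skipping invalid lines silently is a guess at the behavior):

```python
import json


def collect_keys(input_file):
    """Hypothetical illustration: union of top-level keys across all objects."""
    keys = set()
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # invalid lines are skipped (assumption)
            if isinstance(obj, dict):
                keys.update(obj)  # adds the object's top-level keys
    return keys
```

Only top-level keys are gathered here; keys of nested objects are not visited.
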
### Get JSON objects through a generator
Reads a JSONL file and yields its JSON objects one at a time.
Parameters:
- input_file: str - the path to the input JSONL file

Yields: dict - the next JSON object in the file

Usage:

    for obj in ju.get_json_objects_from_jsonl_yield("data.jsonl"):
        print(obj)

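
The benefit of a generator is that only one line is held in memory at a time, which matters for large files. A minimal sketch of the pattern, assuming invalid lines are simply skipped (`iter_jsonl` is a hypothetical name, not part of jsonlutils):

```python
import json


def iter_jsonl(input_file):
    """Hypothetical illustration: lazily yield one parsed object per line."""
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not valid JSON (assumption)
```

Wrapping the call in `list(...)` gives the eager, list-returning variant at the cost of loading every object at once.
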
### Get list of JSON objects
Reads a JSONL file and returns a list of JSON objects.
Parameters:
- input_file: str - the path to the input JSONL file

Returns: list - a list of the JSON objects in the file

Usage:

    json_list = ju.get_list_of_json_objects_from_jsonl("data.jsonl")

### Get average numeric value of a JSONL key
Reads a JSONL file and returns the average value of a specified key.
Parameters:
- input_file: str - the path to the input JSONL file
- key: str - the key to calculate the average value for; its values must be numeric
- print_error_messages: bool - whether to print error messages for invalid JSON lines or non-numeric values

Returns: float - the average value of the specified key

Usage:

    average_val = ju.get_average_value_of_jsonl_value("data.jsonl", "num_key")

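
As a worked illustration of the averaging step, the sketch below sums the numeric values found under a key and divides by how many were found. It is an assumption-laden stand-in: `average_of_key` is a hypothetical name, booleans and non-numeric values are skipped, and `None` is returned when no numeric value exists, which may differ from what jsonlutils does.

```python
import json


def average_of_key(input_file, key):
    """Hypothetical illustration: mean of the numeric values stored under key."""
    total = 0.0
    count = 0
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # invalid line: ignored (assumption)
            value = obj.get(key) if isinstance(obj, dict) else None
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            total += value
            count += 1
    return total / count if count else None
```
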
### Get number of valid JSON objects in a JSONL file
Reads a JSONL file and returns the number of valid JSON objects and the total number of lines in the file.
Parameters:
- input_file: str - the path to the input JSONL file

Returns: int, int - the number of valid JSON objects, then the total number of lines

Usage:

    num_objects, total_lines = ju.get_number_of_json_objects_in_jsonl("data.jsonl")

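
The two counts can differ whenever a file contains blank or malformed lines. The sketch below shows one way to compute such a pair with the standard library; `count_json_objects` is a hypothetical name, and counting blank lines toward the total is an assumption.

```python
import json


def count_json_objects(input_file):
    """Hypothetical illustration: return (valid_objects, total_lines)."""
    valid = 0
    total = 0
    with open(input_file, "r", encoding="utf-8") as src:
        for line in src:
            total += 1  # every physical line counts, including blank ones
            try:
                parsed = json.loads(line)
            except json.JSONDecodeError:
                continue  # blank or malformed line
            if isinstance(parsed, dict):
                valid += 1  # only JSON objects are counted as valid
    return valid, total
```
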
### Convert JSON → JSONL
Converts a JSON file to a JSONL file: given an array of keys from the JSON object, it writes each object in the data array to its own line in the JSONL file, keeping all other keys consistent.
Parameters:
- input_file: str - the path to the input JSON file
- output_file: str - the path to the output JSONL file
- data_key_array: list - the keys to extract from each object in the data array and write to the JSONL file

Returns: bool - True if the conversion was successful, False otherwise

Usage:

    ju.convert_json_to_jsonl("data.json", "data.jsonl", data_key="data")

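
The reverse conversion can be pictured as writing each element of a JSON array on its own line. The sketch below assumes the input has the `config`/`data` layout shown earlier in this README and that a single key names the array; the function name `json_to_jsonl_sketch` and its `data_key` parameter are hypothetical and do not describe the real jsonlutils signature.

```python
import json


def json_to_jsonl_sketch(input_file, output_file, data_key="data"):
    """Hypothetical illustration: write each item of document[data_key]
    as one JSON line. Returns True on success, False otherwise."""
    try:
        with open(input_file, "r", encoding="utf-8") as src:
            document = json.load(src)
        records = document[data_key]
        if not isinstance(records, list):
            return False  # the chosen key must hold an array
        with open(output_file, "w", encoding="utf-8") as dst:
            for record in records:
                # json.dumps emits no newlines by default, so each record
                # stays on a single line.
                dst.write(json.dumps(record) + "\n")
    except (OSError, json.JSONDecodeError, KeyError, TypeError):
        return False
    return True
```
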
## 📂 Project Structure

    jsonlutils/
    │
    ├── src/jsonlutils/      # Core library code
    │   ├── __init__.py
    │   └── converter.py
    │
    ├── tests/               # Unit tests
    │   └── test_converter.py
    │
    ├── pyproject.toml       # Build metadata
    ├── README.md            # This file
    ├── LICENSE              # MIT license
    └── .gitignore

## 📄 License
MIT License. See LICENSE for details.
## Contributing
1. Fork the repo.
2. Create a new branch (feature/my-feature).
3. Commit your changes.
4. Push the branch and open a Pull Request.
Raw data
{
"_id": null,
"home_page": null,
"name": "jsonlutils",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "json, jsonl, converter, utilities",
"author": null,
"author_email": "dja322 <dannyja0112@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/65/d4/6b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3/jsonlutils-0.1.1.tar.gz",
"platform": null,
"description": "# jsonlutils\r\n\r\nA lightweight Python library for working with **JSONL (JSON Lines)** and **JSON** files. \r\nProvides easy conversion, validation, and utility functions.\r\n\r\n## \u2728 Features\r\n\r\n- Convert JSONL \u2192 JSON (with metadata)\r\n- Convert JSON \u2192 JSONL\r\n- Validate JSONL consistency\r\n- Extract keys from JSONL\r\n- Get statistics (average values, counts, etc.)\r\n- Stream JSONL objects with a generator\r\n\r\n## \ud83d\udce6 Installation\r\n\r\n```bash```\r\npip install jsonlutils\r\n\r\nimport jsonlutils as ju\r\n\r\n## Functions \ud83d\ude80 Quick Start\r\n\r\n# Convert JSONL \u2192 JSON\r\nTake in a JSONL file and convert it to a JSON file and output it to the specified output file\r\nreturn the json dictionary object\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\noutput_file: str - the path to the output JSON file\r\njson_indent: int - the number of spaces to use for indentation in the output JSON file\r\njsonl_start_index: int - the starting index of the JSONL lines to include in the output JSON file\r\njsonl_end_index: int - the ending index of the JSONL lines to include in the output JSON file\r\nprint_error_logs: bool - whether to print error logs for invalid JSON lines in the input file\r\nprint_conversion_summary: bool - whether to print a summary of the conversion process\r\nreturn_json_dict: bool - whether to return the JSON dictionary object\r\nsort_data: bool - whether to sort the data array in the output JSON file\r\nsort_data_key: str - the key to sort the data array by if sort_data is True\r\nsort_descending: bool - whether to sort the data array in descending order if sort_data is\r\n\r\nreturns JSON dictionary unless parameter set to false\r\n\r\nExpected output format returned and/or written to file:\r\n{\r\n \"config\": {\r\n \"name\": \"Converted Dataset\",\r\n \"Original_Data_set_filename\": \"input.jsonl\",\r\n \"number_of_objects\": \"100\",\r\n 
\"number_of_objects_in_original_file\": \"150\"\r\n },\r\n \"data\": [\r\n {...}, \r\n {...}\r\n ]\r\n}\r\n\r\nBase Usage\r\nju.convert_jsonl_to_json(\"data.jsonl\", \"out_data.json\")\r\n\r\n# Validate consistency\r\nChecks if all JSON objects in a JSONL file have the same keys, and that every line is valid JSON\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\nprint_error_messages: bool - whether to print error messages for inconsistent keys or invalid JSON lines\r\n\r\nreturns true if all json objects have the same keys\r\n\r\nis_consistent = ju.check_jsonl_is_consistent(\"data.jsonl\")\r\nprint(\"Consistent:\", is_consistent)\r\n\r\n# Get keys\r\n\r\nGoes through a JSONL file and finds all unique keys in the JSON objects\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\nprint_error_message: bool - whether to print error messages for invalid JSON lines\r\n\r\nreturns: set - set of all unique keys found in the JSON objects\r\n\r\nUsage:\r\nkeys = ju.find_all_jsonl_keys(\"data.jsonl\")\r\nprint(\"Keys found:\", keys)\r\n\r\n# Get json objects through yield\r\nReads a JSONL file and yields JSON objects\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\n\r\nYields: dict - the next JSON object in the file\r\n\r\nget_json_objects_from_jsonl_yield(\"data.jsonl\")\r\n\r\n#Get list of json objects\r\nReads a JSONL file and returns a list of JSON objects\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\n\r\nReturns: list - a list of JSON objects in the file\r\n\r\nUsage:\r\njson_list = get_list_of_json_objects_from_jsonl(\"data.jsonl\")\r\n\r\n# Get average numeric value of jsonl\r\nReads a JSONL file and returns the average value of a specified key\r\nParameters:\r\ninput_file: str - the path to the input JSONL file\r\nkey: str - the key to calculate the average value for, needs to be numeric\r\nprint_error_messages: bool - whether to print error messages for invalid JSON lines or 
non-numeric values\r\n\r\nReturns: float - the average value of the specified key\r\n\r\nUsage:\r\naverage_val = get_average_value_of_jsonl_value(\"data.jsonl\", \"num_key\")\r\n\r\n# Get number of valid json objects in jsonl\r\nReads a jsonl object and returns number of valid json and total lines in the file\r\nparameters:\r\ninput_file - file to read\r\n \r\nreturns int, int - returns number of objects and then total lines\r\n\r\nUsage:\r\nget_number_of_json_objects_in_jsonl(\"data.jsonl\")\r\n\r\n# Convert JSON \u2192 JSONL\r\nTakes a json objects and converts it to a jsonl file by taking an array of keys from the json object\r\nand then writing each object in the data array to a new line in the jsonl file with all other keys being consistent\r\nParameters:\r\ninput_file: str - the path to the input JSON file\r\noutput_file: str - the path to the output JSONL file\r\ndata_key_array: list - the keys to extract from each object in the data array and write to the jsonl file\r\n \r\nReturns: bool - True if the conversion was successful, False otherwise\r\n\r\nUsage:\r\nju.convert_json_to_jsonl(\"data.json\", \"data.jsonl\", data_key=\"data\")\r\n\r\n\ud83d\udcc2 Project Structure\r\njsonlutils/\r\n\u2502\r\n\u251c\u2500\u2500 src/jsonlutils/ # Core library code\r\n\u2502 \u251c\u2500\u2500 __init__.py\r\n\u2502 \u2514\u2500\u2500 converter.py\r\n\u2502\r\n\u251c\u2500\u2500 tests/ # Unit tests\r\n\u2502 \u2514\u2500\u2500 test_converter.py\r\n\u2502\r\n\u251c\u2500\u2500 pyproject.toml # Build metadata\r\n\u251c\u2500\u2500 README.md # This file\r\n\u251c\u2500\u2500 LICENSE # MIT license\r\n\u2514\u2500\u2500 .gitignore\r\n\r\n\ud83d\udcc4 License\r\n\r\nMIT License. See LICENSE for details.\r\n\r\nContributing\r\n\r\nFork the repo\r\nCreate a new branch (feature/my-feature)\r\nCommit your changes\r\nPush the branch and open a Pull Request\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Utilities for working with JSONL and JSON formats",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/dja322/JsonlConverterUtils"
},
"split_keywords": [
"json",
" jsonl",
" converter",
" utilities"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "3ae35605bed01099c4d75bb17e4fc3058ae5466154c8321eab1d270797a46766",
"md5": "42432ebc7b2adb496953f539c483a5a2",
"sha256": "6bc7c8e7f8982b651f70fb2a894115d2ccff0a741f4d4af7f9d9ec861eb98996"
},
"downloads": -1,
"filename": "jsonlutils-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "42432ebc7b2adb496953f539c483a5a2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 7476,
"upload_time": "2025-08-25T20:53:56",
"upload_time_iso_8601": "2025-08-25T20:53:56.835410Z",
"url": "https://files.pythonhosted.org/packages/3a/e3/5605bed01099c4d75bb17e4fc3058ae5466154c8321eab1d270797a46766/jsonlutils-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "65d46b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3",
"md5": "665523dea349f554a66a93bbf9d2b68b",
"sha256": "2cc01cddd5b5c98d7295baa63265d8f709a78c60f1378924de9b4e22b01e7dcb"
},
"downloads": -1,
"filename": "jsonlutils-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "665523dea349f554a66a93bbf9d2b68b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 7932,
"upload_time": "2025-08-25T20:53:57",
"upload_time_iso_8601": "2025-08-25T20:53:57.682484Z",
"url": "https://files.pythonhosted.org/packages/65/d4/6b242fe3b9090fcdd832f961bb40d5aec96999c82a51e27b8aa1592591e3/jsonlutils-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-25 20:53:57",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dja322",
"github_project": "JsonlConverterUtils",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "jsonlutils"
}