# csv-jsonl
A convenient module for writing a list of dictionaries or list of lists to a [`.jsonl`-formatted](https://jsonlines.org/) text file, suitable for ingestion by [BigQuery](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-json) and other services.
`csv-jsonl` is built on top of Python's built-in `csv` module. You can pass a `fieldnames` list so that rows whose keys do not match are rejected (see Usage). Beyond that, no schema handling is offered.
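For readers new to the format, here is a minimal sketch of what a `.jsonl` file contains: one JSON value per line. It uses only the standard library; `to_jsonl` is a hypothetical helper written for this illustration and is not part of `csv-jsonl`, nor a description of how the package is implemented.

```python
import json

rows = [{"foo": "bar", "bat": 1}, {"foo": "bar", "bat": 2}]


def to_jsonl(records):
    """Serialize each record as one JSON document per line (hypothetical helper)."""
    return "".join(json.dumps(record) + "\n" for record in records)


text = to_jsonl(rows)
print(text, end="")
# {"foo": "bar", "bat": 1}
# {"foo": "bar", "bat": 2}
```

Each line can be parsed on its own, which is why services that load data in bulk tend to prefer this layout over a single large JSON array.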
# Why Not Just Use csv Files?
If you are here asking that question, I'm guessing you have not spent exciting times attempting to clean up poorly-formatted `csv` files (I'm looking at you, Excel).
# Other Data Formats
Rows can be any object that implements `__getitem__` (dictionaries, lists, tuples, and so on), as well as dataclasses. See `test` for the full set of examples.
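As a rough picture of what a dataclass row looks like once serialized, the sketch below converts a dataclass instance to a dictionary with the standard library's `dataclasses.asdict` and dumps it as one JSON line. The `Row` class and `row_to_json_line` helper are hypothetical, and this is only an assumption about the general idea, not the package's actual code path.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Row:
    """Hypothetical record type used only for this illustration."""
    foo: str
    bat: int


def row_to_json_line(row):
    """Turn a dataclass instance into a single JSON line (without newline)."""
    return json.dumps(asdict(row))


line = row_to_json_line(Row(foo="bar", bat=1))
print(line)
# {"foo": "bar", "bat": 1}
```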
# Installation
`pip install csv-jsonl`
# Usage
## List of Dictionaries
```python
>>> from csv_jsonl import JSONLinesDictWriter
>>> l = [{"foo": "bar", "bat": 1}, {"foo": "bar", "bat": 2}]
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerows(l)  # write every dictionary in the list
...
>>> d = {"foo": "bar", "bat": 1}
>>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(d)  # write a single dictionary
...
>>> from collections import OrderedDict
>>> od = OrderedDict([('foo', 'bar'), ('bat', 1)])
>>> with open("qux.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh)
...     writer.writerow(od)
...
>>> fieldnames = ["foo", "bar"]  # row keys are ["foo", "bat"], so this fails
>>> with open("baz.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesDictWriter(_fh, fieldnames=fieldnames)
...     writer.writerows(l)
...
Traceback (most recent call last):
    ...
ValueError: ...
```
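The `ValueError` above matches how the standard library's `csv.DictWriter` behaves when a row contains keys that are not in `fieldnames`. The sketch below demonstrates that standard-library behavior directly. That `csv-jsonl` inherits the check from `csv.DictWriter` is an assumption based on the package being built on `csv`; the exact error message it produces may differ.

```python
import csv
import io

rows = [{"foo": "bar", "bat": 1}]

# fieldnames deliberately omits "bat" and lists an unused "bar" instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["foo", "bar"])

try:
    writer.writerows(rows)
except ValueError as exc:
    # csv.DictWriter rejects keys missing from fieldnames by default
    # (extrasaction="raise").
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

Passing `fieldnames` is therefore a cheap way to catch a misspelled or unexpected key before the file is loaded elsewhere.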
## List of Lists
```python
>>> from csv_jsonl import JSONLinesListWriter
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh)
...     writer.writerows(l)  # write every row produced by zip
...
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("bar.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh)
...     writer.writerow(next(l))  # write only the first row
...
>>> fieldnames = ["baz", "qux", "quux"]
>>> l = zip(["foo", "bar", "bat"], range(3), range(3))
>>> with open("foo.jsonl", "w", encoding="utf-8") as _fh:
...     writer = JSONLinesListWriter(_fh, fieldnames=fieldnames)
...     writer.writeheader()
...     writer.writerows(l)
...
```
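To check what ended up in a `.jsonl` file, it can be read back with the standard library alone. The sketch below is a minimal example of that; `read_jsonl` is a hypothetical helper, and the sample file is written by hand here so the snippet runs without `csv-jsonl` installed.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Parse a .jsonl file into a list, one JSON value per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.jsonl")
    # Stand-in for a file produced by one of the writers above.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('{"foo": "bar", "bat": 1}\n{"foo": "bar", "bat": 2}\n')
    records = read_jsonl(path)

print(records)
# [{'foo': 'bar', 'bat': 1}, {'foo': 'bar', 'bat': 2}]
```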
[![pipeline status](https://gitlab.com/doug.shawhan/csv-jsonl/badges/main/pipeline.svg)](https://gitlab.com/doug.shawhan/csv-jsonl/-/commits/main)
[![Latest Release](https://gitlab.com/doug.shawhan/csv-jsonl/-/badges/release.svg)](https://gitlab.com/doug.shawhan/csv-jsonl/-/releases)
[![Downloads](https://pepy.tech/badge/csv-jsonl/month)](https://pepy.tech/project/csv-jsonl)