# clean your CSVs!
This command line tool cleans CSV files by:
1. converting the encoding to UTF-8
2. detecting the delimiter and safely converting it to a comma
3. casting all values to JSON types, i.e. integers, floats, booleans, strings or null.
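
To make the three steps concrete, here is a minimal standard-library sketch of the same idea. It is not csv-bleach's actual implementation: `cast_cell` and `clean` are hypothetical names, the source encoding is assumed to be supplied by the caller rather than detected, and treating empty cells as null is an assumption.

```python
import csv
import io
import json
import math


def cast_cell(cell):
    """Hypothetical helper: map one CSV cell to a JSON-compatible value."""
    text = cell.strip()
    if text == "":
        return None  # assumption: empty cells become null
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return int(text)
    except ValueError:
        pass
    try:
        value = float(text)
    except ValueError:
        return cell
    # NaN and infinity are not valid JSON numbers, so keep them as strings.
    return value if math.isfinite(value) else cell


def clean(raw, encoding="utf-8"):
    """Hypothetical sketch: decode, sniff the delimiter, emit JSON-valued rows."""
    text = raw.decode(encoding)  # step 1: work on decoded text
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t|")  # step 2
    reader = csv.reader(io.StringIO(text), dialect)
    # step 3: every cell is serialised as a JSON value, joined by commas
    return [", ".join(json.dumps(cast_cell(c)) for c in row) for row in reader]


cleaned = clean(b"name;age;active\nann;31;true\nbob;;false\n")
```

Each string in `cleaned` is a comma-separated run of JSON values, so wrapping it in square brackets gives a valid JSON array.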
* install with `pip install csv-bleach`
* run with `python -m csv_bleach my-data.csv`
The only option is the output file name. By default it is your original file name with the `.scsv` extension.
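
As an illustration of the default naming rule, the sketch below derives the output name from the input name. `default_output_name` is a hypothetical helper, not part of csv-bleach, and it assumes the `.scsv` suffix replaces the existing one (so `my-data.csv` becomes `my-data.scsv`).

```python
from pathlib import Path


def default_output_name(input_name):
    # Hypothetical helper: swap the existing suffix for ".scsv".
    return str(Path(input_name).with_suffix(".scsv"))


output_name = default_output_name("my-data.csv")
```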
You will now be able to parse your CSV safely with a simple script like:
```python
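import json


def parse_row(text):
    # Each cleaned line is a comma-separated run of JSON values,
    # so wrapping it in brackets yields a valid JSON array.
    return json.loads(f"[{text}]")


def parse_file(file):
    rows = map(parse_row, file)
    header = next(rows)  # the first row holds the column names
    for row in rows:
        yield dict(zip(header, row))


with open("my-data.scsv", encoding="utf-8") as f:
    for item in parse_file(f):
        print(item)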
```
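
To see what this gives you, the sketch below runs the same parser over a small in-memory sample. The sample contents are hypothetical and only stand in for what a cleaned file might look like.

```python
import io
import json


def parse_row(text):
    return json.loads(f"[{text}]")


def parse_file(file):
    rows = map(parse_row, file)
    header = next(rows)
    for row in rows:
        yield dict(zip(header, row))


# Hypothetical contents of a cleaned file: one header row, two data rows.
sample = io.StringIO(
    '"name", "age", "active"\n'
    '"ann", 31, true\n'
    '"bob", null, false\n'
)
items = list(parse_file(sample))
```

Every value arrives already typed (`int`, `bool`, `None` or `str`), so no per-column conversion is needed after parsing.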
Raw data
{
"_id": null,
"home_page": "https://github.com/gecBurton/csv-bleach",
"name": "csv_bleach",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9.0",
"maintainer_email": "",
"keywords": "csv",
"author": "George Burton",
"author_email": "g.e.c.burton@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/38/cb/d31492cc98f56492798e390ab003c68457156064e33a2ff916a470a597b5/csv_bleach-2.4.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "clean CSVs",
"version": "2.4.0",
"project_urls": {
"Homepage": "https://github.com/gecBurton/csv-bleach",
"Repository": "https://github.com/gecBurton/csv-bleach"
},
"split_keywords": [
"csv"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4daebeaf3330912c38372ffacd55ed3264147fbd79dbd54cfc1caaa9f1f63b40",
"md5": "31cc1543eed34c9e0f2337adf2958397",
"sha256": "ff2307c0d14ab9da793ec2f343ca94685ed810f874a815d8b94363426f859296"
},
"downloads": -1,
"filename": "csv_bleach-2.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "31cc1543eed34c9e0f2337adf2958397",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9.0",
"size": 6365,
"upload_time": "2024-01-02T10:16:14",
"upload_time_iso_8601": "2024-01-02T10:16:14.703183Z",
"url": "https://files.pythonhosted.org/packages/4d/ae/beaf3330912c38372ffacd55ed3264147fbd79dbd54cfc1caaa9f1f63b40/csv_bleach-2.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "38cbd31492cc98f56492798e390ab003c68457156064e33a2ff916a470a597b5",
"md5": "1fb09850ad24716f9637ba8b9d5b7090",
"sha256": "671a2be98925ea08bb3de0656283a8b5c33524fc7a28b6088228d29fcc72ca08"
},
"downloads": -1,
"filename": "csv_bleach-2.4.0.tar.gz",
"has_sig": false,
"md5_digest": "1fb09850ad24716f9637ba8b9d5b7090",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9.0",
"size": 4325,
"upload_time": "2024-01-02T10:16:16",
"upload_time_iso_8601": "2024-01-02T10:16:16.177646Z",
"url": "https://files.pythonhosted.org/packages/38/cb/d31492cc98f56492798e390ab003c68457156064e33a2ff916a470a597b5/csv_bleach-2.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-02 10:16:16",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "gecBurton",
"github_project": "csv-bleach",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "csv_bleach"
}