# Codebooks
Automatically generate codebooks from dataframes. Includes methods to:
* Infer variable type (as unique key, indicator, categorical, or continuous).
* Summarize values with histograms and KDEs.
* Generate a self-contained HTML report (may be extended to PDF or other formats in the future).
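
To make the four variable types listed above concrete, here is a minimal sketch of how such a classification could work on a plain list of values. It is a hypothetical heuristic for illustration only, not the logic used by the codebooks package; the function name `infer_variable_type` and the `max_categories` threshold are assumptions.

```python
def infer_variable_type(values, max_categories=20):
    """Classify a column as unique key, indicator, categorical, or continuous.

    Hypothetical heuristic for illustration; not the codebooks implementation.
    """
    # Ignore missing values when counting distinct entries.
    present = [v for v in values if v is not None]
    distinct = set(present)
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    has_float = any(isinstance(v, float) for v in present)

    # At most two distinct values: treat as a yes/no style indicator.
    if len(distinct) <= 2:
        return "indicator"
    # Every value distinct and no floats: likely an identifier column.
    if len(distinct) == len(present) and not has_float:
        return "unique key"
    # Numeric with many distinct values (assumed threshold): continuous.
    if all_numeric and len(distinct) > max_categories:
        return "continuous"
    return "categorical"
```

For example, `infer_variable_type([0, 1, 1, 0, None])` falls into the first branch because only two distinct non-missing values are present. A real implementation would likely also consider the column's dtype and the ratio of distinct values to rows.
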
Usage:

    codebooks -o output.html input.csv

## Example
![Screenshot of codebook for test dataset](https://raw.githubusercontent.com/mhowison/codebooks/dev/doc/screenshot.png)
## Adding variable descriptions
You can specify a csv file that maps variable names to descriptions using:

    codebooks --desc descriptions.csv -o output.html input.csv

The csv file is expected to have two columns (variable, description).
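
As a minimal sketch, a descriptions file with this two-column layout could be produced with Python's standard `csv` module. The variable names and descriptions below are made up for illustration, and the presence of a header row is an assumption; check what the tool expects for your version.

```python
import csv

# Hypothetical variable names and descriptions, for illustration only.
descriptions = {
    "id": "Unique record identifier",
    "age": "Age in years at enrollment",
    "enrolled": "1 if currently enrolled, 0 otherwise",
}


def write_descriptions(path, mapping):
    """Write a two-column (variable, description) CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Assumption: the file starts with a header row naming the two columns.
        writer.writerow(["variable", "description"])
        writer.writerows(mapping.items())


write_descriptions("descriptions.csv", descriptions)
```

The resulting `descriptions.csv` can then be passed to the `--desc` option shown above.
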
## License
3-Clause BSD (see LICENSE)
## Tests
The `test/` subdirectory contains a script to generate a synthetic data set, an integration test for the codebooks package, and a benchmark script used to test performance optimizations. You can run these with:

    cd test
    python dataset.py
    codebooks --desc desc.csv dataset.csv
    codebooks --desc desc.csv --parquet dataset.parquet
    python benchmark.py

## Authors
Mark Howison
[http://mark.howison.org](http://mark.howison.org)