pycobol2csv

Name	pycobol2csv
Version	1.0.6
home_page	https://github.com/jasonli-lijie/pycobol2csv
Summary	A Python library to convert COBOL ebcdic file to CSV format based on copybook
upload_time	2024-06-25 09:42:54
maintainer	None
docs_url	None
author	Jason Li
requires_python	None
license	MIT
keywords	cobol ebcdic csv
requirements	No requirements were recorded.

# pycobol2csv
pycobol2csv is a Python library that converts COBOL EBCDIC files to CSV format based on a copybook. It is built to handle advanced copybook features such as *OCCURS x TIMES*, *BINARY*, and *COMP*.

The CSV output is RDBMS-friendly: every header is ready to be used as a database column name.
CSV conversion is controlled by a config file, *csv_config.json*.
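
To illustrate what a *BINARY* / *COMP* field involves, here is a minimal sketch using only the Python standard library. It assumes the common mainframe layout in which such fields are stored as big-endian two's-complement integers; the helper name `decode_binary_field` is hypothetical and is not part of pycobol2csv, whose internal decoding may differ.

```python
def decode_binary_field(raw, signed=True):
    """Interpret raw bytes as a big-endian two's-complement integer.

    Hypothetical helper for illustration only; not the pycobol2csv API.
    """
    return int.from_bytes(raw, byteorder="big", signed=signed)


# A 2-byte signed field (for example, PIC S9(4) COMP) holding 123 and -123.
positive = decode_binary_field(b"\x00\x7b")
negative = decode_binary_field(b"\xff\x85")

# The same bytes read as unsigned give a different value.
unsigned = decode_binary_field(b"\xff\x85", signed=False)
```

Packed-decimal fields (*COMP-3*) use a different layout and are not covered by this sketch.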

- [x] Update in version 1.0.6

Microsoft recently *upgraded* Spark in Synapse from version 3.2 to 3.4, which also *upgraded* the bundled Python from 3.8 to 3.10, a version with known issues in the csv writer.

Users on Python 3.10 should **upgrade to pycobol2csv 1.0.6 or later as soon as possible**; otherwise they may hit the error *_csv.Error: need to escape, but no escapechar set*.

Other Python versions (such as 3.8 and 3.11) have shown no problems so far.
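
For readers unfamiliar with this error, the sketch below reproduces the same message with the standard `csv` module alone. It is a generic illustration of when the writer needs an escape character, not the exact code path inside pycobol2csv or the Python 3.10-specific behaviour; the sample row and the `write_row` helper are made up for the example.

```python
import csv
import io

row = ["ACME, INC", "42"]  # the first field contains the delimiter


def write_row(**writer_options):
    """Write `row` to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **writer_options).writerow(row)
    return buffer.getvalue()


# With quoting disabled and no escape character, the writer has no safe way
# to emit the embedded comma, so it raises csv.Error.
try:
    write_row(quoting=csv.QUOTE_NONE)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# Supplying an escape character lets the same row be written.
escaped = write_row(quoting=csv.QUOTE_NONE, escapechar="\\")
```

If you write CSV files yourself alongside this library, setting `escapechar` (or leaving quoting enabled) avoids this class of failure.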

- [x] Update in version 1.0.5

Added more enhancements to handle outdated REDEFINES and PIC syntax encountered with a new client.



#### Install the Python module:

`pip install pycobol2csv`

#### To use the module:

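```
from pycobol2csv import convert_cobol_file, decode_copybook_file

# Parse the copybook; returns the row length and the decoded copybook structure.
row_length, cobol_struc = decode_copybook_file(copybook_file)

# Convert the EBCDIC data file to CSV according to the copybook.
convert_cobol_file(copybook_file, data_file, output_file, config_file, codepage, debug=False)
```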

- copybook_file: copybook filename
- data_file: data filename
- output_file: output CSV filename
- config_file: CSV configuration filename; refer to csv_config.json
- codepage: code page of the EBCDIC data; refer to https://docs.python.org/3.7/library/codecs.html#standard-encodings for the list of supported encodings
- debug: set to True for more debug information; the default is False
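
As a rough illustration of what the `codepage` argument selects, the sketch below decodes a few EBCDIC bytes with Python's built-in codecs. `cp037` is one of the EBCDIC code pages in the standard encodings list; which one your data actually uses is something you need to confirm with the data owner. The sample bytes and the `is_known_codec` helper are assumptions made for the example and are not part of pycobol2csv.

```python
import codecs


def is_known_codec(name):
    """Return True if Python recognises `name` as an encoding (hypothetical helper)."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Five bytes as they might appear in an EBCDIC data file (illustrative value).
raw = b"\xc8\xc5\xd3\xd3\xd6"

# Decoded with an EBCDIC code page, the bytes are readable text.
text = raw.decode("cp037")

# Decoded with an ASCII-compatible encoding, the same bytes are not.
misread = raw.decode("latin-1")

codepage_ok = is_known_codec("cp037")
typo_rejected = not is_known_codec("cp-not-a-real-codepage")
```

Checking the code page name up front gives a clearer failure than a decoding error partway through a large file.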

#### Test

Two sets of test data have been created from scratch. Each set includes a copybook and an EBCDIC data file.

To run a test:

```
python convert_cobol_test_main.py --copybook [COPYBOOK_FILE] --data [DATA_FILE] --output [CSV_FILE]

```

#### Known issues and limitations

- Check the resources available in your runtime environment, and make sure the COBOL file is not so large that it exceeds them or causes performance problems.

To handle large COBOL files, split them into smaller chunks and process the chunks in parallel. Refer to the [Medium post](https://medium.com/@jasonli.lijie/process-large-cobol-files-efficiently-with-pycobol2csv-pycobol2parquet-f023533607e4) for details.
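
The splitting step could look roughly like the sketch below. It is not taken from the Medium post or from pycobol2csv; it assumes fixed-length records with no record descriptor words, and that `row_length` is the record size in bytes (for example, the value returned by `decode_copybook_file`). The function name `split_fixed_length_file` and the chunk file naming are hypothetical.

```python
import os
import tempfile


def split_fixed_length_file(data_file, row_length, rows_per_chunk, out_dir):
    """Split data_file into chunk files that each hold whole records only."""
    chunk_paths = []
    chunk_bytes = row_length * rows_per_chunk  # always a multiple of row_length
    with open(data_file, "rb") as source:
        index = 0
        while True:
            block = source.read(chunk_bytes)
            if not block:
                break
            chunk_path = os.path.join(out_dir, f"chunk_{index:05d}.dat")
            with open(chunk_path, "wb") as target:
                target.write(block)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths


# Demo with synthetic data: 10 records of 8 bytes each, 4 records per chunk.
with tempfile.TemporaryDirectory() as work_dir:
    sample = os.path.join(work_dir, "sample.dat")
    with open(sample, "wb") as handle:
        handle.write(bytes(range(80)))
    chunks = split_fixed_length_file(sample, row_length=8, rows_per_chunk=4, out_dir=work_dir)
    chunk_sizes = [os.path.getsize(path) for path in chunks]
```

Each chunk can then be passed to `convert_cobol_file` with the same copybook, for instance from separate worker processes, and the resulting CSV files combined afterwards.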


<!-- Repo: https://github.com/jasonli-lijie/pycobol2csv -->

Raw data

{
    "_id": null,
    "home_page": "https://github.com/jasonli-lijie/pycobol2csv",
    "name": "pycobol2csv",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "COBOL, EBCDIC, CSV",
    "author": "Jason Li",
    "author_email": "niomobileapp@gmail.com",
    "download_url": "https://github.com/user/reponame/archive/v_01.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python library to convert COBOL ebcdic file to CSV format based on copybook",
    "version": "1.0.6",
    "project_urls": {
        "Bug Tracker": "https://github.com/jasonli-lijie/pycobol2csv/issues",
        "Download": "https://github.com/user/reponame/archive/v_01.tar.gz",
        "Homepage": "https://github.com/jasonli-lijie/pycobol2csv"
    },
    "split_keywords": [
        "cobol",
        " ebcdic",
        " csv"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "df8664fbaf8ab886e2e0f8eaccfb38f7f36926fcab9c1720d6236496e44e5f7b",
                "md5": "98c1e043a92189e5d1d3a87d0fb05014",
                "sha256": "b8d336c752245dbd6fc85252ba52cd04ed2af83cc61a04a7952722521211be60"
            },
            "downloads": -1,
            "filename": "pycobol2csv-1.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "98c1e043a92189e5d1d3a87d0fb05014",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9297,
            "upload_time": "2024-06-25T09:42:54",
            "upload_time_iso_8601": "2024-06-25T09:42:54.106320Z",
            "url": "https://files.pythonhosted.org/packages/df/86/64fbaf8ab886e2e0f8eaccfb38f7f36926fcab9c1720d6236496e44e5f7b/pycobol2csv-1.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-25 09:42:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jasonli-lijie",
    "github_project": "pycobol2csv",
    "github_not_found": true,
    "lcname": "pycobol2csv"
}
