# Mixed Width
## About
This project makes it easy to read and write fixed-width files whose columns have differing widths. For example:
```
FOO   BAR   BAZ
Hello World SomeExtraLongString
```
In this case the columns have differing widths, so we can't specify a single width for all of them. That alone might not be much of an issue: you could hard-code the differing widths and parse with those. With things like console output, however, we can't rely on a consistent width for cells, which is where Mixed Width comes in. It detects the start of each cell by looking for locations where a space precedes a letter and which line up across all lines of the file. Using this method we are able to parse files with unknown widths and convert them into either a list of lists or a list of dictionaries where the keys are the header values.
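The detection rule described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not multiwidth's actual source: `find_column_starts` and `split_row` are hypothetical helper names, and the sketch assumes a cell starts wherever every line has a padding character immediately followed by a non-padding character.

```python
def find_column_starts(lines, padding=" "):
    """Return the indices at which every line starts a new cell (hypothetical helper)."""
    width = max(len(line) for line in lines)
    # Pad short lines so every line can be indexed at every position.
    padded = [line.ljust(width, padding) for line in lines]
    starts = [0]  # assumption: the first cell always starts at column 0
    for i in range(1, width):
        # A boundary must hold on every line: padding, then a non-padding character.
        if all(row[i - 1] == padding and row[i] != padding for row in padded):
            starts.append(i)
    return starts


def split_row(line, starts, padding=" "):
    """Slice one line at the detected starts and strip the padding (hypothetical helper)."""
    ends = starts[1:] + [None]
    return [line[s:e].strip(padding) for s, e in zip(starts, ends)]


table = [
    "FOO   BAR   BAZ",
    "Hello World SomeExtraLongString",
]
starts = find_column_starts(table)
rows = [split_row(line, starts) for line in table]
print(starts)  # [0, 6, 12]
print(rows[1])  # ['Hello', 'World', 'SomeExtraLongString']
```

Because a boundary has to line up on every line, a space inside a single cell does not split that cell unless the same position happens to look like a boundary on all other lines too.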
## Examples
For the most part, `multiwidth` is used the same way as the built-in `json` module. Parsing some output takes the form of:
```python
import multiwidth
string_to_parse = """FOO   BAR   BAZ
Hello World SomeExtraLongString"""
data = multiwidth.loads(string_to_parse)
print(data)
# output:
# [['Hello', 'World', 'SomeExtraLongString']]
```
If preserving the headers is important, pass `output_json=True` to `loads`:
```python
import multiwidth
string_to_parse = """FOO   BAR   BAZ
Hello World SomeExtraLongString"""
data = multiwidth.loads(string_to_parse, output_json=True)
print(data)
# output:
# [{'FOO': 'Hello', 'BAR': 'World', 'BAZ': 'SomeExtraLongString'}]
```
Each line will then be a dictionary that maps the header keys to their corresponding values.
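Conceptually, the dictionary form pairs each header with the cell in the same position. The sketch below shows that pairing with plain Python; `rows_to_dicts` is a hypothetical name used for illustration and is not part of multiwidth.

```python
def rows_to_dicts(headers, rows):
    """Pair each row's cells with the header names, in order (hypothetical helper)."""
    return [dict(zip(headers, row)) for row in rows]


headers = ["FOO", "BAR", "BAZ"]
rows = [["Hello", "World", "SomeExtraLongString"]]
records = rows_to_dicts(headers, rows)
print(records)
# [{'FOO': 'Hello', 'BAR': 'World', 'BAZ': 'SomeExtraLongString'}]
```

Without a header row there are no keys to pair the cells with, which is consistent with `output_json` requiring `header` to be `True`.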
In addition, if the content is stored in a file, `multiwidth.load(<file_object>)` can be used.
Data can also be written back out with multiwidth:
```python
import multiwidth
headers = ['FOO', 'BAR', 'BAZ']
data = [['Hello', 'World', 'SomeExtraLongString']]
print(multiwidth.dumps(data, headers=headers))
# Output:
# FOO   BAR   BAZ
# Hello World SomeExtraLongString
```
You can also control the spacing between columns with `cell_suffix='<your desired padding between columns>'`. For example:
```python
import multiwidth
headers = ['FOO', 'BAR', 'BAZ']
data = [['Hello', 'World', 'SomeExtraLongString']]
# Three spaces between columns instead of the default
print(multiwidth.dumps(data, headers=headers, cell_suffix='   '))
# Output:
# FOO     BAR     BAZ
# Hello   World   SomeExtraLongString
```
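To make the roles of `padding` and `cell_suffix` concrete, here is a minimal sketch of how such a table could be laid out: every cell is padded to the width of its column's longest value, then the suffix is appended. `format_table` is a hypothetical function written for illustration; it assumes `padding` is a single character and is not multiwidth's implementation.

```python
def format_table(rows, headers, padding=" ", cell_suffix=" "):
    """Pad each cell to its column's widest value, then append cell_suffix."""
    table = [headers] + rows
    # Width of each column is the length of its longest cell, header included.
    widths = [max(len(str(cell)) for cell in column) for column in zip(*table)]
    lines = []
    for row in table:
        cells = [
            str(cell).ljust(width, padding) + cell_suffix
            for cell, width in zip(row, widths)
        ]
        lines.append("".join(cells))
    return "\n".join(lines)


headers = ["FOO", "BAR", "BAZ"]
data = [["Hello", "World", "SomeExtraLongString"]]
# A visible suffix makes the column boundaries easy to see.
text = format_table(data, headers, cell_suffix=" | ")
print(text)
```

In this sketch the suffix is also appended after the last cell, so each line ends with the suffix.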
You can also dump JSON-style data (a list of dictionaries) by omitting the `headers` argument:
```python
import multiwidth
data = [{'FOO': 'Hello', 'BAR': 'World', 'BAZ': 'SomeExtraLongString'}]
print(multiwidth.dumps(data))
# Output:
# FOO   BAR   BAZ
# Hello World SomeExtraLongString
```
Finally, you can dump to a file with `multiwidth.dump(data, <your file object>)`.
## Usage
**load**
```python
"""Parse data from a file object
Args:
    file_object (io.TextIOWrapper): File object to read from
    padding (str, optional): Which character takes up the space to create the fixed
        width. Defaults to " ".
    header (bool, optional): Does the file contain a header. Defaults to True.
    output_json (bool, optional): Should a list of dictionaries be returned instead
        of a list of lists. Defaults to False. Requires that 'header' be set to
        True.
Returns:
    Union[List[List],List[Dict]]: Either a list of lists or a list of dictionaries that
        represent the extracted data
"""
```
**loads**
```python
"""Takes a string of a fixed-width file and breaks it apart into the data contained.
Args:
    contents (str): String fixed-width contents.
    padding (str, optional): Which character takes up the space to create the fixed
        width. Defaults to " ".
    header (bool, optional): Does the file contain a header. Defaults to True.
    output_json (bool, optional): Should a list of dictionaries be returned instead
        of a list of lists. Defaults to False. Requires that 'header' be set to
        True.
Raises:
    Exception: 'output_json' is True but 'header' is False.
Returns:
    List[List] | List[Dict]: Either a list of lists or a list of dictionaries that
        represent the extracted data
"""
```
**dump**
```python
"""Dumps a formatted table to a file
Args:
    data (Union[List[List],List[Dict]]): Data to dump to a file. If using JSON data
        then omit the `headers` argument
    file_object (io.TextIOWrapper): File object to write to
    headers (List[str], optional): Headers to use with list data. Defaults to None.
    padding (str, optional): Character to use as padding between values. Defaults to
        ' '.
    cell_suffix (str, optional): String to use as the padding between columns.
        Defaults to ' '.
"""
```
**dumps**
```python
"""Dumps a formatted table to a string
Args:
    data (Union[List[List],List[Dict]]): List or dictionary data to format
    headers (List[str], optional): Headers to use with list data. Defaults to None.
    padding (str, optional): Character to use as padding between values. Defaults to
        ' '.
    cell_suffix (str, optional): String to use as the padding between columns.
        Defaults to ' '.
Returns:
    str: Formatted table of input data
"""
```
## License
Multiwidth is under the [MIT license](https://opensource.org/licenses/MIT).
## Contact
If you have any questions or concerns, please reach out to me (John Carter) at [jfcarter2358@gmail.com](mailto:jfcarter2358@gmail.com).
Raw data
{
"_id": null,
"home_page": "https://github.com/jfcarter2358/multiwidth",
"name": "multiwidth",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9,<4.0",
"maintainer_email": "",
"keywords": "data,table,multiwidth",
"author": "John Carter",
"author_email": "jfcarter2358@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/85/97/54f5c24c9d9c2fed21ec4d6b2396a3a72c504a2e4ec10fe204930dcedb33/multiwidth-1.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A package for reading and writing mixed width tables",
"version": "1.0.1",
"split_keywords": [
"data",
"table",
"multiwidth"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6485e9fa9fbf2ea37c9b0839ca71e09505241d3dbc173defea2fc1f0dca7428b",
"md5": "72987990bad6471ff4d72b5aafab5982",
"sha256": "842d13fac268abf7ab90c5daad45d07d4ef6c5c86a17f88798b3c2a5fa6124b1"
},
"downloads": -1,
"filename": "multiwidth-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "72987990bad6471ff4d72b5aafab5982",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9,<4.0",
"size": 6532,
"upload_time": "2023-01-20T17:16:39",
"upload_time_iso_8601": "2023-01-20T17:16:39.048008Z",
"url": "https://files.pythonhosted.org/packages/64/85/e9fa9fbf2ea37c9b0839ca71e09505241d3dbc173defea2fc1f0dca7428b/multiwidth-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "859754f5c24c9d9c2fed21ec4d6b2396a3a72c504a2e4ec10fe204930dcedb33",
"md5": "e21cf5b288f8c1d11cf5c25cf5b74490",
"sha256": "f782ea465712dd9f2dcd576ade3bf7be905b3becd648ba9c05d33191d04dac89"
},
"downloads": -1,
"filename": "multiwidth-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "e21cf5b288f8c1d11cf5c25cf5b74490",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9,<4.0",
"size": 6221,
"upload_time": "2023-01-20T17:16:40",
"upload_time_iso_8601": "2023-01-20T17:16:40.574910Z",
"url": "https://files.pythonhosted.org/packages/85/97/54f5c24c9d9c2fed21ec4d6b2396a3a72c504a2e4ec10fe204930dcedb33/multiwidth-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-20 17:16:40",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "jfcarter2358",
"github_project": "multiwidth",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "multiwidth"
}