# markpickle
Lossy serialization of Markdown to simple Python data types and back. It creates predictable Markdown from a Python object, but it can't turn every Markdown file into a sensible Python object (for that, use a Markdown library that produces an AST). I created this because I wanted a way to turn JSON into Markdown.
For example, this
```markdown
- 1
- 2
```
becomes the Python list `[1, 2]`.
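The mapping can be sketched with the standard library alone. This is an illustration of the idea, not markpickle's implementation: `list_to_bullets` and `bullets_to_list` are hypothetical helper names, and the integer inference is an assumption made for the example.

```python
def list_to_bullets(values):
    """Render a flat list of scalars as a Markdown bullet list."""
    return "\n".join(f"- {value}" for value in values)


def bullets_to_list(markdown):
    """Parse a flat Markdown bullet list back into a Python list.

    Items that look like integers become int; everything else
    stays a string.
    """
    result = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue  # ignore anything that is not a bullet item
        text = line[2:].strip()
        try:
            result.append(int(text))
        except ValueError:
            result.append(text)
    return result


print(bullets_to_list("- 1\n- 2"))  # [1, 2]
print(list_to_bullets([1, 2]))
```

Nested lists and other Markdown constructs are deliberately ignored here.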
Almost all Markdown libraries use Markdown as intended: as a way to generate HTML fragments from untrusted sources for insertion into some other HTML template. markpickle uses it to represent data instead.
## Installation
```shell
pip install markpickle
```
## Capabilities
This is a lossy serialization. Markdown is missing too many concepts to be a high-fidelity representation of a Python data structure. If you want an object model that faithfully represents each element of a Markdown document, use the AST from mistune or one of the other Markdown parsers.
Supported types
- Scalar values
- Lists of scalar values
- Dictionaries with scalar values
- Lists of dictionaries of scalar values
- Dictionaries with list values
- Partial support for blanks and for strings with leading or trailing whitespace
Not supported
- Things not ordinarily serializable
- Markdown that uses more than headers, lists, and tables
- Blanks, falsy values, and empty iterables don't round trip
- Scalar type inference doesn't round trip. After a scalar is converted to a Markdown string, there is no indication of whether the original was a string or not.
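The scalar limitation is easy to reproduce. The sketch below is not markpickle's code: `scalar_to_markdown` and `markdown_to_scalar` are hypothetical helpers, and the inference order (int, then float, then str) is an assumption. It shows why the integer `42` and the string `"42"` cannot be told apart once written out.

```python
def scalar_to_markdown(value):
    """Serialize a scalar as its plain string form."""
    return str(value)


def markdown_to_scalar(text):
    """Guess a Python type from Markdown text: int, then float, then str."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


# Both originals produce the same Markdown text, so both come back as int.
print(repr(markdown_to_scalar(scalar_to_markdown(42))))    # 42
print(repr(markdown_to_scalar(scalar_to_markdown("42"))))  # 42, not '42'
```

Any inference rule has the same problem: the Markdown text carries no record of the original type.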
## Serializing
Results can be formatted, at the cost of speed.
Dictionaries can be represented either as tables or as header/text pairs.
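To make the two dictionary layouts concrete, here is a minimal sketch. The function names `dict_to_header_pairs` and `dict_to_table` are hypothetical, and the exact output markpickle produces may differ. In particular, the one-row table with keys as columns is an assumption.

```python
def dict_to_header_pairs(data, level=1):
    """Render each key as a header and each value as the text below it."""
    prefix = "#" * level
    blocks = [f"{prefix} {key}\n\n{value}" for key, value in data.items()]
    return "\n\n".join(blocks)


def dict_to_table(data):
    """Render a flat dict as a one-row Markdown table; keys are the columns."""
    keys = [str(key) for key in data]
    values = [str(value) for value in data.values()]
    header = "| " + " | ".join(keys) + " |"
    divider = "| " + " | ".join("---" for _ in keys) + " |"
    row = "| " + " | ".join(values) + " |"
    return "\n".join([header, divider, row])


record = {"name": "markpickle", "license": "MIT"}
print(dict_to_header_pairs(record))
print()
print(dict_to_table(record))
```

Header/text pairs read well for long values, while a table is more compact for short scalar values.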
## Deserializing
Markdown is deserialized by parsing the document into an abstract syntax tree (AST) with `mistune`. If the Markdown file has the same structure that markpickle produces, deserializing it yields a sensible object. Deserializing an arbitrary README.md file is not expected to always work. For that, use mistune's AST directly.
## Round Tripping
Some, but not all, data structures will round trip. The goal is that the sort of dicts you get from loading JSON will round trip, provided everything is a string.
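The "provided everything is a string" condition can be illustrated with a small round-trip check. This sketch uses hypothetical helpers (`dumps_header_pairs`, `loads_header_pairs`, `round_trips`) that write a flat dict as header/text pairs and read it back without any type inference. It is a stand-in for the idea, not markpickle's API.

```python
import json


def dumps_header_pairs(data):
    """Write a flat dict as '# key' headers, each followed by its value."""
    return "\n\n".join(f"# {key}\n\n{value}" for key, value in data.items())


def loads_header_pairs(markdown):
    """Read header/text pairs back into a dict of strings."""
    result = {}
    key = None
    for line in markdown.splitlines():
        if line.startswith("# "):
            key = line[2:].strip()
            result[key] = ""
        elif key is not None and line.strip():
            # Values always come back as strings: the type is not recorded.
            result[key] = line.strip()
    return result


def round_trips(data):
    """Return True if the dict survives a dump followed by a load."""
    return loads_header_pairs(dumps_header_pairs(data)) == data


all_strings = json.loads('{"name": "markpickle", "license": "MIT"}')
mixed_types = json.loads('{"name": "markpickle", "size": 12114}')

print(round_trips(all_strings))  # True
print(round_trips(mixed_types))  # False: 12114 comes back as "12114"
```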
## Prior Art
People normally want to convert JSON to Markdown. JSON looks like a Python dict, so a tool that can handle one can probably handle the other.
### Serializing to Markdown
[json2md](https://github.com/IonicaBizau/json2md), a Node package, turns JSON that looks like the HTML document object model into Markdown, e.g.
```python
{"h1": "Some Header",
"p": "Some Text"}
```
[tomark](https://pypi.org/project/tomark/) turns a dict into a Markdown table. It is unmaintained.
[pytablewriter](https://pytablewriter.readthedocs.io/en/latest/pages/reference/writers/text/markup/md.html) also converts a dict to a table, and supports many other tabular formats.
### Deserializing to Python
I don't know of any libraries that turn Markdown into basic Python types. At the moment, they all turn Markdown into a document object model.
[mistune](https://pypi.org/project/mistune/) turns Markdown into an abstract syntax tree. The AST is a faithful representation of the Markdown, including concepts that have no semantic equivalent among Python data types.
[beautifulsoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) will let you navigate the HTML DOM. So you can turn the markdown into HTML, then parse with Beautiful Soup.
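The HTML route can be sketched without third-party dependencies. The example below swaps Beautiful Soup for the standard library's `html.parser.HTMLParser`, and it starts from a hand-written HTML fragment standing in for the output of a Markdown-to-HTML converter. `ListItemCollector` is a hypothetical name.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the text of every <li> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self.items[-1] += data


# Hand-written stand-in for the HTML a Markdown renderer would
# produce from "- 1\n- 2".
html = "<ul>\n<li>1</li>\n<li>2</li>\n</ul>"

collector = ListItemCollector()
collector.feed(html)
collector.close()
print(collector.items)  # ['1', '2']
```

The items come back as strings, and nested lists are not handled, so type inference and structure still have to be layered on top.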
## Documentation
- [Examples](https://github.com/matthewdeanmartin/markpickle/blob/main/docs/examples.md)
- [TODO](https://github.com/matthewdeanmartin/markpickle/blob/main/docs/TODO.md)
- [People solving similar problems on StackOverflow](https://github.com/matthewdeanmartin/markpickle/blob/main/docs/stackoverflow.md)
Raw data
{
"_id": null,
"home_page": "https://github.com/matthewdeanmartin/markpickle",
"name": "markpickle",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "serializer,deserializer,markdown",
"author": "Matthew Martin",
"author_email": "matthewdeanmartin@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/bd/7b/836e0ca6d93e515128c96442dd92122fbbcdc0298d68ef80a07d23bcb716/markpickle-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Lossy python to markdown serializer",
"version": "1.1.0",
"split_keywords": [
"serializer",
"deserializer",
"markdown"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0c873d7604260a76453ebec0ae1e55629646768b60bf64c0be265f20a33c89a7",
"md5": "c0023dc15518ad3e6e1305cbbe87e765",
"sha256": "e9b047b5f2d466fc756e081cf9d7d8218db2c5726fac11b99137ebb398310d8b"
},
"downloads": -1,
"filename": "markpickle-1.1.0-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "c0023dc15518ad3e6e1305cbbe87e765",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 12114,
"upload_time": "2023-03-19T16:33:16",
"upload_time_iso_8601": "2023-03-19T16:33:16.088904Z",
"url": "https://files.pythonhosted.org/packages/0c/87/3d7604260a76453ebec0ae1e55629646768b60bf64c0be265f20a33c89a7/markpickle-1.1.0-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bd7b836e0ca6d93e515128c96442dd92122fbbcdc0298d68ef80a07d23bcb716",
"md5": "eb94dd9409ab700b8681b80f3725449b",
"sha256": "5d9ac6280302a843be40ab5affedd47d6f79d605acd3fab75273087e349a94c6"
},
"downloads": -1,
"filename": "markpickle-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "eb94dd9409ab700b8681b80f3725449b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 11234,
"upload_time": "2023-03-19T16:33:19",
"upload_time_iso_8601": "2023-03-19T16:33:19.724550Z",
"url": "https://files.pythonhosted.org/packages/bd/7b/836e0ca6d93e515128c96442dd92122fbbcdc0298d68ef80a07d23bcb716/markpickle-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-19 16:33:19",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "matthewdeanmartin",
"github_project": "markpickle",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "markpickle"
}