# pyhtml2md
pyhtml2md provides a way to use the html2md C++ library in Python. html2md is a fast and reliable library for converting HTML content into markdown.
<div class="hidable-toc">
- [Installation](#installation)
- [Basic usage](#basic-usage)
- [Advanced usage](#advanced-usage)
- [Supported Tags](#supported-tags)
- [License](#license)
</div>
<div id="doxygen-toc" style="visibility:hidden">
[TOC]
</div>
## Installation
You can install using pip:
```bash
pip3 install pyhtml2md
```
## Basic usage
Here is an example of how to use pyhtml2md to convert HTML to markdown:
```python
import pyhtml2md
markdown = pyhtml2md.convert("<h1>Hello, world!</h1>")
print(markdown)
```
The `convert` function takes an HTML string as input and returns a markdown string.
## Advanced usage
pyhtml2md provides an `Options` class to customize the generation process.
All options are described in the C++ [documentation](https://tim-gromeyer.github.io/html2md/index.html).
Here is an example:
```python
import pyhtml2md
options = pyhtml2md.Options()
options.splitLines = False
converter = pyhtml2md.Converter("<h1>Hello Python!</h1>", options)
markdown = converter.convert()
print(markdown)
print(converter.ok())
```
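The `Options`/`Converter` steps above can be folded into one helper that fails loudly instead of printing a status flag. This is only a sketch: the function name `html_to_markdown` and its `backend` parameter are hypothetical (not part of pyhtml2md), and it assumes that `ok()` returning `False` means the converter ran into a problem with the input; check the C++ documentation for the exact meaning.

```python
from typing import Any, Optional


def html_to_markdown(html: str, split_lines: bool = False,
                     backend: Optional[Any] = None) -> str:
    """Convert HTML to markdown and raise ValueError if ok() reports a problem.

    `backend` is a hypothetical hook for swapping in a stand-in module in
    tests; by default the real pyhtml2md module is used.
    """
    if backend is None:
        import pyhtml2md as backend  # imported here so the hook can bypass it

    # Same calls as in the example above: Options, Converter, convert, ok.
    options = backend.Options()
    options.splitLines = split_lines

    converter = backend.Converter(html, options)
    markdown = converter.convert()

    # Assumption: ok() is False when the conversion was not clean.
    if not converter.ok():
        raise ValueError("pyhtml2md reported a problem with the input HTML")
    return markdown
```

With pyhtml2md installed, `html_to_markdown("<h1>Hello Python!</h1>")` either returns the markdown string or raises, so callers do not have to remember to check `ok()` themselves.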
## Supported Tags
pyhtml2md supports the following HTML tags:
| Tag | Description | Comment |
|--------------|--------------------|-----------------------------------------------------|
| `a` | Anchor or link | Supports the `href`, `name` and `title` attributes. |
| `b` | Bold | |
| `blockquote` | Indented paragraph | |
| `br` | Line break | |
| `cite` | Inline citation | Same as `i`. |
| `code` | Code | |
| `dd` | Definition data | |
| `del` | Strikethrough | |
| `dfn` | Definition | Same as `i`. |
| `div` | Document division | |
| `em` | Emphasized | Same as `i`. |
| `h1` | Level 1 heading | |
| `h2` | Level 2 heading | |
| `h3` | Level 3 heading | |
| `h4` | Level 4 heading | |
| `h5` | Level 5 heading | |
| `h6` | Level 6 heading | |
| `head` | Document header | Ignored. |
| `hr` | Horizontal line | |
| `i` | Italic | |
| `img` | Image | Supports `src`, `alt`, `title` attributes. |
| `li` | List item | |
| `meta` | Meta-information | Ignored. |
| `ol` | Ordered list | |
| `p` | Paragraph | |
| `pre` | Preformatted text | Works only with `code`. |
| `s` | Strikethrough | Same as `del`. |
| `span` | Grouped elements | Does nothing. |
| `strong` | Strong | Same as `b`. |
| `table` | Table | Tables are formatted! |
| `tbody` | Table body | Does nothing. |
| `td` | Table data cell | Uses `align` from `th`. |
| `tfoot` | Table footer | Does nothing. |
| `th` | Table header cell | Supports the `align` attribute. |
| `thead` | Table header | Does nothing. |
| `title` | Document title | Same as `h1`. |
| `tr` | Table row | |
| `u` | Underlined | Uses HTML. |
| `ul` | Unordered list | |
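
Before converting a document, it can be useful to know whether it contains tags that are missing from the table above. The following sketch uses only the Python standard library; `SUPPORTED_TAGS` is copied by hand from the table and `unsupported_tags` is a hypothetical helper, not part of pyhtml2md. It makes no claim about what the converter does with such tags, it only lists them.

```python
from html.parser import HTMLParser
from typing import List

# Tag names copied from the "Supported Tags" table above.
SUPPORTED_TAGS = frozenset({
    "a", "b", "blockquote", "br", "cite", "code", "dd", "del", "dfn", "div",
    "em", "h1", "h2", "h3", "h4", "h5", "h6", "head", "hr", "i", "img", "li",
    "meta", "ol", "p", "pre", "s", "span", "strong", "table", "tbody", "td",
    "tfoot", "th", "thead", "title", "tr", "u", "ul",
})


class _TagCollector(HTMLParser):
    """Records the name of every start tag seen in the input."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # html.parser already lowercases tag names.
        self.tags.add(tag)


def unsupported_tags(html: str) -> List[str]:
    """Return the sorted tag names in `html` that are not in the table."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.tags - SUPPORTED_TAGS)


print(unsupported_tags("<h1>Title</h1><section><p>Use <kbd>Ctrl</kbd></p></section>"))
# prints ['kbd', 'section']
```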
## License
pyhtml2md is licensed under [The MIT License (MIT)](https://opensource.org/licenses/MIT).