pyhtml2md


| Field           | Value                                                                             |
|-----------------|-----------------------------------------------------------------------------------|
| Name            | pyhtml2md                                                                         |
| Version         | 1.6.0                                                                             |
| Summary         | Transform your HTML into clean, easy-to-read markdown with pyhtml2md.             |
| Upload time     | 2024-06-01 09:48:25                                                               |
| Requires Python | `>=3.7`                                                                           |
| License         | MIT                                                                               |
| Keywords        | html, markdown, html-to-markdown, python3, cpp17, cpp-library, html2markdown, html2md |
| Requirements    | No requirements were recorded.                                                    |

# pyhtml2md

pyhtml2md provides a way to use the html2md C++ library in Python. html2md is a fast and reliable library for converting HTML content into markdown.

<div class="hidable-toc">

- [Installation](#installation)
- [Basic usage](#basic-usage)
- [Advanced usage](#advanced-usage)
- [Supported Tags](#supported-tags)
- [License](#license)

</div>

<div id="doxygen-toc" style="visibility:hidden">

[TOC]

</div>


## Installation

Install pyhtml2md from PyPI using pip:

```bash
pip3 install pyhtml2md
```

## Basic usage

Here is an example of how to use pyhtml2md to convert HTML to markdown:

```python
import pyhtml2md

markdown = pyhtml2md.convert("<h1>Hello, world!</h1>")
print(markdown)
```

The `convert` function takes an HTML string as input and returns a markdown string.
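
Because `convert` is a plain string-to-string function, it is easy to wrap in small helpers. The sketch below converts an HTML file into a markdown file. The helper name `convert_file` and its parameters are hypothetical and not part of pyhtml2md; the converter is passed in as an argument so the helper works with any callable that maps an HTML string to a markdown string.

```python
from pathlib import Path
from typing import Callable, Union


def convert_file(
    src: Union[str, Path],
    dst: Union[str, Path],
    convert: Callable[[str], str],
) -> str:
    """Read HTML from src, convert it, write the result to dst.

    `convert` is any function taking an HTML string and returning a
    markdown string (for example pyhtml2md.convert). UTF-8 is an
    assumption of this sketch, not something pyhtml2md prescribes.
    """
    html = Path(src).read_text(encoding="utf-8")
    markdown = convert(html)
    Path(dst).write_text(markdown, encoding="utf-8")
    return markdown
```

With pyhtml2md installed, a call would look like `convert_file("page.html", "page.md", pyhtml2md.convert)`, where the two file names are placeholders.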

## Advanced usage

pyhtml2md provides an `Options` class to customize the conversion.  
The available options are described in the C++ [documentation](https://tim-gromeyer.github.io/html2md/index.html).

Here is an example:

```python
import pyhtml2md

options = pyhtml2md.Options()
options.splitLines = False

converter = pyhtml2md.Converter("<h1>Hello Python!</h1>", options)
markdown = converter.convert()
print(markdown)
print(converter.ok())
```
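
The example above prints the result of `converter.ok()` but does not act on it. One possible pattern is to turn a falsy `ok()` into an exception. This is a minimal sketch: `ConversionError`, `convert_strict` and `make_pyhtml2md_converter` are hypothetical names, not part of pyhtml2md, and the sketch assumes that `ok()` returns a falsy value when the conversion ran into a problem (this README does not spell out its exact semantics).

```python
class ConversionError(ValueError):
    """Raised when the converter reports that the conversion was not clean."""


def convert_strict(html, make_converter):
    """Convert html and raise ConversionError if converter.ok() is falsy.

    make_converter is any callable that takes the HTML string and returns
    an object with convert() and ok() methods, such as the factory below.
    """
    converter = make_converter(html)
    markdown = converter.convert()
    if not converter.ok():
        raise ConversionError("converter.ok() returned a falsy value")
    return markdown


def make_pyhtml2md_converter(html):
    """Build a Converter configured like the example above."""
    import pyhtml2md  # imported lazily so convert_strict has no hard dependency

    options = pyhtml2md.Options()
    options.splitLines = False
    return pyhtml2md.Converter(html, options)
```

A call such as `convert_strict("<h1>Hello Python!</h1>", make_pyhtml2md_converter)` then either returns the markdown or raises, so the caller cannot silently ignore the status.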

## Supported Tags

pyhtml2md supports the following HTML tags:

| Tag          | Description        | Comment                                             |
|--------------|--------------------|-----------------------------------------------------|
| `a`          | Anchor or link     | Supports the `href`, `name` and `title` attributes. |
| `b`          | Bold               |                                                     |
| `blockquote` | Indented paragraph |                                                     |
| `br`         | Line break         |                                                     |
| `cite`       | Inline citation    | Same as `i`.                                        |
| `code`       | Code               |                                                     |
| `dd`         | Definition data    |                                                     |
| `del`        | Strikethrough      |                                                     |
| `dfn`        | Definition         | Same as `i`.                                        |
| `div`        | Document division  |                                                     |
| `em`         | Emphasized         | Same as `i`.                                        |
| `h1`         | Level 1 heading    |                                                     |
| `h2`         | Level 2 heading    |                                                     |
| `h3`         | Level 3 heading    |                                                     |
| `h4`         | Level 4 heading    |                                                     |
| `h5`         | Level 5 heading    |                                                     |
| `h6`         | Level 6 heading    |                                                     |
| `head`       | Document header    | Ignored.                                            |
| `hr`         | Horizontal line    |                                                     |
| `i`          | Italic             |                                                     |
| `img`        | Image              | Supports `src`, `alt`, `title` attributes.          |
| `li`         | List item          |                                                     |
| `meta`       | Meta-information   | Ignored.                                            |
| `ol`         | Ordered list       |                                                     |
| `p`          | Paragraph          |                                                     |
| `pre`        | Preformatted text  | Works only with `code`.                             |
| `s`          | Strikethrough      | Same as `del`.                                      |
| `span`       | Grouped elements   | Does nothing.                                       |
| `strong`     | Strong             | Same as `b`.                                        |
| `table`      | Table              | Tables are formatted!                               |
| `tbody`      | Table body         | Does nothing.                                       |
| `td`         | Table data cell    | Uses `align` from `th`.                             |
| `tfoot`      | Table footer       | Does nothing.                                       |
| `th`         | Table header cell  | Supports the `align` attribute.                     |
| `thead`      | Table header       | Does nothing.                                       |
| `title`      | Document title     | Same as `h1`.                                       |
| `tr`         | Table row          |                                                     |
| `u`          | Underlined         | Uses HTML.                                          |
| `ul`         | Unordered list     |                                                     |
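
To see in advance whether a document uses tags outside this table, a pre-flight check can be written with the standard library alone. The following is a sketch, not part of pyhtml2md: `SUPPORTED_TAGS` is copied by hand from the table above, `unsupported_tags` is a hypothetical helper, and how the library itself treats unlisted tags is not stated in this README.

```python
from html.parser import HTMLParser

# Tag names copied from the "Supported Tags" table above.
SUPPORTED_TAGS = frozenset(
    """
    a b blockquote br cite code dd del dfn div em
    h1 h2 h3 h4 h5 h6 head hr i img li meta ol p pre
    s span strong table tbody td tfoot th thead title tr u ul
    """.split()
)


class _TagCollector(HTMLParser):
    """Collects the names of start tags that are not in SUPPORTED_TAGS."""

    def __init__(self):
        super().__init__()
        self.unsupported = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags such as
        # <br/> also reach this method via handle_startendtag.
        if tag not in SUPPORTED_TAGS:
            self.unsupported.add(tag)


def unsupported_tags(html):
    """Return a sorted list of tag names in html that the table does not list."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.unsupported)
```

For example, `unsupported_tags("<p>Press <kbd>Enter</kbd></p>")` reports `kbd`, which does not appear in the table.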

## License

pyhtml2md is licensed under [The MIT License (MIT)](https://opensource.org/licenses/MIT).

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pyhtml2md",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "html, markdown, html-to-markdown, python3, cpp17, cpp-library, html2markdown, html2md",
    "author": null,
    "author_email": "Tim Gromeyer <sakul8826@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9c/6d/16cd30465700df4797d778420154eb114c6df5138ae144c9946bed1a98cb/pyhtml2md-1.6.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Transform your HTML into clean, easy-to-read markdown with pyhtml2md.",
    "version": "1.6.0",
    "project_urls": {
        "Repository": "https://github.com/tim-gromeyer/html2md"
    },
    "split_keywords": [
        "html",
        " markdown",
        " html-to-markdown",
        " python3",
        " cpp17",
        " cpp-library",
        " html2markdown",
        " html2md"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9c6d16cd30465700df4797d778420154eb114c6df5138ae144c9946bed1a98cb",
                "md5": "81c3d333a84ffe1254cd1046a59dd2fd",
                "sha256": "47a1b173ca49610457e438dfea57f89f16dbe6cbd26ca1ee0b5fd2b61f5fe60c"
            },
            "downloads": -1,
            "filename": "pyhtml2md-1.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "81c3d333a84ffe1254cd1046a59dd2fd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 1048123,
            "upload_time": "2024-06-01T09:48:25",
            "upload_time_iso_8601": "2024-06-01T09:48:25.785225Z",
            "url": "https://files.pythonhosted.org/packages/9c/6d/16cd30465700df4797d778420154eb114c6df5138ae144c9946bed1a98cb/pyhtml2md-1.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-01 09:48:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tim-gromeyer",
    "github_project": "html2md",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pyhtml2md"
}
        