# html-to-adf - Convert HTML to ADF (Atlassian Document Format)
## What is this?
This is a rudimentary Python helper module dedicated to producing Jira/Confluence-ready ADF from incoming HTML.
The module attempts to handle general sanitization, while cutting some corners to forcibly marshal 'whatever' text into something tangible and compatible with Jira/Confluence.
This module is focused on **front-loading** incoming comments and descriptions for Jira and, out of the box, supports a relatively strict subset of HTML tags.
## Dependencies
Almost everything here is a painfully hand-rolled parser; the one dependency is `beautifulsoup4`, with no version pinned.
### Supported and Converted tags
```html
<html>
<body>

<h1>...<h6>
<head> -> Converted to a heading type
<title> -> Converted to a heading type

<div> -> Converted to a paragraph type
<p>

<table> -> Represents a tablebase
<thead> -> Represents a tablehead
<tbody> -> Represents a tablebody

<tr> -> Represents a tablerow
<th> -> Represents a tablecell
<td> -> Represents a tablecell

Modifiers:

<b>
<strong>
<i>
<em>
<s>
<u>
```
We also _support links and `<a>` tags_. (Link handling is fragile: when it breaks, it usually either does nothing at all or turns [your entire line into a link](https://example.com/).)
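
To make the target format concrete, here is a minimal sketch of how modifier tags and links are usually expressed in ADF: each run of text becomes a `text` node, and formatting is attached as a list of `marks`. The mark names (`strong`, `em`, `strike`, `underline`, `link`) come from the ADF format itself. The `TAG_TO_MARK` table and the `text_node` helper are hypothetical names made up for this illustration; they are not part of `html_to_adf`, and the module's real output may differ in detail.

```python
# Illustrative only: hand-built ADF text nodes, not html_to_adf's internal code.
from typing import Optional

# HTML modifier tag -> ADF mark type (assumed mapping for this sketch).
TAG_TO_MARK = {
    "b": "strong",
    "strong": "strong",
    "i": "em",
    "em": "em",
    "s": "strike",
    "u": "underline",
}


def text_node(text: str, *tags: str, href: Optional[str] = None) -> dict:
    """Build an ADF text node with one mark per modifier tag (hypothetical helper)."""
    marks = [{"type": TAG_TO_MARK[tag]} for tag in tags]
    if href is not None:
        # Links are also marks; the target URL lives in attrs.href.
        marks.append({"type": "link", "attrs": {"href": href}})
    node = {"type": "text", "text": text}
    if marks:
        node["marks"] = marks
    return node


# A paragraph mixing plain text, bold text, and a link.
paragraph = {
    "type": "paragraph",
    "content": [
        text_node("The following table shows "),
        text_node("sales performance", "b"),
        text_node(" by region. See the "),
        text_node("report page", href="https://example.com/reports/september"),
        text_node("."),
    ],
}
```

Because a link is just another mark on a text node, a mistake in deciding where the link text starts and ends can plausibly stretch the mark over a whole line, which matches the failure mode described above.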
### Example:
We'll convert the following HTML to ADF:
```python
# test.py
from html_to_adf import import_html_to_str, convert_html_to_adf, export_document
html_text = import_html_to_str("test.html")
# To send this in an API request, you would typically nest the ADF under a 'body' key.
# Passing True as the second argument, convert_html_to_adf(html_text, True), wraps the whole dict in a body {} for you.
resulting_adf_document: dict = convert_html_to_adf(html_text)
print(resulting_adf_document)
export_document(resulting_adf_document)
```
```html
<!--test.html-->
<html>
<head>
<title>Monthly Sales Report</title>
</head>
<body>
<h1>Monthly Sales Report</h1>
<p>
The following table shows <b>sales performance</b> by region for
<i>September 2025</i>.
For more info, visit our
<a href="https://example.com/reports/september">report page</a>.
</p>
<table>
<thead>
<tr>
<th>Region</th>
<th>Sales ($)</th>
<th>Growth</th>
</tr>
</thead>
<tbody>
<tr>
<td>North America</td>
<td><b>125,000</b></td>
<td><span style="color:green">+5%</span></td>
</tr>
<tr>
<td>Europe</td>
<td>98,500</td>
<td><span style="color:red">-2%</span></td>
</tr>
<tr>
<td>Asia-Pacific</td>
<td>142,750</td>
<td><u>+8%</u></td>
</tr>
</tbody>
</table>
<p>
Summary: <strong>Asia-Pacific</strong> led the month with strong growth.
<br>
Keep up the great work!
</p>
</body>
</html>
```
This yields the following ADF:
[Here's the rather large text blob](https://raw.githubusercontent.com/actes2/html-to-adf/refs/heads/main/tests/output.json)
To preview it live, copy that blob and paste it into the Atlassian Live Document Preview: https://developer.atlassian.com/cloud/jira/platform/apis/document/viewer/
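
For orientation, an ADF document is a plain dict with a `version`, a root `type` of `doc`, and a list of `content` nodes. The sketch below hand-writes a tiny document and nests it under a `body` key, which is the shape a Jira comment request generally expects. The document here is illustrative and is not the module's actual output for `test.html`; the wrapped shape is an assumption about what `convert_html_to_adf(html_text, True)` produces, based only on the comment in `test.py`.

```python
import json

# Minimal hand-written ADF document (illustrative, not the module's real output).
adf_document = {
    "version": 1,
    "type": "doc",
    "content": [
        {
            "type": "heading",
            "attrs": {"level": 1},
            "content": [{"type": "text", "text": "Monthly Sales Report"}],
        },
        {
            "type": "paragraph",
            "content": [{"type": "text", "text": "Keep up the great work!"}],
        },
    ],
}

# Assumed wrapped form: the whole document nested under a "body" key.
payload = {"body": adf_document}

# Serialize for use as a JSON request body.
request_json = json.dumps(payload)
```

Serializing with `json.dumps` rather than `str()` matters here: the Atlassian viewer and API expect JSON with double-quoted keys, not a Python dict repr.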

## Further development and support
This module is a creation of necessity, not passion; there's a good chance I won't update it very much, but if I get inspired, you never know!
Feel free to open an issue in the Issues section if you run into a problem, and I may look into it!
Raw data
{
"_id": null,
"home_page": null,
"name": "html-to-adf",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "atlassian, jira, adf, html, converter",
"author": null,
"author_email": "Avery Tomlin <actesbusiness@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/40/7d/31b48bc506bf3eaf1043c892c0b9f657f879124ed119cc809aba1800eadb/html_to_adf-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Convert HTML to Atlassian Document Format (ADF) for Jira and Confluence.",
"version": "0.1.0",
"project_urls": {
"Documentation": "https://github.com/actes2/html-to-adf#readme",
"Homepage": "https://github.com/actes2/html-to-adf",
"Issues": "https://github.com/actes2/html-to-adf/issues"
},
"split_keywords": [
"atlassian",
" jira",
" adf",
" html",
" converter"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4d892e6de9ee20a2e82ccdc5d880291c68ec18598f6141b300a28e55e206fe04",
"md5": "5d12594a7d169d7d5e6c699861274d4f",
"sha256": "45a5aa0d9c082108e2bc46bee8a731cabfc964cd267f2034171630964f30403d"
},
"downloads": -1,
"filename": "html_to_adf-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5d12594a7d169d7d5e6c699861274d4f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 12938,
"upload_time": "2025-10-10T23:31:32",
"upload_time_iso_8601": "2025-10-10T23:31:32.320317Z",
"url": "https://files.pythonhosted.org/packages/4d/89/2e6de9ee20a2e82ccdc5d880291c68ec18598f6141b300a28e55e206fe04/html_to_adf-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "407d31b48bc506bf3eaf1043c892c0b9f657f879124ed119cc809aba1800eadb",
"md5": "8b67f499f09ea0222e11055d39a64e1a",
"sha256": "8869386360550599f51be004be70f18b000b934efe38efeb44b00735259c243a"
},
"downloads": -1,
"filename": "html_to_adf-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "8b67f499f09ea0222e11055d39a64e1a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 13132,
"upload_time": "2025-10-10T23:31:33",
"upload_time_iso_8601": "2025-10-10T23:31:33.611210Z",
"url": "https://files.pythonhosted.org/packages/40/7d/31b48bc506bf3eaf1043c892c0b9f657f879124ed119cc809aba1800eadb/html_to_adf-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-10 23:31:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "actes2",
"github_project": "html-to-adf#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "beautifulsoup4",
"specs": []
}
],
"lcname": "html-to-adf"
}