| Name | adparser |
| Version | 0.1.0 |
| home_page | None |
| Summary | The adparser library for Python provides powerful capabilities for working with AsciiDoc documents |
| upload_time | 2024-08-15 21:42:22 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
The adparser library for Python provides powerful capabilities for
working with AsciiDoc documents. In this Quick Start, you’ll learn how
to use the library’s main functions to extract various elements from an
AsciiDoc document.
# Installation
Install the adparser library using pip:

    pip install adparser

**asciidoctor** must also be installed on the system; see
<https://asciidoctor.org/#installation> for instructions. Before using the library, make
sure that the asciidoctor executable is on your *PATH*.
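A quick way to confirm that asciidoctor is reachable before parsing is to look it up on the PATH. The sketch below uses only the Python standard library; the helper name `find_asciidoctor` is hypothetical and is not part of adparser.

```python
import shutil


def find_asciidoctor(executable="asciidoctor"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


path = find_asciidoctor()
if path is None:
    print("asciidoctor was not found on PATH; install it first")
else:
    print(f"asciidoctor found at {path}")
```

If the lookup fails, follow the asciidoctor installation instructions linked above and run the check again.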
# Extracting Document Elements
The adparser library can extract the following elements from an AsciiDoc
document:
- text lines (the pieces that paragraphs are made of)
- links
- paragraphs
- headings
- lists
- source blocks
- tables
- audio, video, and images.
To access these elements, you can use the Parser object.
# Parser object
To start parsing, create a Parser object:
```python
from adparser import Parser
my_file = open("test.adoc")
parser = Parser(my_file)
```
# Parser methods
To work with each of the document elements described above, the Parser
object has its own methods:
- text\_lines()
- links()
- paragraphs()
- headings()
- lists()
- source\_blocks()
- tables()
- audios()
- images()
- videos()
## Example
**test.adoc**
```asciidoc
= Document Title

This is a paragraph.

== Section 1

This is another paragraph.

[source,python]
print("Hello, World!")

[NOTE]
This is a note.

image::image.png[]
```
```python
>>> from adparser import Parser
>>> my_file = open("test.adoc")
>>> parser = Parser(my_file)
>>> for docelem in parser.headings():
...     print(docelem.data)
'Document Title'
'Section 1'
>>> for docelem in parser.source_blocks():
...     print(docelem.data)
...     print(docelem.styles)
'print("Hello, World!")'
['listingblock', 'python']
```
Each method returns an iterator over the element objects of the
document. Every element object has the following attributes:
- data: the data associated with the element. Usually text, but for
tables it is a dictionary (see the tables example below).
- section: list of the document sections the element belongs to
- styles: list of the object's styles
- attribute (**only for links**): text of the link
List of styles:
- text\_line
  - italic
  - bold
  - monospace
- source
  - source languages
- admonition styles (apply to all elements)
  - note
  - tip
  - caution
  - warning
- area styles (apply to all elements)
  - sidebarblock
  - exampleblock
  - quoteblock
  - listingblock
  - literalblock
The text of a paragraph object is available only through the
**get\_text()** method, which takes a url_opt parameter.
url_opt can be:
- 'show\_urls'
- 'hide\_urls'
This option shows or hides the URL of a link, hyperlink, or media source
(image, audio, video). The default is 'hide_urls'.
**test.adoc**
```asciidoc
= Document Title

You can also use https://www.macports.org[MacPorts], another package manager for macOS, to install Asciidoctor.

If you dont have MacPorts on your computer, complete the https://www.macports.org/install.php[installation instructions] first.
```
```python
>>> from adparser import Parser
>>> my_file = open("test.adoc")
>>> parser = Parser(my_file)
>>> for docelem in parser.paragraphs():
...     print(docelem.get_text())
'You can also use MacPorts, another package manager for macOS, to install Asciidoctor.'
'If you dont have MacPorts on your computer, complete the installation instructions first.'
>>> for docelem in parser.paragraphs():
...     print(docelem.get_text('show_urls'))
'You can also use https://www.macports.org[MacPorts], another package manager for macOS, to install Asciidoctor.'
'If you dont have MacPorts on your computer, complete the https://www.macports.org/install.php[installation instructions] first.'
```
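The effect of hiding URLs can be mimicked with a regular expression over the AsciiDoc URL macro form `https://…[text]`. This is only an illustrative sketch of the visible behavior, not adparser's implementation; the names `URL_MACRO` and `hide_urls` are hypothetical.

```python
import re

# Matches the AsciiDoc URL macro form: https://example.org[link text]
URL_MACRO = re.compile(r"https?://[^\s\[\]]+\[([^\]]*)\]")


def hide_urls(line):
    """Replace each URL macro with its link text only."""
    return URL_MACRO.sub(r"\1", line)


source = (
    "You can also use https://www.macports.org[MacPorts], "
    "another package manager for macOS, to install Asciidoctor."
)
print(hide_urls(source))
```

The printed line keeps the link text "MacPorts" and drops the URL, which matches the 'hide_urls' output in the example above.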
Parser methods accept the named parameters **style** and **section**,
which narrow the selection.
**test.adoc**
```asciidoc
= Document Title

== Python

[source,python]
print("Hello, World!")

== C++

[source,cpp]
std::cout << "Hello, World!";
```
```python
>>> from adparser import Parser
>>> my_file = open("test.adoc")
>>> parser = Parser(my_file)
>>> for docelem in parser.source_blocks(['cpp']):
...     print(docelem.data)
...     print(docelem.styles)
'std::cout << "Hello, World!";'
['listingblock', 'cpp']
>>> for docelem in parser.source_blocks([], ['Python']):
...     print(docelem.data)
'print("Hello, World!")'
```
Styles and sections are filtered by passing lists of the required
styles or sections. An element is selected only when every item of a
passed list appears in the corresponding attribute of the element, that
is, when the passed list is a subset of the element's styles or sections.
If you pass the list of sections \['C++', 'Python'\] in the example above,
nothing will be output, because there is no code object that is both in
the C++ section and in the Python section.
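The subset rule can be sketched in plain Python. The dictionaries below are hypothetical stand-ins for the two source blocks of the example, and the assumption that the section list also contains the document title is mine; none of this is adparser's own code.

```python
def matches(requested, available):
    """True when every requested item appears in the element's list."""
    return set(requested) <= set(available)


def select(elements, style=(), section=()):
    """Keep elements whose styles and sections contain all requested items."""
    return [
        e for e in elements
        if matches(style, e["styles"]) and matches(section, e["section"])
    ]


# Hypothetical stand-ins for the two source blocks in the example above.
blocks = [
    {"data": 'print("Hello, World!")',
     "styles": ["listingblock", "python"],
     "section": ["Document Title", "Python"]},
    {"data": 'std::cout << "Hello, World!";',
     "styles": ["listingblock", "cpp"],
     "section": ["Document Title", "C++"]},
]

print([e["data"] for e in select(blocks, style=["cpp"])])
print(select(blocks, section=["C++", "Python"]))  # no block is in both sections
```

An empty list acts as "no restriction", because the empty set is a subset of every set.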
**Features** of working with the parser:
- A document must have exactly one level 0 section (the document title)
- Only text is extracted from tables and lists
- Nested tables are not supported
## How to get tables
**test.adoc**
```asciidoc
= Document Title

[cols="1,1"]
|===
|Cell in column 1, row 1
|Cell in column 2, row 1

|Cell in column 1, row 2
|Cell in column 2, row 2

|Cell in column 1, row 3
|Cell in column 2, row 3
|===
```
Table objects also have a **data** attribute, which stores a
dictionary:
```python
>>> from adparser import Parser
>>> my_file = open("test.adoc")
>>> parser = Parser(my_file)
>>> tables = parser.tables()
>>> table = next(tables)
>>> print(table.data)
{'col1':['Cell in column 1, row 1', 'Cell in column 1, row 2', 'Cell in column 1, row 3'], 'col2':['Cell in column 2, row 1', 'Cell in column 2, row 2', 'Cell in column 2, row 3']}
```
The keys "col1" and "col2" were created automatically, because this
table has no header row.
The **to\_dict()** and **to\_matrix()** methods convert the data
attribute to a dictionary or a matrix, respectively.
**test1.adoc**
```asciidoc
[cols="1,1,1,1"]
|===
|Column 1 |Column 2 |Column 3 |Column 4

|Cell in column 1
|Cell in column 2
|Cell in column 3
|Cell in column 4
|===
```
```python
>>> from adparser import Parser
>>> my_file = open("test1.adoc")
>>> parser = Parser(my_file)
>>> tables = parser.tables()
>>> table = next(tables)
>>> print(table.data["Column 1"])
["Cell in column 1"]
>>> table.to_matrix()
>>> print(table.data[0][0])
'Column 1'
>>> print(table.data[0][1])
'Cell in column 1'
```
In matrix form, the first element of each column is the column name.
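The relation between the two layouts can be illustrated with two small standalone functions. This is a sketch of the data shapes only, assuming the matrix is a list of columns as the indexing above suggests; the library's own to\_dict() and to\_matrix() are methods of the table object and may be implemented differently.

```python
def to_matrix(columns):
    """Turn {name: [cells]} into a list of columns, each led by its name."""
    return [[name, *cells] for name, cells in columns.items()]


def to_dict(matrix):
    """Inverse of to_matrix: the first item of each column becomes the key."""
    return {column[0]: list(column[1:]) for column in matrix}


table = {
    "Column 1": ["Cell in column 1"],
    "Column 2": ["Cell in column 2"],
}
matrix = to_matrix(table)
print(matrix[0][0])  # name of the first column
print(matrix[0][1])  # first cell of the first column
```

Converting back with `to_dict(matrix)` gives a dictionary equal to the original one.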
## get\_near() method
The get\_near() method returns the element closest to the current one.
It takes a string naming the required element type and a string giving
the search direction: 'up' or 'down'.
**test.adoc**
```asciidoc
= Document Title

This is a paragraph.

== Section 1

This is another paragraph.

[source,python]
print("Hello, World!")

[NOTE]
This is a note.

image::image.png[]
```
```python
>>> from adparser import Parser
>>> my_file = open("test.adoc")
>>> parser = Parser(my_file)
>>> for docelem in parser.source_blocks():
...     up_heading = docelem.get_near("heading", direction='up')
...     print(up_heading.data)
...     down_image = docelem.get_near("image", direction='down')
...     print(down_image.data)
'Section 1'
'image.png'
```
**test2.adoc**
```asciidoc
= Document Title

=====
Here's a sample AsciiDoc document:

-----
= Document Title

Content goes here.
-----

The document header is useful, but not required.
=====
```
```python
>>> from adparser import Parser
>>> my_file = open("test2.adoc")
>>> parser = Parser(my_file)
>>> for docelem in parser.paragraphs(style=['listingblock']):
...     up_paragraph = docelem.get_near("paragraph", direction='up')
...     print(up_paragraph.get_text())
'Here’s a sample AsciiDoc document:'
```
You can also set a named style parameter for these methods.
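The nearest-element lookup can be pictured as a scan through the elements in document order. The sketch below is a hypothetical standalone function over (kind, data) pairs, written to mirror the first get\_near() example; it is not how adparser stores or searches elements.

```python
def get_near(elements, index, kind, direction="up"):
    """Return the closest element of `kind` before ('up') or after ('down') index."""
    if direction == "up":
        candidates = range(index - 1, -1, -1)
    elif direction == "down":
        candidates = range(index + 1, len(elements))
    else:
        raise ValueError("direction must be 'up' or 'down'")
    for i in candidates:
        if elements[i][0] == kind:
            return elements[i]
    return None  # nothing of that kind in the chosen direction


# Document order of the first example, as (kind, data) pairs.
elements = [
    ("heading", "Document Title"),
    ("paragraph", "This is a paragraph."),
    ("heading", "Section 1"),
    ("paragraph", "This is another paragraph."),
    ("source_block", 'print("Hello, World!")'),
    ("paragraph", "This is a note."),
    ("image", "image.png"),
]

print(get_near(elements, 4, "heading", "up"))
print(get_near(elements, 4, "image", "down"))
```

Starting from the source block at index 4, the upward scan stops at "Section 1" and the downward scan stops at "image.png", the same elements the example above reports.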
Raw data
{
"_id": null,
"home_page": null,
"name": "adparser",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Artem Ivashchenko <fertcool@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/51/ac/61f4b7d3968a4ec05f8c7896802fabacd88eb8c09798c9b2b2c8fc3e57a8/adparser-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "The adparser library for Python provides powerful capabilities for working with AsciiDoc documents",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/fertcool/AsciiDocParse",
"Issues": "https://github.com/fertcool/AsciiDocParse/issues"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8a29cc39941e919ae6c0d1bb27d02ea4287cd1e928661602c423c95eb8bd4205",
"md5": "ecd47e16241ea90645f4ba4c8a0d0d25",
"sha256": "93968faa73735832a4dc059f281cd5ead1c3f90db07b31087da3bc3eacacd65f"
},
"downloads": -1,
"filename": "adparser-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ecd47e16241ea90645f4ba4c8a0d0d25",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 14620,
"upload_time": "2024-08-15T21:42:20",
"upload_time_iso_8601": "2024-08-15T21:42:20.948733Z",
"url": "https://files.pythonhosted.org/packages/8a/29/cc39941e919ae6c0d1bb27d02ea4287cd1e928661602c423c95eb8bd4205/adparser-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "51ac61f4b7d3968a4ec05f8c7896802fabacd88eb8c09798c9b2b2c8fc3e57a8",
"md5": "b64d3d9534acf20aede79a3e568a83e4",
"sha256": "957b18217066ada173a7804d16c4633299924323717c9133335725da43998464"
},
"downloads": -1,
"filename": "adparser-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "b64d3d9534acf20aede79a3e568a83e4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 14087,
"upload_time": "2024-08-15T21:42:22",
"upload_time_iso_8601": "2024-08-15T21:42:22.092630Z",
"url": "https://files.pythonhosted.org/packages/51/ac/61f4b7d3968a4ec05f8c7896802fabacd88eb8c09798c9b2b2c8fc3e57a8/adparser-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-15 21:42:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "fertcool",
"github_project": "AsciiDocParse",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "adparser"
}