catbook


* Name: catbook
* Version: 0.1.1.3
* Summary: A library for compiling text files into a book-form docx file
* Upload time: 2024-02-25 15:06:49
* Author: David Kershaw
* Requires Python: >=3.12,<4.0
* Requirements: No requirements were recorded.

# catbook

A very simple docx file builder. Catbook was created to make managing book chapters simple. The goal was a minimal-markup way to concatenate text files into Word docs that could be converted to epub, mobi, pdf, etc.

The tool needed to:
* Allow chapters to be quickly rearranged
* Allow multi-section chapters
* Offer a trivially easy way to differentiate quotes, blocks, and special words
* Support three levels of hierarchy
* Include only the absolute minimum of markup and functionality

___

## Bookfiles

Catbook reads a flat list of text files from a .bookfile and concatenates them into a Word doc. The doc may have up to three levels. The levels are titled using Word styles.

Metadata about the files that are concatenated into the docx is available from the Book object and each section.

Bookfiles can include several things besides paths to text files.

* Comments as lines starting with #
* TITLE and AUTHOR to be shown in the book's metadata
* INSERT of a preexisting docx file
* A METADATA directive that inserts a page with a table containing the author, title, bookfile path, word count and other metadata.

For example:
```
#
# this is a complete bookfile
# TITLE: This is my book
# AUTHOR: John Doe
#
# INSERT: an-existing/file.docx
#
filesdir/section-1.txt
morefiles/section-2.txt
# INSERT: another/file.docx
still/morefiles/section-2.txt
#
# METADATA
#
```
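
As a rough illustration of the bookfile layout above, the sketch below walks a bookfile line by line and sorts each line into a directive or a text-file path. It is not catbook's own parser: `BookfileSummary` and `summarize_bookfile` are hypothetical names, and the directive spellings are taken from the example bookfile.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BookfileSummary:
    """Hypothetical container for this sketch; not part of catbook's API."""
    title: Optional[str] = None
    author: Optional[str] = None
    # Ordered (kind, value) pairs: "text", "docx", or "metadata".
    entries: list = field(default_factory=list)


def summarize_bookfile(text: str) -> BookfileSummary:
    """Sort the lines of a bookfile into directives and text-file paths."""
    summary = BookfileSummary()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("#"):
            # Any non-comment line is treated as a path to a text file.
            summary.entries.append(("text", line))
            continue
        comment = line.lstrip("#").strip()
        if comment.startswith("TITLE:"):
            summary.title = comment[len("TITLE:"):].strip()
        elif comment.startswith("AUTHOR:"):
            summary.author = comment[len("AUTHOR:"):].strip()
        elif comment.startswith("INSERT:"):
            summary.entries.append(("docx", comment[len("INSERT:"):].strip()))
        elif comment == "METADATA":
            summary.entries.append(("metadata", ""))
        # Anything else is an ordinary comment and is ignored.
    return summary
```

Keeping the entries in one ordered list preserves the position of inserted docx files and the metadata page relative to the text files, which is what makes rearranging chapters a matter of reordering lines.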

___

## Text files

### Sections

Each text file that is concatenated into the docx is a "section". Sections have two parts:

- The first line
- All other lines

The first line is presented as a title, subject to the markup described below. Every other line becomes a paragraph.

Catbook skips blank lines. If the first line is blank the section will have no title to distinguish it from the section before it. A sequence of blank lines is no different than a single blank line.

Note that while in general blank lines are skipped and have no effect, in rare cases a blank line at the bottom of the doc will cause Word to insert a blank page. This can happen when the number of non-blank lines exactly fits the page.

### Comments

Any line that begins with a # is considered a comment. Comment lines are skipped. There can be any number of comment lines before the title line; the first non-comment line is considered the title line.
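
The section and comment rules above can be sketched in a few lines. This is one reading of those rules, not catbook's implementation; `split_section` is a hypothetical helper, and it assumes that a blank first non-comment line means the section has no title.

```python
from typing import List, Optional, Tuple


def split_section(text: str) -> Tuple[Optional[str], List[str]]:
    """Split a section's text into (title, paragraphs).

    Comment lines are skipped everywhere. The first non-comment line is
    the title line; if it is blank, the section has no title. Every other
    non-blank line becomes one paragraph.
    """
    title: Optional[str] = None
    paragraphs: List[str] = []
    seen_title_line = False
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comments never count as title or paragraph
        if not seen_title_line:
            seen_title_line = True
            title = line.strip() or None
            continue
        if line.strip():
            paragraphs.append(line.strip())
    return title, paragraphs
```

Because runs of blank lines are simply dropped, one blank line and several blank lines produce the same list of paragraphs.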

Each comment will be checked for directives.

The INCLUDE IMAGE directive includes an image. Images are centered in a paragraph. The directive is in the form:
```
# INCLUDE IMAGE: path/to/my/image.png
```

The METADATA directive prints the section metadata collected to that point. The directive looks like:
```
# METADATA
```

The MARK directive prints the file name and line number where the directive was positioned. This helps identify where a point in the text is located in the files being concatenated. Adding a MARK is useful when there is a series of files without title lines. Use the directive like:
```
# MARK
```
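
To make the three directive forms concrete, here is a minimal sketch of how a comment line could be checked for a directive. The regular expression and the `parse_directive` name are assumptions made for illustration; catbook's actual matching may differ.

```python
import re
from typing import Optional, Tuple

# Matches "# NAME" or "# NAME: argument" for the three directives above.
_DIRECTIVE = re.compile(r"^#\s*(INCLUDE IMAGE|METADATA|MARK)\s*(?::\s*(.*))?$")


def parse_directive(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (name, argument) for a directive comment, else None."""
    match = _DIRECTIVE.match(line.strip())
    if match is None:
        return None  # an ordinary comment or a non-comment line
    name, argument = match.group(1), match.group(2)
    return name, (argument.strip() if argument else None)
```

An ordinary comment such as `# just a note` yields `None`, so it is skipped like any other comment.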

### Markups

There are a very small number of markups to do things like italicize quotes, force a page break between sections, etc. Markup chars and fonts are minimally customizable using .ini files. See catbook/markup.py and catbook/fonts.py.


* Book title: ~~

A book title is the first line of a text file. The markup must be the first char. Book titles are the top grouping unit in the same way that a first-level heading in a docx is the top of a TOC. Book titles contain chapters and sections.
```
~~Book One: A New Hope
```

* Chapter title: ~

A chapter title is the first line of a text file. The markup must be the first char. Chapter titles are a second-level grouping, below a book and above a section.
```
~Chapter ten: In which a storm gathers
```

* Stand-alone section: >

This markup must be the first char of the first line of a text file. It forces the section to start on a new page.
```
>1918: Vienna
In 1918 the empire slept...
```

* Jump: \***

A jump is on the first line of a text file. A jump creates a break within a chapter by adding an untitled section. The section is separated from the preceding section by an indicator called an asterism. Most commonly the asterism is three widely spaced stars. The asterism text is set as the ASTERISM.
```
***
In this section I will show that...
```

* Asterism: \*                           ⁂                           \*

The asterism is a section separator that is inserted when the JUMP markup is seen.

* Block: |

A block may start on any line. The markup must be the first char. Blocks are text that is set off from the rest of the paragraphs in a different font.
```
The letter said
|Dear Jack.
|I hope you've been well.

```

* Quoted line: "

A quote may start on any line. The markup must be the first char. A quote is another type of block. This markup is also useful for forcing a blank line. To make a blank line put the markup in the first char of an otherwise empty line.
```
"Hey!
Jack said. But it was quiet.

"
Eventually there was a sound.
```

* Highlighted text: |

Put pipes around any word or words to highlight them. When | is used for both highlights and blocks, a highlight that begins with the first word of a paragraph looks like a block. In that case use a double highlight mark, as in:
```
||some highlighted words| that start a line.

There are more |highlighted words| in this line.
```
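
The markup rules above amount to a prefix check on each line plus a scan for paired pipes. The sketch below shows one way to express that, using the markup chars listed in this section; `classify_line` and `highlighted_spans` are hypothetical helpers, not functions from catbook, and the chars are hard-coded here even though catbook lets them be customized.

```python
import re
from typing import List


def classify_line(line: str, is_first: bool) -> str:
    """Name the kind of line based on its leading markup."""
    if is_first:
        # "~~" must be tested before "~", since one is a prefix of the other.
        if line.startswith("~~"):
            return "book title"
        if line.startswith("~"):
            return "chapter title"
        if line.startswith(">"):
            return "stand-alone section"
        if line.startswith("***"):
            return "jump"
    # A doubled pipe opens a highlight, so only a single pipe is a block.
    if line.startswith("|") and not line.startswith("||"):
        return "block"
    if line.startswith('"'):
        return "quote"
    return "section title" if is_first else "paragraph"


def highlighted_spans(line: str) -> List[str]:
    """Return the words wrapped in pipes within a line."""
    if line.startswith("||"):
        line = line[1:]  # collapse the doubled opening mark
    return re.findall(r"\|([^|]+)\|", line)
```

A block line such as `|Dear Jack.` has no closing pipe, so it produces no highlighted spans, which is why the doubled mark is enough to tell the two cases apart.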

___

## Usage

For usage, see main.py and/or test/test_builder.py.

This code creates a docx file called My Book.docx in the working directory. It uses the charles.bookfile to know what text files to concatenate. The text files live in the directories below test/config/texts/charles and the bookfile refers to them relative to that path.

```
from catbook import Builder

def main():
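    builder = Builder()
    builder.init()
    # Path of the docx file to write
    builder.files.OUTPUT = "./My Book.docx"
    # The bookfile that lists the text files to concatenate
    builder.files.INPUT = "test/config/charles.bookfile"
    # Directory that the paths in the bookfile are relative to
    builder.files.FILES = "test/config/texts/charles"

    builder.build()
    print(f"words: {builder.book.metadata.word_count}")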

if __name__ == "__main__":
    main()
```

The output looks like this:

<img width="75%" height="75%" src="https://github.com/dk107dk/catbook/raw/main/output.png"/>
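
Since the bookfile refers to text files relative to the files directory, it can help to check how the paths will resolve before building. The following is a small sketch using only the standard library; `resolve_text_paths` is a hypothetical helper and does not call catbook.

```python
from pathlib import Path
from typing import List


def resolve_text_paths(bookfile_text: str, files_dir: str) -> List[Path]:
    """Join each non-comment, non-blank bookfile line onto files_dir."""
    base = Path(files_dir)
    return [
        base / line.strip()
        for line in bookfile_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
```

Calling `Path.exists()` on each result is a quick way to spot a misspelled entry in a bookfile.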


            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "catbook",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.12,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "David Kershaw",
    "author_email": "dk107dk@hotmail.com",
    "download_url": "https://files.pythonhosted.org/packages/63/24/4735a0e66c3ed54f37fde53c7d62ac6cc26a4277baeb5b4e77ea8da79555/catbook-0.1.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "A library for compiling text files into a book-form docx file",
    "version": "0.1.1.3",
    "project_urls": {
        "Github": "https://github.com/dk107dk/catbook"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e16e60f9d3766bf85b87222a2612964ed90a4063a2e126d623077f37347f2c7c",
                "md5": "18a49b11f65bed897d782e600991e9ed",
                "sha256": "0bc55de70443cfeb1a540a3f7d1e43be891bd3a3d0f9ede3d1a0df37827a9b1c"
            },
            "downloads": -1,
            "filename": "catbook-0.1.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "18a49b11f65bed897d782e600991e9ed",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12,<4.0",
            "size": 18710,
            "upload_time": "2024-02-25T15:06:48",
            "upload_time_iso_8601": "2024-02-25T15:06:48.111899Z",
            "url": "https://files.pythonhosted.org/packages/e1/6e/60f9d3766bf85b87222a2612964ed90a4063a2e126d623077f37347f2c7c/catbook-0.1.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "63244735a0e66c3ed54f37fde53c7d62ac6cc26a4277baeb5b4e77ea8da79555",
                "md5": "2646ac7348b4dc4a9e42540cd8bc66af",
                "sha256": "894055ed49063ae09ed52400814f16dfba619dc005e43c52a4bad9986348cdb0"
            },
            "downloads": -1,
            "filename": "catbook-0.1.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "2646ac7348b4dc4a9e42540cd8bc66af",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12,<4.0",
            "size": 16429,
            "upload_time": "2024-02-25T15:06:49",
            "upload_time_iso_8601": "2024-02-25T15:06:49.609579Z",
            "url": "https://files.pythonhosted.org/packages/63/24/4735a0e66c3ed54f37fde53c7d62ac6cc26a4277baeb5b4e77ea8da79555/catbook-0.1.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-25 15:06:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dk107dk",
    "github_project": "catbook",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "catbook"
}
        