# langchain-utils
LangChain Utilities
- [langchain-utils](#langchain-utils)
  - [Prompt generation using LangChain document loaders](#prompt-generation-using-langchain-document-loaders)
    - [Demos](#demos)
    - [`pandocprompt`](#pandocprompt)
    - [`urlprompt`](#urlprompt)
    - [`pdfprompt`](#pdfprompt)
    - [`ytprompt`](#ytprompt)
    - [`textprompt`](#textprompt)
    - [`htmlprompt`](#htmlprompt)
  - [Installation](#installation)
    - [pipx](#pipx)
    - [pip](#pip)
  - [Develop](#develop)
## Prompt generation using LangChain document loaders
Do you find yourself frequently copy-pasting texts from the web / PDFs / other documents into ChatGPT?
If yes, these tools are for you!
These tools are optimized for feeding content into a chat interface (like ChatGPT) manually, either in one go or in multiple parts (to get around context length limits).
The generated prompts look like this:
```python
REPLY_OK_IF_YOU_READ_TEMPLATE = '''
Below is {what}, reply "OK" if you read:
"""
{content}
"""
'''.strip()
```
You can feed it directly to a chat interface like ChatGPT and ask follow-up questions about it.
See [`prompts.py`](./langchain_utils/prompts.py) for other variations.
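As a quick illustration, filling the template is a plain `str.format` call; the `what` and `content` values below are made up:

```python
# Minimal sketch: fill the template defined above with illustrative values.
prompt = REPLY_OK_IF_YOU_READ_TEMPLATE.format(
    what="the content of a webpage",  # the CLI tools set this via --what
    content="Example page text goes here.",
)
print(prompt)  # paste into ChatGPT, then ask follow-up questions
```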
### Demos
- Loading `https://github.com/tddschn/langchain-utils` and copying the prompt to the clipboard:
<video src="https://user-images.githubusercontent.com/45612704/231729153-341bd962-28cc-40a3-af8b-91e038ccaf6c.mp4" controls width="100%"></video>
- Loading 3 pages of a PDF file, opening each part for inspection before copying, and optionally merging the 3 pages into 2 prompts that stay within `gpt-3.5-turbo`'s context length limit, using LangChain's `TokenTextSplitter` (see the splitting sketch after the demos).
<video src="https://user-images.githubusercontent.com/45612704/231731553-63cf3cef-a210-4761-8ca3-dd47bedc3393.mp4" controls width="100%"></video>
### `pandocprompt`
```
$ pandocprompt --help
usage: pandocprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT] [-C]
[-w WHAT] [-M] [--from PANDOC_FROM_FORMAT]
[--to PANDOC_TO_FORMAT]
[PATH ...]
Get prompts from arbitrary files. You need to have `pandoc` installed and in
$PATH; it will be used to convert source files to the desired (hopefully
textual) format. Common use cases: getting prompts from EPUB books or several
TeX files.
positional arguments:
PATH Paths to the text files, or stdin if not provided
(default: None)
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
-C, --from-clipboard Load text from clipboard (default: False)
-w WHAT, --what WHAT Initial knowledge you want to insert before the document
content in the prompt (default: the content of a
document)
-M, --merge Merge contents of all pages before processing
(default: False)
--from PANDOC_FROM_FORMAT
The format that is passed to -f in pandoc (default:
None)
--to PANDOC_TO_FORMAT
The format that is passed to -t in pandoc. gfm-
raw_html means GitHub Flavored Markdown with raw HTML
stripped. (default: gfm-raw_html)
```
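For example (the file name is hypothetical), getting a prompt from an EPUB book and copying it to the clipboard might look like:

```
$ pandocprompt book.epub -c
```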
### `urlprompt`
```
$ urlprompt --help
usage: urlprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT] [-w WHAT]
[-M] [-j] [-g] [--github-path GITHUB_PATH]
[--github-revision GITHUB_REVISION] [--substack]
URL
Get a prompt consisting of the text content of a webpage
positional arguments:
URL URL to the webpage
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
-w WHAT, --what WHAT Initial knowledge you want to insert before the webpage
content in the prompt (default: the content of a
webpage)
-M, --merge Merge contents of all pages before processing
(default: False)
-j, --javascript Use JavaScript to render the page (default: False)
-g, --github Load the raw file from a GitHub URL (default: False)
--github-path GITHUB_PATH
Path to the GitHub file (default: README.md)
--github-revision GITHUB_REVISION
Revision for the GitHub file (default: master)
--substack Load from a Substack URL and convert it to Markdown
(default: False)
```
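For example, an invocation along the lines of the first demo above, loading this repository's raw README from its GitHub URL (`-g`) and copying the prompt (`-c`):

```
$ urlprompt https://github.com/tddschn/langchain-utils -g -c
```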
### `pdfprompt`
```
$ pdfprompt --help
usage: pdfprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT]
[-p PAGES [PAGES ...]] [-l PAGE_SLICE] [-M] [-w WHAT] [-o]
[-O] [-L OCR_LANGUAGE]
PDF Path
Get a prompt consisting of the text content of a PDF file
positional arguments:
PDF Path Path to the PDF file
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
-p PAGES [PAGES ...], --pages PAGES [PAGES ...]
Only include specified page numbers (default: None)
-l PAGE_SLICE, --page-slice PAGE_SLICE
Use Python slice syntax to select page numbers (e.g.
1:3, 1:10:2, etc.) (default: None)
-M, --merge Merge contents of all pages before processing
(default: False)
-w WHAT, --what WHAT Initial knowledge you want to insert before the PDF
content in the prompt (default: the content of a PDF
file)
-o, --fallback-ocr Use OCR as a fallback if no text is detected on a page;
set the TESSDATA_PREFIX environment variable to the
path of your Tesseract data directory (default: False)
-O, --force-ocr Force OCR on all pages (default: False)
-L OCR_LANGUAGE, --ocr-language OCR_LANGUAGE
Language to use for Tesseract OCR (like eng, chi_sim,
chi_tra, chi_tra_vert, etc.) (default: eng)
```
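For example (path and page numbers hypothetical), something like the second demo above: select 3 pages, merge them, split for `gpt-3.5-turbo`, and open each part for inspection before copying:

```
$ pdfprompt paper.pdf -p 1 2 3 -M -m gpt-3.5-turbo -e
```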
### `ytprompt`
```
$ ytprompt --help
usage: ytprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT]
YouTube URL
Get a prompt consisting of the Title and Transcript of a YouTube Video
positional arguments:
YouTube URL YouTube URL
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
```
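For example (placeholder video ID), copying a video's title-and-transcript prompt to the clipboard:

```
$ ytprompt 'https://www.youtube.com/watch?v=VIDEO_ID' -c
```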
### `textprompt`
```
$ textprompt --help
usage: textprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT] [-C]
[-w WHAT] [-M]
[PATH ...]
Get a prompt from text files
positional arguments:
PATH Paths to the text files, or stdin if not provided
(default: None)
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
-C, --from-clipboard Load text from clipboard (default: False)
-w WHAT, --what WHAT Initial knowledge you want to insert before the document
content in the prompt (default: the content of a
document)
-M, --merge Merge contents of all pages before processing
(default: False)
```
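For example, since `textprompt` reads stdin when no path is given, you can pipe arbitrary command output into it (the command here is just an illustration):

```
$ git log -10 | textprompt -c
```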
### `htmlprompt`
```
$ htmlprompt --help
usage: htmlprompt [-h] [-V] [-c] [-e] [-m model] [-S] [-s chunk_size]
[-P PARTS [PARTS ...]] [-r] [-R]
[--print-percentage-non-ascii] [-n] [--out OUT] [-C]
[-w WHAT] [-M]
[PATH ...]
Get a prompt from HTML files
positional arguments:
PATH Paths to the HTML files, or stdin if not provided
(default: None)
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
-c, --copy Copy the prompt to clipboard (default: False)
-e, --edit Edit the prompt and copy manually (default: False)
-m model, --model model
Model to use. This only affects the chunk size. Use -S
to disable splitting (infinite chunk size). (default:
gpt-4-32k)
-S, --no-split Do not split the prompt into multiple parts (use this
if the model has a really large context size)
(default: False)
-s chunk_size, --chunk-size chunk_size
Chunk size when splitting transcript, also used to
determine whether to split, defaults to 1/2 of the
context length limit of the model (default: None)
-P PARTS [PARTS ...], --parts PARTS [PARTS ...]
Parts to select in the processed list of Documents
(default: None)
-r, --raw Wrap the content in triple quotes with no extra text
(default: False)
-R, --raw-no-quotes Output the content only (default: False)
--print-percentage-non-ascii
Print percentage of non-ascii characters (default:
False)
-n, --dry-run Dry run (default: False)
--out OUT Output file (default: None)
-C, --from-clipboard Load text from clipboard (default: False)
-w WHAT, --what WHAT Initial knowledge you want to insert before the HTML
content in the prompt (default: the text content of an
HTML file)
-M, --merge Merge contents of all pages before processing
(default: False)
```
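For example (file name hypothetical), turning a saved page into a prompt and copying it to the clipboard:

```
$ htmlprompt page.html -c
```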
## Installation
### pipx
This is the recommended installation method.
```
$ pipx install langchain-utils
```
### [pip](https://pypi.org/project/langchain-utils/)
```
$ pip install langchain-utils
```
## Develop
```
$ git clone https://github.com/tddschn/langchain-utils.git
$ cd langchain-utils
$ poetry install
```