## llms-txt-mcp
[PyPI](https://pypi.org/project/llms-txt-mcp/) · [Python 3.12+](https://www.python.org/downloads/) · [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk) · [License](LICENSE)
Fast, surgical access to big docs in Claude Code via llms.txt. Search first, fetch only what matters.
### Why this exists
- Hitting token limits and timeouts on huge `llms.txt` files hurts flow and drowns context.
- This MCP keeps responses tiny and relevant. No dumps, no noise — just the parts you asked for.
### Quick start (Claude Desktop)
Add to `~/Library/Application Support/Claude/claude_desktop_config.json` or `.mcp.json` in your project:
```json
{
  "mcpServers": {
    "llms-txt-mcp": {
      "command": "uvx",
      "args": [
        "llms-txt-mcp",
        "https://ai-sdk.dev/llms.txt",
        "https://nextjs.org/docs/llms.txt",
        "https://orm.drizzle.team/llms.txt"
      ]
    }
  }
}
```
Claude Code and Claude Desktop can now search those docs instantly and retrieve exactly the sections they need.
### How it works
URL → Parse YAML/Markdown → Embed → Search → Get Section
- Parses multiple llms.txt formats (YAML frontmatter + Markdown)
- Embeds sections and searches semantically
- Retrieves only the top matches with a byte cap (default: 75KB)
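The pipeline above can be sketched in a few lines. This is a minimal, hypothetical illustration: it splits a Markdown document into heading-delimited sections, ranks them against a query, and enforces the byte cap. A toy bag-of-words similarity stands in for the real SentenceTransformers embeddings, and all function names are illustrative, not the server's actual API.

```python
import math
import re
from collections import Counter


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split a Markdown document into (heading, body) sections."""
    sections: list[tuple[str, str]] = []
    title, lines = "", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if title or lines:
                sections.append((title, "\n".join(lines)))
            title, lines = line.lstrip("# ").strip(), []
        else:
            lines.append(line)
    if title or lines:
        sections.append((title, "\n".join(lines)))
    return sections


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; the real server uses a SentenceTransformers model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(sections: list[tuple[str, str]], query: str, max_bytes: int = 75_000):
    """Rank sections by similarity, then return top matches under the byte cap."""
    q = embed(query)
    ranked = sorted(sections, key=lambda s: cosine(embed(s[0] + " " + s[1]), q), reverse=True)
    out, used = [], 0
    for title, body in ranked:
        size = len(body.encode())
        if used + size > max_bytes:
            break
        out.append((title, body))
        used += size
    return out
```

The byte cap, not a match count, is what bounds the response, which is why even a broad query cannot flood the context window.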
### Features
- Instant startup with lazy loading and background indexing
- Search-first; no full-document dumps
- Byte-capped responses to protect context windows
- Human-readable IDs (e.g. `https://ai-sdk.dev/llms.txt#rag-agent`)
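A section ID like `https://ai-sdk.dev/llms.txt#rag-agent` is just the source URL plus a slug of the section heading. A plausible sketch of that slugging (the exact rules the server applies are an assumption here):

```python
import re


def section_id(source_url: str, heading: str) -> str:
    """Build a human-readable section ID: lowercase the heading, collapse
    non-alphanumeric runs to hyphens, and append it as a URL fragment."""
    slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
    return f"{source_url}#{slug}"
```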
### Source resolution and crawling behavior
- Always checks for `llms-full.txt` first, even when `llms.txt` is configured. If present, it uses `llms-full.txt` for richer structure.
- For a plain `llms.txt` that only lists links, it indexes the link entries themselves but does not crawl or scrape the pages behind them. Link-following/scraping may be added later.
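The resolution order described above can be expressed as a small helper that derives the URLs to try. This is a sketch under the assumption that `llms-full.txt` always lives next to `llms.txt`; the function name is illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_urls(configured: str) -> list[str]:
    """Return fetch candidates in preference order: llms-full.txt first,
    then the configured URL as a fallback."""
    parts = urlsplit(configured)
    if parts.path.endswith("/llms.txt"):
        full_path = parts.path[: -len("llms.txt")] + "llms-full.txt"
        full_url = urlunsplit((parts.scheme, parts.netloc, full_path, parts.query, parts.fragment))
        return [full_url, configured]
    return [configured]
```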
### Talk to it in Claude Code or Claude Desktop
- "Search Next.js docs for middleware routing. Give only the most relevant sections and keep it under 60 KB."
- "From Drizzle ORM docs, show how to define relations. Retrieve the exact section content."
- "List which sources are indexed right now."
- "Refresh the Drizzle docs so I get the latest version, then search for migrations."
- "Get the section for app router dynamic routes from Next.js using its canonical ID."
### Configuration (optional)
- **--store-path PATH** (default: none) Absolute path to persist embeddings. If set, disk persistence is enabled automatically. Prefer absolute paths (e.g., `/Users/you/.llms-cache`).
- **--ttl DURATION** (default: `24h`) Refresh cadence for sources. Supports `30m`, `24h`, `7d`.
- **--timeout SECONDS** (default: `30`) HTTP timeout.
- **--embed-model MODEL** (default: `BAAI/bge-small-en-v1.5`) SentenceTransformers model id.
- **--max-get-bytes N** (default: `75000`) Byte cap for retrieved content.
- **--auto-retrieve-threshold FLOAT** (default: `0.1`) Score threshold (0–1) to auto-retrieve matches.
- **--auto-retrieve-limit N** (default: `5`) Max docs to auto-retrieve per query.
- **--no-preindex** (default: off) Disable automatic pre-indexing on launch.
- **--no-background-preindex** (default: off) If preindexing is on, wait for it to finish before serving.
- **--no-snippets** (default: off) Disable content snippets in search results.
- **--sources ... / positional sources** One or more `llms.txt` or `llms-full.txt` URLs.
- **--store {memory|disk}** (default: auto) Not usually needed. Auto-selected based on `--store-path`. Use only to explicitly override behavior.
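The `--ttl` flag accepts compact durations like `30m`, `24h`, and `7d`. A minimal parser for that format might look like the following (a sketch of the documented syntax, not the server's actual implementation):

```python
import re

# Seconds per supported unit: minutes, hours, days.
_UNITS = {"m": 60, "h": 3_600, "d": 86_400}


def parse_ttl(value: str) -> int:
    """Parse a duration such as '30m', '24h', or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", value.strip().lower())
    if not match:
        raise ValueError(f"unsupported TTL: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]
```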
### Development
```bash
make install # install deps
make test # run tests
make check # format check, lint, type-check, tests
make fix # auto-format and fix lint
```
Built on [FastMCP](https://github.com/modelcontextprotocol/python-sdk) and the [Model Context Protocol](https://modelcontextprotocol.io). MIT license — see `LICENSE`.