## llms-txt-mcp
[PyPI](https://pypi.org/project/llms-txt-mcp/) · [Python 3.12+](https://www.python.org/downloads/) · [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk) · [License](LICENSE)
Fast, surgical access to big docs in Claude Code via llms.txt. Search first, fetch only what matters.
### Why this exists
- Hitting token limits and timeouts on huge `llms.txt` files hurts flow and drowns context.
- This MCP keeps responses tiny and relevant. No dumps, no noise — just the parts you asked for.
### Quick start (Claude Desktop)
Add to `~/Library/Application Support/Claude/claude_desktop_config.json` or `.mcp.json` in your project:
```json
{
  "mcpServers": {
    "llms-txt-mcp": {
      "command": "uvx",
      "args": [
        "llms-txt-mcp",
        "https://ai-sdk.dev/llms.txt",
        "https://nextjs.org/docs/llms.txt",
        "https://orm.drizzle.team/llms.txt"
      ]
    }
  }
}
```
Claude Code and Claude Desktop can now search those docs instantly and retrieve exactly the sections they need.
### How it works
URL → Parse YAML/Markdown → Embed → Search → Get Section
- Parses multiple llms.txt formats (YAML frontmatter + Markdown)
- Embeds sections and searches semantically
- Retrieves only the top matches with a byte cap (default: 75KB)
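The pipeline above can be sketched in a few lines. This is a minimal, hypothetical illustration: it splits a Markdown document into heading-delimited sections, ranks them against a query, and enforces the byte cap. A toy bag-of-words similarity stands in for the real SentenceTransformers embeddings, and all function names are illustrative, not the server's actual API.

```python
import math
import re
from collections import Counter


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split a Markdown document into (heading, body) sections."""
    sections: list[tuple[str, str]] = []
    title, lines = "", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if title or lines:
                sections.append((title, "\n".join(lines)))
            title, lines = line.lstrip("# ").strip(), []
        else:
            lines.append(line)
    if title or lines:
        sections.append((title, "\n".join(lines)))
    return sections


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; the real server uses a SentenceTransformers model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(sections: list[tuple[str, str]], query: str, max_bytes: int = 75_000):
    """Rank sections by similarity, then return top matches under the byte cap."""
    q = embed(query)
    ranked = sorted(sections, key=lambda s: cosine(embed(s[0] + " " + s[1]), q), reverse=True)
    out, used = [], 0
    for title, body in ranked:
        size = len(body.encode())
        if used + size > max_bytes:
            break
        out.append((title, body))
        used += size
    return out
```

The byte cap, not a match count, is what bounds the response, which is why even a broad query cannot flood the context window.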
### Features
- Instant startup with lazy loading and background indexing
- Search-first; no full-document dumps
- Byte-capped responses to protect context windows
- Human-readable IDs (e.g. `https://ai-sdk.dev/llms.txt#rag-agent`)
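A section ID like `https://ai-sdk.dev/llms.txt#rag-agent` is just the source URL plus a slug of the section heading. A plausible sketch of that slugging (the exact rules the server applies are an assumption here):

```python
import re


def section_id(source_url: str, heading: str) -> str:
    """Build a human-readable section ID: lowercase the heading, collapse
    non-alphanumeric runs to hyphens, and append it as a URL fragment."""
    slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
    return f"{source_url}#{slug}"
```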
### Source resolution and crawling behavior
- Always checks for `llms-full.txt` first, even when `llms.txt` is configured. If present, it uses `llms-full.txt` for richer structure.
- For a plain `llms.txt` that only lists links, it indexes the link entries themselves but does not crawl or scrape the pages behind them. Link-following/scraping may be added later.
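The resolution order described above can be expressed as a small helper that derives the URLs to try. This is a sketch under the assumption that `llms-full.txt` always lives next to `llms.txt`; the function name is illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_urls(configured: str) -> list[str]:
    """Return fetch candidates in preference order: llms-full.txt first,
    then the configured URL as a fallback."""
    parts = urlsplit(configured)
    if parts.path.endswith("/llms.txt"):
        full_path = parts.path[: -len("llms.txt")] + "llms-full.txt"
        full_url = urlunsplit((parts.scheme, parts.netloc, full_path, parts.query, parts.fragment))
        return [full_url, configured]
    return [configured]
```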
### Talk to it in Claude Code or Claude Desktop
- "Search Next.js docs for middleware routing. Give only the most relevant sections and keep it under 60 KB."
- "From Drizzle ORM docs, show how to define relations. Retrieve the exact section content."
- "List which sources are indexed right now."
- "Refresh the Drizzle docs so I get the latest version, then search for migrations."
- "Get the section for app router dynamic routes from Next.js using its canonical ID."
### Configuration (optional)
- **--store-path PATH** (default: none) Absolute path to persist embeddings. If set, disk persistence is enabled automatically. Prefer absolute paths (e.g., `/Users/you/.llms-cache`).
- **--ttl DURATION** (default: `24h`) Refresh cadence for sources. Supports `30m`, `24h`, `7d`.
- **--timeout SECONDS** (default: `30`) HTTP timeout.
- **--embed-model MODEL** (default: `BAAI/bge-small-en-v1.5`) SentenceTransformers model id.
- **--max-get-bytes N** (default: `75000`) Byte cap for retrieved content.
- **--auto-retrieve-threshold FLOAT** (default: `0.1`) Score threshold (0–1) to auto-retrieve matches.
- **--auto-retrieve-limit N** (default: `5`) Max docs to auto-retrieve per query.
- **--no-preindex** (default: off) Disable automatic pre-indexing on launch.
- **--no-background-preindex** (default: off) If preindexing is on, wait for it to finish before serving.
- **--no-snippets** (default: off) Disable content snippets in search results.
- **--sources ... / positional sources** One or more `llms.txt` or `llms-full.txt` URLs.
- **--store {memory|disk}** (default: auto) Not usually needed. Auto-selected based on `--store-path`. Use only to explicitly override behavior.
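The `--ttl` flag accepts compact durations like `30m`, `24h`, and `7d`. A minimal parser for that format might look like the following (a sketch of the documented syntax, not the server's actual implementation):

```python
import re

# Seconds per supported unit: minutes, hours, days.
_UNITS = {"m": 60, "h": 3_600, "d": 86_400}


def parse_ttl(value: str) -> int:
    """Parse a duration such as '30m', '24h', or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", value.strip().lower())
    if not match:
        raise ValueError(f"unsupported TTL: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]
```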
### Development
```bash
make install # install deps
make test # run tests
make check # format check, lint, type-check, tests
make fix # auto-format and fix lint
```
Built on [FastMCP](https://github.com/modelcontextprotocol/python-sdk) and the [Model Context Protocol](https://modelcontextprotocol.io). MIT license — see `LICENSE`.