# MCP Server Code Extractor
A Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.
## Why MCP Server Code Extractor?
When working with AI coding assistants like Claude, you often need to:
- Extract specific functions or classes from large codebases
- Get an overview of what's in a file without reading the entire thing
- Retrieve precise code snippets with accurate line numbers
- Avoid manual parsing and grep/sed/awk gymnastics

MCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.
## Features
- **🎯 Precise Extraction**: Uses tree-sitter parsing for accurate code boundary detection
- **🌍 30+ Languages**: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more
- **📍 Line Numbers**: Every extraction includes precise line number information
- **🔍 Code Discovery**: List all functions and classes in a file before extracting
- **⚡ Fast & Lightweight**: Single-file implementation with minimal dependencies
- **🤖 AI-Optimized**: Designed specifically for use with AI coding assistants
## Installation
### Quick Start with UV (Recommended)
```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone this repository
git clone https://github.com/ctoth/mcp_server_code_extractor
cd mcp_server_code_extractor

# Run directly with UV (no installation needed!)
uv run mcp_server_code_extractor.py
```
### Traditional Installation
```bash
# Quote the extra so shells such as zsh do not treat [cli] as a glob pattern
pip install "mcp[cli]" tree-sitter-languages tree-sitter==0.21.3
```
### Configure with Claude Desktop
Add to your Claude Desktop configuration:
```json
{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "uv",
      "args": ["run", "/path/to/mcp_server_code_extractor.py"]
    }
  }
}
```
Or with traditional Python:
```json
{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "python",
      "args": ["/path/to/mcp_server_code_extractor.py"]
    }
  }
}
```
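If the configuration file already lists other servers, the entry has to be merged into the existing `mcpServers` object rather than pasted over the whole file. A minimal sketch of that merge, assuming the configuration has already been loaded into a dict; `add_server_entry` and the `other-server` entry are hypothetical names used only for illustration:

```python
import json
from typing import Any, Dict


def add_server_entry(config: Dict[str, Any], script_path: str) -> Dict[str, Any]:
    """Return a copy of config with the code extractor registered under mcpServers."""
    merged = dict(config)
    # Copy the existing server map so entries for other servers are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers["mcp-server-code-extractor"] = {
        "command": "uv",
        "args": ["run", script_path],
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_server_entry(existing, "/path/to/mcp_server_code_extractor.py")
print(json.dumps(updated, indent=2))
```

The function returns a new dict instead of mutating its input, so the original configuration can be kept as a backup before writing the result to disk.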
## Available Tools
### 1. `get_symbols` - Discover Code Structure
List all functions, classes, and other symbols in a file.
```
Parameters:
- file_path: Path to the source file

Returns:
- name: Symbol name
- type: function/class/method/etc
- start_line/end_line: Line numbers
- preview: First line of the symbol
```
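To make the shape of these records concrete, here is a rough sketch that produces the same fields for Python source. It uses the standard-library `ast` module as a stand-in for tree-sitter, so it is not how the server itself works; `list_symbols` and `SAMPLE` are hypothetical names, and this simplified version reports methods as `function`.

```python
import ast
from typing import Any, Dict, List

SAMPLE = '''class User:
    def __init__(self, name):
        self.name = name

def process_data(path):
    return path
'''


def list_symbols(source: str) -> List[Dict[str, Any]]:
    """Return one record per function or class, with the fields listed above."""
    lines = source.splitlines()
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        symbols.append({
            "name": node.name,
            "type": kind,
            "start_line": node.lineno,      # 1-based line of the def/class keyword
            "end_line": node.end_lineno,    # last line of the body, inclusive
            "preview": lines[node.lineno - 1].strip(),
        })
    # ast.walk is breadth-first, so sort to get file order.
    return sorted(symbols, key=lambda s: s["start_line"])


for symbol in list_symbols(SAMPLE):
    print(symbol)
```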
### 2. `get_function` - Extract Complete Functions
Extract a complete function with all its code.
```
Parameters:
- file_path: Path to the source file
- function_name: Name of the function to extract

Returns:
- code: Complete function code
- start_line/end_line: Precise boundaries
- language: Detected language
```
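The idea behind parser-based extraction is that the syntax tree already knows where a function starts and ends, so no line counting or regex matching is needed. A minimal sketch of that idea for Python source, again using the standard-library `ast` module in place of tree-sitter; `extract_function` and `SAMPLE` are hypothetical, and including decorators in the extracted range is an assumption of this sketch rather than documented behavior of the tool:

```python
import ast
from typing import Any, Dict, Optional

SAMPLE = '''import functools

@functools.lru_cache
def process_data(n):
    total = n * 2
    return total

def other():
    pass
'''


def extract_function(source: str, function_name: str) -> Optional[Dict[str, Any]]:
    """Return the code and 1-based inclusive boundaries of the named function."""
    lines = source.splitlines()
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            # Start at the first decorator, if any, otherwise at the def line.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno
            return {
                "code": "\n".join(lines[start - 1:end]),
                "start_line": start,
                "end_line": end,
                "language": "python",
            }
    return None  # no function with that name


result = extract_function(SAMPLE, "process_data")
print(result["start_line"], result["end_line"])
print(result["code"])
```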
### 3. `get_class` - Extract Complete Classes
Extract an entire class definition including all methods.
```
Parameters:
- file_path: Path to the source file
- class_name: Name of the class to extract

Returns:
- code: Complete class code
- start_line/end_line: Precise boundaries
- language: Detected language
```
### 4. `get_lines` - Extract Specific Line Ranges
Get exact line ranges when you know the line numbers.
```
Parameters:
- file_path: Path to the source file
- start_line: Starting line (1-based)
- end_line: Ending line (inclusive)

Returns:
- code: Extracted lines
- line numbers and metadata
```
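The 1-based, inclusive convention is easy to get wrong by one when mapping it onto 0-based slices. A small sketch of the mapping, operating on a string rather than a file; `slice_lines`, the `total_lines` field, and the clamping of `end_line` to the file length are assumptions made for illustration:

```python
from typing import Any, Dict


def slice_lines(source: str, start_line: int, end_line: int) -> Dict[str, Any]:
    """Return lines start_line..end_line (1-based, inclusive) of source."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    lines = source.splitlines()
    # Clamp the end so a range running past the file returns what exists.
    end = min(end_line, len(lines))
    # 1-based inclusive [start, end] maps to the 0-based slice [start - 1, end).
    selected = lines[start_line - 1:end]
    return {
        "code": "\n".join(selected),
        "start_line": start_line,
        "end_line": end,
        "total_lines": len(lines),
    }


print(slice_lines("a\nb\nc\nd\n", 2, 3)["code"])
```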
### 5. `get_signature` - Get Function Signatures
Quickly get just the function signature without the body.
```
Parameters:
- file_path: Path to the source file
- function_name: Name of the function

Returns:
- signature: Function signature only
- start_line: Where the function starts
```
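A signature can span several lines, which is why taking only the first line of a function is not enough. The sketch below rebuilds a one-line signature from the syntax tree for Python source. It relies on the standard-library `ast.unparse` (Python 3.9+) instead of tree-sitter, and it normalizes formatting rather than copying the original text verbatim; `extract_signature` and `SAMPLE` are hypothetical names.

```python
import ast
from typing import Any, Dict, Optional

SAMPLE = '''def process_data(
    input_file: str,
    output_dir: Path,
) -> Dict[str, Any]:
    return {}
'''


def extract_signature(source: str, function_name: str) -> Optional[Dict[str, Any]]:
    """Return the named function's signature (no body) and its starting line."""
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse renders the argument list, including annotations and defaults.
            signature = f"{prefix} {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
            return {"signature": signature + ":", "start_line": node.lineno}
    return None  # no function with that name


print(extract_signature(SAMPLE, "process_data")["signature"])
```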
## Usage Examples
### Example 1: Exploring a Python File
```python
# First, see what's in the file
symbols = get_symbols("src/main.py")
# Returns: List of all functions and classes with line numbers

# Extract a specific function
result = get_function("src/main.py", "process_data")
# Returns: Complete function code with line numbers

# Get just a function signature
sig = get_signature("src/main.py", "process_data")
# Returns: "def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:"
```
### Example 2: Working with Classes
```python
# Extract an entire class
result = get_class("models/user.py", "User")
# Returns: Complete User class with all methods

# Get specific lines (e.g., just the __init__ method)
lines = get_lines("models/user.py", 10, 25)
# Returns: Lines 10-25 of the file
```
### Example 3: Multi-Language Support
```python
# The tool calls look the same regardless of the target file's language.

# Works with JavaScript/TypeScript
symbols = get_symbols("app.ts")
func = get_function("app.ts", "handleRequest")

# Works with Go
symbols = get_symbols("main.go")
method = get_function("main.go", "ServeHTTP")
```
## Supported Languages
- Python, JavaScript, TypeScript, JSX/TSX
- Go, Rust, C, C++, C#, Java
- Ruby, PHP, Swift, Kotlin, Scala
- Bash, PowerShell, SQL
- Haskell, OCaml, Elixir, Clojure
- And many more...
## Best Practices
1. **Always use `get_symbols` first** when exploring a new file
2. **Use `get_function` or `get_class`** instead of reading entire files
3. **Use `get_lines`** when you know exact line numbers
4. **Use `get_signature`** for quick API exploration
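
These practices amount to a discover-then-extract workflow: list the symbols first, then choose the narrowest tool for each one. A sketch of that selection step, assuming symbol records shaped like the `get_symbols` output described earlier; `plan_calls` and the `tool`/`arguments` request format are hypothetical and only illustrate the decision, not an actual client API:

```python
from typing import Any, Dict, List, Set


def plan_calls(file_path: str, symbols: List[Dict[str, Any]], wanted: Set[str]) -> List[Dict[str, Any]]:
    """Choose an extraction tool for each wanted symbol from a get_symbols listing."""
    calls = []
    for sym in symbols:
        if sym["name"] not in wanted:
            continue
        if sym["type"] == "class":
            tool, args = "get_class", {"class_name": sym["name"]}
        elif sym["type"] in ("function", "method"):
            tool, args = "get_function", {"function_name": sym["name"]}
        else:
            # No dedicated tool for this symbol type: fall back to its line range.
            tool = "get_lines"
            args = {"start_line": sym["start_line"], "end_line": sym["end_line"]}
        calls.append({"tool": tool, "arguments": {"file_path": file_path, **args}})
    return calls


# Hypothetical listing, as get_symbols might return it for src/main.py.
listing = [
    {"name": "User", "type": "class", "start_line": 1, "end_line": 20},
    {"name": "process_data", "type": "function", "start_line": 22, "end_line": 40},
    {"name": "MAX_RETRIES", "type": "variable", "start_line": 42, "end_line": 42},
]
for call in plan_calls("src/main.py", listing, {"User", "MAX_RETRIES"}):
    print(call)
```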
## Why Not Just Use Read?
Traditional file reading tools require you to:
- Read entire files (inefficient for large files)
- Manually parse code to find functions/classes
- Count lines manually for extraction
- Deal with complex syntax and edge cases

MCP Server Code Extractor:
- ✅ Extracts exactly what you need
- ✅ Provides structured data with metadata
- ✅ Handles complex syntax automatically
- ✅ Works across 30+ languages consistently
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
MIT License - see LICENSE file for details.
## Acknowledgments
- Built on [tree-sitter](https://tree-sitter.github.io/) for robust parsing
- Uses [tree-sitter-languages](https://github.com/grantjenks/py-tree-sitter-languages) for language support
- Implements the [Model Context Protocol](https://modelcontextprotocol.io/) specification