# RepoMap
A tool for generating repository maps that show code structure and relationships.

RepoMap analyzes your codebase to create a compact, context-aware map that highlights the most relevant parts of your code. It parses source files with tree-sitter, ranks symbols with PageRank, and filters the result to fit a token budget.

This is a standalone version extracted from the [aider](https://github.com/paul-gauthier/aider) project.
## Features
- **Smart code analysis**: Uses tree-sitter to parse code and understand symbols
- **Intelligent ranking**: Employs PageRank algorithm to identify important code sections
- **Language support**: Supports many programming languages through tree-sitter
- **Caching**: Fast incremental updates with intelligent caching
- **Customizable**: Configurable token limits and context windows
## Installation
### Using uv (recommended)
```bash
# Install as a standalone tool (creates isolated environment)
uv tool install sourcemap-cli
```
For development or building from source:
```bash
# Build the package
uv build
# Install the built wheel
uv pip install dist/sourcemap_cli-*.whl
# Or for development (editable install)
uv pip install -e .
uv pip install -e ".[dev]" # with dev dependencies
```
### Using pip
```bash
# Install from PyPI
pip install sourcemap-cli
# Or install from source
pip install .
# Development install
pip install -e ".[dev]"  # quoted so shells such as zsh do not expand the brackets
```
## Usage
By default, running `sourcemap` with no arguments launches the interactive TUI. Supplying
arguments invokes the classic CLI.
### CLI Examples
Generate a repository map for specific files:
```bash
sourcemap file1.py file2.js directory/
```
### Options
- `--tokens, -t`: Maximum tokens for the map (default: 8192)
- `--verbose, -v`: Enable verbose output
- `--root, -r`: Root directory for the repository (default: current directory)
- `--refresh`: Cache refresh strategy (auto/always/files/manual, default: auto)
- `--max-context-window`: Maximum context window size
- `--output, -o`: Output file path (default: stdout)
- `--format, -f`: Output format - `text` or `json` (default: text)
- `--all-files`: Include all files regardless of ranking (ignores token limit)
- `--list-files`: Just list all files found, no analysis
- `--no-gitignore`: Include files that are gitignored
- `--git-staged`: Only include files with staged changes in git
- `--recent DAYS`: Only include files modified in the last N days
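
When the CLI is driven from a script, the options above can be assembled into an argument list. The sketch below is only an illustration: `build_sourcemap_command` and `run_sourcemap` are hypothetical helper names, not part of the package, and only flags documented in this section are used.

```python
import subprocess
from typing import Optional, Sequence


def build_sourcemap_command(
    paths: Sequence[str],
    tokens: int = 8192,
    fmt: str = "text",
    output: Optional[str] = None,
    git_staged: bool = False,
    recent_days: Optional[int] = None,
) -> list[str]:
    """Assemble an argv list from the flags documented above (hypothetical helper)."""
    if fmt not in ("text", "json"):
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["sourcemap", *paths, "--tokens", str(tokens), "--format", fmt]
    if output is not None:
        cmd += ["--output", output]
    if git_staged:
        cmd.append("--git-staged")
    if recent_days is not None:
        cmd += ["--recent", str(recent_days)]
    return cmd


def run_sourcemap(cmd: Sequence[str]) -> str:
    """Run the command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Build (but do not run) a command equivalent to:
#   sourcemap src/ --tokens 2048 --format json
print(build_sourcemap_command(["src/"], tokens=2048, fmt="json"))
```

Passing the arguments as a list avoids shell quoting issues; `run_sourcemap` assumes `sourcemap` is installed and on `PATH`.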
### Examples
Analyze Python files in a project:
```bash
sourcemap src/*.py --tokens 2048
```
Generate a map for an entire directory:
```bash
sourcemap . --verbose
```
Analyze multiple specific files:
```bash
sourcemap main.py utils.py tests/ --root /path/to/project
```
Save output to a file:
```bash
sourcemap src/ --output sourcemap.txt
```
Generate JSON output:
```bash
sourcemap src/ --format json --output sourcemap.json
```
Pipe JSON output to other tools:
```bash
sourcemap src/*.py --format json | jq '.files | keys'
```
List all source files in a directory:
```bash
sourcemap src/ --list-files
```
Include ALL files in the analysis (ignore token limit):
```bash
sourcemap src/ --all-files --output full-analysis.txt
```
Analyze only staged files (great for pre-commit):
```bash
sourcemap --git-staged
```
Analyze files modified in the last 7 days:
```bash
sourcemap --recent 7
```
Include gitignored files in the analysis:
```bash
sourcemap src/ --no-gitignore
```
Combine filters - staged files from the last 3 days:
```bash
sourcemap --git-staged --recent 3
```
### Python Library
Use RepoMap as a library to generate text or JSON maps programmatically:
```python
from sourcemap_cli import generate_map, MapOptions
files = ["."] # files or directories
opts = MapOptions(tokens=2048, root=".")
# Text output
text_map = generate_map(files, options=opts, format="text")
# JSON output (as Python dict)
json_map = generate_map(files, options=opts, format="json")
```
### Interactive TUI
A Rich/Textual-based TUI is included. After installing the package:
```bash
sourcemap-tui
# or just run with no args
sourcemap
```
Controls: G (Generate), S (Save), Q (Quit). Edit Root/Tokens/Format fields in the top bar, then press Generate. The main pane is scrollable.
The TUI uses Textual (built on Rich), so it works cross-platform without curses.
### Prompts Mode (fallback)
If your terminal cannot run the TUI, `sourcemap` falls back to a simple prompts mode (Typer).
You’ll be asked for the root directory, token budget, and output format; the result prints
to the terminal. You can always use the full CLI via `sourcemap map ...`.
## How it Works
RepoMap generates a ranked map of your codebase by:
1. **Parsing code files** using tree-sitter to identify symbols and references
2. **Building a graph** of symbol definitions and references
3. **Ranking code sections** using PageRank based on reference patterns
4. **Filtering results** to fit within token limits while preserving important context
5. **Formatting output** as a concise, readable map
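
The graph-building and ranking steps above can be sketched in a few lines. This is a toy illustration, not RepoMap's actual implementation: the file names and edges are invented, the graph is file-level rather than symbol-level, and PageRank is computed with a plain power iteration instead of a graph library.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Rank nodes of a directed graph given (referencing_file, defining_file) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all files.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # Files that reference nothing spread their rank evenly over all files.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical reference graph: each edge points from the file that uses a
# symbol to the file that defines it.
edges = [
    ("src/main.py", "src/config.py"),
    ("src/main.py", "src/utils.py"),
    ("src/cli.py", "src/config.py"),
    ("src/utils.py", "src/config.py"),
]
ranks = pagerank(edges)
for path, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{score:.3f}  {path}")
```

In this example `src/config.py` ranks highest because every other file references it, which matches the "many incoming references" priority described in this section.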
The tool prioritizes:
- Files with many incoming references
- Important symbols (classes, functions) that are frequently used
- Key configuration and documentation files
- Recently modified files when using cache
### Why some files might not appear
By default, RepoMap shows only the most "important" files, based on:
- **Token limit** (default 8192) - Only includes files that fit within this limit
- **PageRank score** - Files with more references from other files rank higher
- **File types** - Only source code files are analyzed (configurable extensions)
To see all files:
- Use `--all-files` to ignore token limits and include everything
- Use `--list-files` to see what files are being found
- Increase `--tokens` to include more files in the analysis
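
To see why a file can drop out of the map, consider a simplified budget filter. This is a sketch under assumptions: the token costs are invented, `select_within_budget` is a hypothetical name, and the tool's real selection logic may differ from this greedy pass.

```python
def select_within_budget(ranked_files, token_costs, max_tokens=8192):
    """Walk files from highest to lowest rank, keeping each one that still fits."""
    selected = []
    used = 0
    for path in ranked_files:
        cost = token_costs[path]
        if used + cost <= max_tokens:
            selected.append(path)
            used += cost
    return selected, used


# Files ordered by rank (best first) with made-up token costs.
ranked = ["src/config.py", "src/utils.py", "src/main.py", "src/cli.py"]
costs = {
    "src/config.py": 900,
    "src/utils.py": 700,
    "src/main.py": 600,
    "src/cli.py": 300,
}

kept, used = select_within_budget(ranked, costs, max_tokens=2048)
print(kept, used)
```

With a 2048-token budget, `src/main.py` is skipped because it would push the total to 2200 tokens, while the smaller `src/cli.py` still fits. Raising the budget, as `--tokens` does, lets every file through.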
## Supported Languages
RepoMap supports all languages that have tree-sitter parsers and tag queries, including:
- Python
- JavaScript/TypeScript
- Java
- C/C++
- Go
- Rust
- Ruby
- And many more...
Run `sourcemap --supported-languages` to see the full list.
## Output Formats
### Text Format (default)
The text output is a human-readable map showing:
- File paths
- Important symbols and their locations
- Contextual code snippets
- `⋮` symbols indicating condensed sections
Example:
```
src/main.py:
│class Application:
│    def __init__(self):
│        self.config = Config()
│
│    def run(self):
│        ...

src/config.py:
│class Config:
│    def load(self):
│        ...
```
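
If you need to post-process the text map, it can be split back into per-file sections. The sketch below assumes the layout shown in the example above (a header line ending in `:` followed by lines prefixed with `│` or `⋮`); `split_map_by_file` is a hypothetical helper, and real output may contain cases it does not handle.

```python
def split_map_by_file(map_text):
    """Group the lines of a text map under the file header that precedes them."""
    sections = {}
    current = None
    for line in map_text.splitlines():
        if line.startswith(("│", "⋮")):
            # Content line: drop the one-character margin marker, keep indentation.
            if current is not None:
                sections[current].append(line[1:])
        elif line.endswith(":"):
            # Header line such as "src/main.py:".
            current = line[:-1]
            sections[current] = []
    return sections


sample = """src/main.py:
│class Application:
│    def run(self):
│        ...

src/config.py:
│class Config:
"""

sections = split_map_by_file(sample)
for path, lines in sections.items():
    print(path, len(lines))
```

For anything beyond quick inspection, the JSON format is likely a more robust input than parsing the text layout.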
### JSON Format
The JSON output provides structured data about the codebase:
```json
{
  "files": {
    "src/main.py": {
      "symbols": [
        {
          "name": "Application",
          "kind": "def",
          "line": 5
        },
        {
          "name": "__init__",
          "kind": "def",
          "line": 7
        }
      ]
    }
  },
  "summary": {
    "total_files": 10,
    "tokens": 1024,
    "root": "/path/to/project"
  }
}
```
This format is useful for:
- Integration with other tools
- Generating documentation
- Code analysis pipelines
- Custom visualizations
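
As an example of such an integration, the snippet below builds a small symbol index from JSON output. It is a sketch that assumes the schema shown above (`files` → path → `symbols` with `name` and `line`); `symbols_by_file` is a hypothetical helper, and the embedded sample stands in for real tool output.

```python
import json

# Sample data mirroring the documented JSON structure.
sample = """
{
  "files": {
    "src/main.py": {
      "symbols": [
        {"name": "Application", "kind": "def", "line": 5},
        {"name": "__init__", "kind": "def", "line": 7}
      ]
    }
  },
  "summary": {"total_files": 10, "tokens": 1024, "root": "/path/to/project"}
}
"""


def symbols_by_file(map_data):
    """Return {path: [(symbol_name, line), ...]} from a parsed JSON map."""
    return {
        path: [(symbol["name"], symbol["line"]) for symbol in info.get("symbols", [])]
        for path, info in map_data.get("files", {}).items()
    }


data = json.loads(sample)
index = symbols_by_file(data)
for path, symbols in index.items():
    for name, line in symbols:
        print(f"{path}:{line}  {name}")
```

The same function should work on a map saved with `--format json --output sourcemap.json` after loading it with `json.load`, provided the output follows the structure documented here.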
## Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Acknowledgments
This tool is based on the repository map functionality from [aider](https://github.com/paul-gauthier/aider), an AI pair programming tool.
Raw data
{
"_id": null,
"home_page": null,
"name": "sourcemap-cli",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "analysis, ast, code, map, repository, tree-sitter",
"author": "Aditya Sharma",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/2a/72/b7fe11111f4db621f0239302c1765beb753191a9035af233c19eb6ce2254/sourcemap_cli-0.4.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A tool for generating intelligent repository maps showing code structure and relationships",
"version": "0.4.0",
"project_urls": {
"Documentation": "https://github.com/BumpyClock/repomap#readme",
"Homepage": "https://github.com/BumpyClock/repomap",
"Issues": "https://github.com/BumpyClock/repomap/issues",
"Repository": "https://github.com/BumpyClock/repomap.git"
},
"split_keywords": [
"analysis",
" ast",
" code",
" map",
" repository",
" tree-sitter"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "ac6356029d2a71fdb0e742338068fa428e23a76a59b02ef76aa01a81f65d1259",
"md5": "6f0141b94e4f31b63331743f7e94a46f",
"sha256": "0e9806a289d54299f9c709e392d17173cc0e1a691e3a7b389b012c5cf46dbc14"
},
"downloads": -1,
"filename": "sourcemap_cli-0.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6f0141b94e4f31b63331743f7e94a46f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 63561,
"upload_time": "2025-08-08T13:18:03",
"upload_time_iso_8601": "2025-08-08T13:18:03.769484Z",
"url": "https://files.pythonhosted.org/packages/ac/63/56029d2a71fdb0e742338068fa428e23a76a59b02ef76aa01a81f65d1259/sourcemap_cli-0.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "2a72b7fe11111f4db621f0239302c1765beb753191a9035af233c19eb6ce2254",
"md5": "03081399dee7241ccd8beb987ed61081",
"sha256": "635ad4c0e3b2614c85a7c351bccc0c23ab13b8152c965314ad86c0245d2bf03f"
},
"downloads": -1,
"filename": "sourcemap_cli-0.4.0.tar.gz",
"has_sig": false,
"md5_digest": "03081399dee7241ccd8beb987ed61081",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 43437,
"upload_time": "2025-08-08T13:18:04",
"upload_time_iso_8601": "2025-08-08T13:18:04.956443Z",
"url": "https://files.pythonhosted.org/packages/2a/72/b7fe11111f4db621f0239302c1765beb753191a9035af233c19eb6ce2254/sourcemap_cli-0.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-08 13:18:04",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "BumpyClock",
"github_project": "repomap#readme",
"github_not_found": true,
"lcname": "sourcemap-cli"
}