# CodeCollector
CodeCollector is a CLI tool that aggregates and organizes code files from complex projects, so that they can be supplied as context to Large Language Models (LLMs) in development workflows.
<div align="center">
<img src="./files/codecollector.gif" alt="CodeCollector in action" width="1200"/>
</div>
## Purpose
When working on large, complex projects, it can be challenging to provide comprehensive context to LLMs about your codebase. Manually copying and pasting relevant files from various parts of your project is time-consuming and error-prone. CodeCollector solves this problem by allowing you to easily select and aggregate the most relevant code files, creating a consolidated view of your project that can be readily shared with an LLM for more accurate and context-aware assistance.
## Features
- Aggregate code files from specified directories
- Interactive mode for selecting specific files and directories
- Customizable file type filtering
- Recursive directory traversal
- Ignore patterns support (similar to .gitignore)
- Configuration file support
- Optimized for providing context to LLMs
## Installation
You can install CodeCollector using pip:
```
pip install codecollector
```
## Usage
### Basic Usage
To run CodeCollector in its default mode:
```
codecollector
```
This will start the interactive mode in the current directory, allowing you to select the files you want to include in your LLM context.
### Command-line Options
```
codecollector [OPTIONS]
```
Options:
- `-d, --directory TEXT`: Base directory to start searching from (default: current directory)
- `-o, --output TEXT`: Output file name (default: aggregated_output.txt)
- `-r, --recursive / --no-recursive`: Enable/disable recursive search (default: recursive)
- `-t, --file-types TEXT`: File types to include (can be used multiple times, default: .py)
- `-i, --interactive`: Launch interactive mode (default: off when other options are passed, on when run without arguments)
- `--version`: Show the version and exit
- `--help`: Show this message and exit
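To make the options above concrete, here is a minimal Python sketch of what an aggregation pass with these settings might look like. It is not CodeCollector's actual implementation: the function names `collect_files` and `aggregate` and the `=====` header format are assumptions made for illustration. Only the defaults (`.py`, `aggregated_output.txt`, recursive search) come from the option list.

```python
from pathlib import Path
from typing import Iterable, List


def collect_files(directory: str, file_types: Iterable[str] = (".py",),
                  recursive: bool = True) -> List[Path]:
    """Return files under `directory` whose suffix is in `file_types`, sorted."""
    base = Path(directory)
    suffixes = set(file_types)
    # rglob descends into subdirectories; glob only looks at the top level.
    candidates = base.rglob("*") if recursive else base.glob("*")
    return sorted(p for p in candidates if p.is_file() and p.suffix in suffixes)


def aggregate(directory: str = ".", output: str = "aggregated_output.txt",
              file_types: Iterable[str] = (".py",), recursive: bool = True) -> int:
    """Write every matching file into `output`, each preceded by a header line.

    Returns the number of files written. The header format is hypothetical.
    """
    base = Path(directory)
    out_path = Path(output).resolve()
    # Skip the output file itself in case it matches one of the file types.
    files = [p for p in collect_files(directory, file_types, recursive)
             if p.resolve() != out_path]
    with open(output, "w", encoding="utf-8") as out:
        for path in files:
            out.write(f"===== {path.relative_to(base).as_posix()} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n\n")
    return len(files)
```

Calling `aggregate("/path/to/project", "llm_context.txt", [".js", ".ts"])` in this sketch would roughly mirror `-d /path/to/project -o llm_context.txt -t .js -t .ts`.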
### Examples
1. Interactive mode starting from the current directory:
```
codecollector -i
```
You can then navigate the project tree and select the files whose content you want to include.
![Interactive mode of codecollector](/files/codecollector.png)
2. Collect Python files recursively from the current directory for LLM context:
```
codecollector
```
3. Collect JavaScript and TypeScript files from a specific project for LLM analysis:
```
codecollector -d /path/to/project -t .js -t .ts
```
4. Non-recursive collection of Ruby files with a custom output name for focused LLM input:
```
codecollector --no-recursive -t .rb -o ruby_context.txt
```
5. Interactive mode starting from a specific directory to selectively choose files for LLM context:
```
codecollector -i -d /path/to/project
```
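If you want to drive these commands from a script (for example, to regenerate the context file before each LLM session), the following Python sketch builds the argument list from the documented flags only. The helper names `build_command` and `run_codecollector` are hypothetical and not part of CodeCollector.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(directory: Optional[str] = None,
                  file_types: Sequence[str] = (),
                  output: Optional[str] = None,
                  recursive: bool = True) -> List[str]:
    """Build a `codecollector` argument list using only the documented flags."""
    cmd = ["codecollector"]
    if directory is not None:
        cmd += ["-d", directory]
    for ext in file_types:
        cmd += ["-t", ext]  # -t may be repeated, once per file type
    if output is not None:
        cmd += ["-o", output]
    if not recursive:
        cmd.append("--no-recursive")
    return cmd


def run_codecollector(**options) -> None:
    """Run the CLI; requires `codecollector` to be installed and on PATH."""
    subprocess.run(build_command(**options), check=True)
```

For instance, `build_command("/path/to/project", [".js", ".ts"])` produces the same arguments as example 3, and `build_command(file_types=[".rb"], output="ruby_context.txt", recursive=False)` corresponds to example 4.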
### Interactive Mode
In interactive mode, use the following keys to select the most relevant files for your LLM context:
- ↑/k: Move cursor up
- ↓/j: Move cursor down
- Space: Expand/Collapse directory
- Enter: Select/Deselect file or directory
- f: Finish selection and process files
- q: Quit without processing
### Configuration File
You can create a `codecollector.yaml` file in your project root to set default options:
```yaml
directory: /path/to/project
output: llm_context.txt
recursive: true
file_types:
- .py
- .js
- .ts
interactive: true
```
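The README does not say how values from `codecollector.yaml` interact with command-line flags. The Python sketch below shows one common convention, assumed here purely for illustration: built-in defaults are overridden by the config file, which is in turn overridden by flags given explicitly. The name `resolve_options` is hypothetical, and the config is passed in as an already-parsed dictionary (for example, the result of `yaml.safe_load` from PyYAML).

```python
from typing import Any, Dict, Optional

# Built-in defaults, taken from the option list in this README.
DEFAULTS: Dict[str, Any] = {
    "directory": ".",
    "output": "aggregated_output.txt",
    "recursive": True,
    "file_types": [".py"],
    "interactive": False,
}


def resolve_options(config: Dict[str, Any],
                    cli: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Merge defaults < config file < CLI flags (this precedence is an assumption)."""
    unknown = set(config) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    merged = {**DEFAULTS, **config}
    # A value of None means "flag not given on the command line".
    merged.update({key: value for key, value in cli.items() if value is not None})
    return merged
```

With the example config above, `resolve_options(config, {"output": "other.txt"})` would keep the three configured file types but write to `other.txt`.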
### Ignore Patterns
Create a `.ccignore` file in your project root to specify ignore patterns:
```
**/.git/**
**/__pycache__/**
**/*.egg-info/**
**/.pytest_cache/**
**/.vscode/**
**/.idea/**
```
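As a rough illustration of how patterns like these select paths, the Python sketch below matches relative paths against them with the standard library's `fnmatch`. This is an approximation, not CodeCollector's actual matching logic: the function names are hypothetical, `fnmatch` treats `*` as matching across `/`, and support for `#` comment lines is an assumption borrowed from `.gitignore`.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def load_patterns(text: str) -> List[str]:
    """Parse ignore-file text: one pattern per line, skipping blanks and '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Return True if a project-relative file path matches any ignore pattern."""
    # Prefix with "/" so "**/.git/**" also matches a top-level ".git" directory.
    candidate = "/" + rel_path.replace("\\", "/")
    return any(fnmatchcase(candidate, pattern) for pattern in patterns)
```

In this sketch the patterns match files inside an ignored directory (such as `.git/config`), not the bare directory name itself, so a directory walker would apply the check per file.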
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Thanks to all contributors who have helped shape CodeCollector
- Inspired by the need for better context provision to LLMs in complex development projects
Raw data
{
"_id": null,
"home_page": "https://github.com/DLOVRIC2/code_collector",
"name": "codecollector",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "code aggregator organizer cli development",
"author": "Domagoj Lovric",
"author_email": "dominic@algorise.co.uk",
"download_url": "https://files.pythonhosted.org/packages/3c/db/46864988fe668d8b71aa2bdc53a7df21889aa154244a9b6b1a87aebd5fc6/codecollector-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A CLI tool to aggregate and organize code files from complex projects",
"version": "0.1.0",
"project_urls": {
"Bug Tracker": "https://github.com/DLOVRIC2/code_collector/issues",
"Homepage": "https://github.com/DLOVRIC2/code_collector"
},
"split_keywords": [
"code",
"aggregator",
"organizer",
"cli",
"development"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "92422cdea33e1f40dbb634e42306f40bfcbe8fbd2acb47c3393f943d01380fa4",
"md5": "61db28f9f074dfed068282ae7d0b747e",
"sha256": "8002688971ebda904e3b527d1ad6b289ddf010c606604e88e1453be99d58db14"
},
"downloads": -1,
"filename": "codecollector-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "61db28f9f074dfed068282ae7d0b747e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 8252,
"upload_time": "2024-08-25T16:56:02",
"upload_time_iso_8601": "2024-08-25T16:56:02.136691Z",
"url": "https://files.pythonhosted.org/packages/92/42/2cdea33e1f40dbb634e42306f40bfcbe8fbd2acb47c3393f943d01380fa4/codecollector-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3cdb46864988fe668d8b71aa2bdc53a7df21889aa154244a9b6b1a87aebd5fc6",
"md5": "499b055efbb35b826649b13440358dda",
"sha256": "599bf77da5b859d594f68b75b8820597c8269c7013f494b66ae326d25116c413"
},
"downloads": -1,
"filename": "codecollector-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "499b055efbb35b826649b13440358dda",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 8923,
"upload_time": "2024-08-25T16:56:03",
"upload_time_iso_8601": "2024-08-25T16:56:03.907851Z",
"url": "https://files.pythonhosted.org/packages/3c/db/46864988fe668d8b71aa2bdc53a7df21889aa154244a9b6b1a87aebd5fc6/codecollector-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-25 16:56:03",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DLOVRIC2",
"github_project": "code_collector",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "codecollector"
}