ctxl


- Name: ctxl
- Version: 0.2.1
- Home page: None
- Summary: A CLI tool to dump projects into an LLM ideal format.
- Upload time: 2024-07-06 21:45:39
- Maintainer: None
- Docs URL: None
- Author: None
- Requires Python: >=3.6
- License: Apache-2.0
- Keywords: cli, folder, structure
- Requirements: No requirements were recorded.
- Travis-CI: No Travis.
- Coveralls test coverage: No coveralls.

# ctxl: Contextual

[![PyPI version](https://badge.fury.io/py/ctxl.svg)](https://badge.fury.io/py/ctxl)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

ctxl is a CLI tool designed to transform project directories into a structured XML format suitable for language models and AI analysis. It intelligently extracts file contents and directory structures while respecting gitignore rules and custom filters. A key feature of ctxl is its ability to automatically detect project types (such as Python, JavaScript, or web projects) based on the files present in the directory. This auto-detection enables ctxl to make smart decisions about which files to include or exclude, ensuring that the output is relevant and concise. Users can also override this auto-detection with custom presets if needed.

The tool creates a comprehensive project snapshot that can be easily parsed by LLMs, complete with a customizable task specification. This task specification acts as a prompt, priming the LLM to provide more targeted and relevant assistance with your project.

ctxl was developed through a bootstrapping process, where each version was used to generate context for an LLM in developing the next version.

## Table of Contents
- [Why ctxl?](#why-ctxl)
- [Installation](#installation)
- [Quick Start](#quick-start)
  - [Workflow](#workflow)
- [How It Works](#how-it-works)
- [Usage](#usage)
  - [Basic Usage](#basic-usage)
  - [Command-line Options](#command-line-options)
  - [Presets](#presets)
- [Features](#features)
- [Output Example](#output-example)
- [Project Structure](#project-structure)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)

## Why ctxl?

ctxl streamlines the process of providing project context to LLMs. Instead of manually copying and pasting file contents or explaining your project structure, ctxl automatically generates a comprehensive, structured representation of your project. This allows LLMs to have a more complete understanding of your codebase, leading to more accurate and context-aware assistance.

## Installation

To install ctxl, you can use pip:

```bash
pip install ctxl
```

## Quick Start

After installation, you can quickly generate an XML representation of your project:

```bash
ctxl /path/to/your/project > project_context.xml
```

This command will create an XML file which you can then provide to an LLM for analysis or assistance. The XML output includes file contents, directory structure, and a default task description that guides the LLM in analyzing your project.

### Workflow

This is how I've been using it, essentially as an iterative process.

1. Paste the generated XML directly into your LLM's chat interface (or send it via API/CLI) and let it respond first. I've found the latest Claude models (Claude 3.5 Sonnet as of writing this) to work best.
2. The LLM will respond with a thorough breakdown and summary of the project first, which helps to prime the LLM with a better contextual understanding of the project, frameworks/libraries used, and overall user/data flow.
3. Chat with it as normal after. You can ask for things like:
    >I'd like to update the frontend to show live progress when the backend is processing documents.

4. The LLM will use the context of your entire project to suggest refactors/updates to all the relevant files involved to fulfill that ask.
5. Update those files/sections and check whether the change works and whether you like it. If not, give feedback or error messages back to the model and keep iterating.

Future improvements to ctxl will likely automate steps 4 and 5 of this process.

## How It Works

ctxl operates in several steps:
1. It scans the specified directory to detect the project type(s).
2. Based on the detected type(s) or user-specified presets, it determines which files to include or exclude.
3. It reads the contents of included files and constructs a directory structure.
4. All this information is then formatted into an XML structure, along with the specified task.
5. The resulting XML is output to stdout or a specified file.
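
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not ctxl's actual implementation: the function names (`collect_files`, `add_directory`, `build_xml`) are hypothetical, project-type detection (step 1) is replaced by a suffix set passed in by the caller, and .gitignore handling is left out. The element names follow the output shown in the Output Example section.

```python
import sys
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_files(root, suffixes, ignored_dirs):
    """Step 2: keep files with a wanted suffix that are not under an ignored directory."""
    files = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and path.suffix in suffixes and not ignored_dirs & set(rel.parts):
            files.append(rel)
    return files


def add_directory(parent, rel_dir, files):
    """Step 3: add a <directory> node listing the included files beneath rel_dir."""
    node = ET.SubElement(parent, "directory", path=rel_dir.as_posix())
    for rel in files:
        if rel.parent == rel_dir:
            ET.SubElement(node, "file", path=rel.as_posix())
    # Immediate subdirectories that contain at least one included file.
    subdirs = sorted(
        {p for rel in files for p in rel.parents if p.parent == rel_dir and p != rel_dir}
    )
    for sub in subdirs:
        add_directory(node, sub, files)


def build_xml(root, files, task):
    """Step 4: combine file contents, directory structure, and the task."""
    doc = ET.Element("root")
    context = ET.SubElement(doc, "project_context")
    for rel in files:
        file_node = ET.SubElement(context, "file", path=rel.as_posix())
        content = ET.SubElement(file_node, "content")
        content.text = (root / rel).read_text(encoding="utf-8", errors="replace")
    structure = ET.SubElement(context, "directory_structure")
    add_directory(structure, Path("."), files)
    ET.SubElement(doc, "task").text = task
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    # Step 5: write the result to stdout for a tiny throwaway project.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src").mkdir()
        (root / "src" / "app.py").write_text("print('hello')\n", encoding="utf-8")
        files = collect_files(root, {".py"}, {"__pycache__"})
        sys.stdout.write(build_xml(root, files, "Describe this project.") + "\n")
```

Using `xml.etree.ElementTree` to build the document means special characters in file contents are escaped automatically; whether ctxl itself does this is not stated here.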

## Usage

### Basic Usage

To use ctxl, simply run the following command in your terminal:

```bash
ctxl /path/to/your/project
```

By default, this will output the XML representation of your project to stdout. This allows for piping the output into other CLI tools, for example with [LLM](https://github.com/simonw/llm):

```bash
ctxl /path/to/your/project | llm
```

To output to a file:

```bash
ctxl /path/to/your/project > context.xml
```

Or use the `-o` option:

```bash
ctxl /path/to/your/project -o context.xml
```

### Command-line Options

ctxl offers several command-line options to customize its behavior:

- `-o, --output`: Specify the output file path (default: stdout)
- `--presets`: Choose preset project types to combine (default: auto-detect)
- `--filter`: Filter patterns to include or exclude (!) files. Example: `'*.py !__pycache__'`
- `--include-dotfiles`: Include dotfiles and folders in the output
- `--gitignore`: Specify a custom .gitignore file path
- `--task`: Include a custom task description in the output
- `--no-auto-detect`: Disable auto-detection of project types
- `--view-presets`: Display all available presets (both built-in and custom)
- `--save-presets`: Save the built-in presets to a YAML file for easy customization
- `-v, --verbose`: Enable verbose logging for more detailed output

Example:

Use existing presets with additional filters to include `.log` and `.txt` files and exclude a `temp` directory:

```bash
ctxl /path/to/your/project --presets python javascript --output project_context.xml --task "Analyze this project for potential security vulnerabilities" --filter '*.log' '*.txt' '!temp'
```

Skip presets entirely and fully control what to include and exclude:

```bash
ctxl /path/to/your/project --no-auto-detect --output project_context.xml --task "Analyze this project for potential security vulnerabilities" --filter '*.py' '*.js' '*.md' '!node_modules'
```
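
To make the include/exclude semantics of `--filter` concrete, here is a small Python sketch of one plausible interpretation: patterns starting with `!` exclude, everything else includes, and exclusions take priority. The helper name `is_selected` and the exact matching rules (shell-style globs applied to the full path and to each path component) are assumptions for illustration; ctxl's real matcher may behave differently.

```python
from fnmatch import fnmatchcase


def is_selected(path, patterns):
    """Return True if a POSIX-style relative path passes the filter patterns."""
    include = [p for p in patterns if not p.startswith("!")]
    exclude = [p[1:] for p in patterns if p.startswith("!")]
    parts = path.split("/")

    # Exclusions win: a match on the whole path or on any component rejects the file.
    for pattern in exclude:
        if fnmatchcase(path, pattern) or any(fnmatchcase(part, pattern) for part in parts):
            return False

    # Otherwise the file name (or the whole path) must match an include pattern.
    name = parts[-1]
    return any(fnmatchcase(name, p) or fnmatchcase(path, p) for p in include)


if __name__ == "__main__":
    patterns = ["*.py", "*.js", "*.md", "!node_modules"]
    for candidate in ["src/app.py", "node_modules/lib/index.js", "README.md", "logo.png"]:
        print(candidate, is_selected(candidate, patterns))
```

Quoting each pattern on the command line keeps the shell from expanding `*` or treating `!` as history expansion before ctxl sees the arguments.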

To view all available presets:

```bash
ctxl --view-presets
```

Save the built-in presets to a YAML file, which you can then modify or extend with your own presets. If this file exists in the directory you're running ctxl on, its presets are loaded automatically and used instead of the defaults:

```bash
ctxl --save-presets
```

To enable verbose logging:

```bash
ctxl /path/to/your/project -v
```

### Presets

ctxl includes presets for common project types:

- python: Includes .py, .pyi, .pyx, .ipynb files, ignores common Python-specific directories and files
- javascript: Includes .js, .jsx, .mjs, .cjs files, ignores node_modules and other JS-specific files
- typescript: Includes .ts, .tsx files, similar ignores to JavaScript
- web: Includes .html, .css, .scss, .sass, .less, .vue files
- java: Includes .java files, ignores common Java build directories
- csharp: Includes .cs, .csx, .csproj files, ignores common C# build artifacts
- go: Includes .go files, ignores vendor directory
- ruby: Includes .rb, .rake, .gemspec files, ignores bundle-related directories
- php: Includes .php files, ignores vendor directory
- rust: Includes .rs files, ignores target directory and Cargo.lock
- swift: Includes .swift files, ignores .build and Packages directories
- kotlin: Includes .kt, .kts files, ignores common Kotlin/Java build directories
- scala: Includes .scala, .sc files, ignores common Scala build directories
- docker: Includes Dockerfile, .dockerignore, and docker-compose files
- misc: Includes common configuration and documentation file types

The tool automatically detects project types, but you can also specify them manually using the `--presets` option.
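
As a rough illustration of how suffix-based detection and preset combination could work, consider the Python sketch below. The `PRESETS` table is a hypothetical, trimmed-down stand-in: the suffixes come from the list above, but the data layout, the ignore entries, and the function names `detect_presets` and `combine_presets` are assumptions rather than ctxl's real internals.

```python
from pathlib import PurePosixPath

# Hypothetical preset table; only three presets, with simplified ignore lists.
PRESETS = {
    "python": {"suffixes": {".py", ".pyi", ".pyx", ".ipynb"}, "ignore": {"__pycache__"}},
    "javascript": {"suffixes": {".js", ".jsx", ".mjs", ".cjs"}, "ignore": {"node_modules"}},
    "web": {"suffixes": {".html", ".css", ".scss", ".sass", ".less", ".vue"}, "ignore": set()},
}


def detect_presets(paths):
    """Return the presets whose suffixes appear among the given relative paths."""
    seen = {PurePosixPath(p).suffix for p in paths}
    return [name for name, preset in PRESETS.items() if seen & preset["suffixes"]]


def combine_presets(names):
    """Merge several presets into one set of suffixes and one set of ignored names."""
    suffixes, ignore = set(), set()
    for name in names:
        suffixes |= PRESETS[name]["suffixes"]
        ignore |= PRESETS[name]["ignore"]
    return {"suffixes": suffixes, "ignore": ignore}


if __name__ == "__main__":
    detected = detect_presets(["app/main.py", "static/site.css", "README.md"])
    print(detected)
    print(combine_presets(detected))
```

Passing `--presets python javascript` would correspond to calling `combine_presets` with an explicit list instead of the detected one.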

## Features

- Auto-detects project types and respects .gitignore rules to determine what files to include/exclude
- Fully customizable file inclusion/exclusion via `--filter` if you need more control
- Generates AI-ready XML output with custom task descriptions
- Simple CLI for easy integration into development workflows
- Efficiently handles large, polyglot projects
- Supports a wide range of programming languages and frameworks
- Customizable presets with ability to view and save presets
- Verbose logging option for detailed process information

## Output Example

Here's an example of what the XML output might look like when run on the ctxl project itself.

To generate this output, you would run:

```bash
ctxl /path/to/ctxl/repository --output ctxl_context.xml
```

The resulting `ctxl_context.xml` would look something like this:

```xml
<root>
  <project_context>
    <file path="src/ctxl/ctxl.py">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <file path="src/ctxl/__init__.py">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <file path="pyproject.toml">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <directory_structure>
      <directory path=".">
        <file path="README.md" />
        <file path="pyproject.toml" />
        <directory path="src">
          <directory path="src/ctxl">
            <file path="src/ctxl/ctxl.py" />
            <file path="src/ctxl/__init__.py" />
          </directory>
        </directory>
      </directory>
    </directory_structure>
  </project_context>
  <task>Describe this project in detail. Pay special attention to the structure of the code, the design of the project, any frameworks/UI frameworks used, and the overall structure/workflow. If artifacts are available, then use workflow and sequence diagrams to help describe the project.</task>
</root>
```

The XML output provides a comprehensive view of the ctxl project, including file contents, structure, and a task description. This format allows LLMs to easily parse and understand the project context, enabling them to provide more accurate and relevant assistance.
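
If you want to inspect the output programmatically before handing it to an LLM, the structure shown above can be read with Python's standard library. The sketch below assumes the output is well-formed XML with the element names from the example; `summarize` and the inline `SAMPLE` document are illustrative only.

```python
import xml.etree.ElementTree as ET

# A shortened stand-in for real ctxl output, following the structure shown above.
SAMPLE = """<root>
  <project_context>
    <file path="src/ctxl/ctxl.py"><content>print("hi")</content></file>
    <file path="pyproject.toml"><content>[project]</content></file>
    <directory_structure>
      <directory path=".">
        <file path="pyproject.toml" />
      </directory>
    </directory_structure>
  </project_context>
  <task>Describe this project in detail.</task>
</root>"""


def summarize(xml_text):
    """Return ({path: content}, task) for a ctxl-style XML document."""
    root = ET.fromstring(xml_text)
    # Only direct <file> children of <project_context> carry contents;
    # the <file> entries under <directory_structure> are nested deeper.
    files = {
        node.get("path"): node.findtext("content") or ""
        for node in root.findall("project_context/file")
    }
    return files, root.findtext("task")


if __name__ == "__main__":
    files, task = summarize(SAMPLE)
    for path in files:
        print(path)
    print("Task:", task)
```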

## Project Structure

The ctxl project has the following structure:

```
ctxl/
├── src/
│   └── ctxl/
│       ├── __init__.py
│       ├── ctxl.py
│       └── preset_manager.py
├── README.md
└── pyproject.toml
```

The main functionality is implemented in `src/ctxl/ctxl.py`, with preset management handled in `src/ctxl/preset_manager.py`.

## Troubleshooting

- **Issue**: ctxl is not detecting my project type correctly.
  **Solution**: Use the `--presets` option to manually specify the project type(s).

- **Issue**: ctxl is including/excluding files I don't want.
  **Solution**: Use the `--filter` option. For full control, combine `--filter` with `--no-auto-detect`.

- **Issue**: The XML output is too large for my LLM to process.
  **Solution**: Try using more specific presets or `--filter` exclusion patterns (prefixed with `!`) to reduce the amount of included content.

- **Issue**: I need more information about what ctxl is doing.
  **Solution**: Use the `-v` or `--verbose` flag to enable verbose logging for more detailed output.
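
When the output is too large, it can help to see which files contribute the most text before deciding what to exclude. The following Python sketch ranks files by content length; it assumes well-formed XML in the format shown in the Output Example section, and `largest_files` and the `SAMPLE` document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for a real ctxl output file.
SAMPLE = """<root>
  <project_context>
    <file path="small.py"><content>x = 1</content></file>
    <file path="big.py"><content>print("a much longer file body")</content></file>
  </project_context>
  <task>Describe this project.</task>
</root>"""


def largest_files(xml_text, top=3):
    """Return up to `top` (character_count, path) pairs, largest first."""
    root = ET.fromstring(xml_text)
    sizes = [
        (len(node.findtext("content") or ""), node.get("path"))
        for node in root.findall("project_context/file")
    ]
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    for size, path in largest_files(SAMPLE):
        print(f"{size:>8}  {path}")
```

Character counts are only a rough proxy for how much of an LLM's context window a file consumes, but they are usually enough to spot the files worth excluding with `--filter`.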

## Contributing

Contributions to ctxl are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "ctxl",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "cli, folder, structure",
    "author": null,
    "author_email": "Binal Patel <binalkp91@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/40/0f/3c54d0f253cd3ad8cd3590878c668834e3ccc8dec226b05419022522459e/ctxl-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "A CLI tool to dump projects into an LLM ideal format.",
    "version": "0.2.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/caesarnine/ctxl/issues",
        "Documentation": "https://github.com/caesarnine/ctxl#readme",
        "Homepage": "https://github.com/caesarnine/ctxl",
        "Repository": "https://github.com/caesarnine/ctxl.git"
    },
    "split_keywords": [
        "cli",
        " folder",
        " structure"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6a5b3ff647ccb70703333440f87eb811d5c8f533506b4e990476ecd3760f708a",
                "md5": "01e6f940ac8c62648a463e2cbcb60046",
                "sha256": "53b029ba1e449309b60a28d94064c3468bc3eb4edfc88cebf63f22619da62cb2"
            },
            "downloads": -1,
            "filename": "ctxl-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "01e6f940ac8c62648a463e2cbcb60046",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 10267,
            "upload_time": "2024-07-06T21:45:37",
            "upload_time_iso_8601": "2024-07-06T21:45:37.867160Z",
            "url": "https://files.pythonhosted.org/packages/6a/5b/3ff647ccb70703333440f87eb811d5c8f533506b4e990476ecd3760f708a/ctxl-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "400f3c54d0f253cd3ad8cd3590878c668834e3ccc8dec226b05419022522459e",
                "md5": "1a58547c7d0b5fac595d801ef9216a8a",
                "sha256": "e341a86ae64390feba2251a9714bc6b3ec1a88d6d5af9873db76e89ac00575f6"
            },
            "downloads": -1,
            "filename": "ctxl-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "1a58547c7d0b5fac595d801ef9216a8a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 16070,
            "upload_time": "2024-07-06T21:45:39",
            "upload_time_iso_8601": "2024-07-06T21:45:39.282879Z",
            "url": "https://files.pythonhosted.org/packages/40/0f/3c54d0f253cd3ad8cd3590878c668834e3ccc8dec226b05419022522459e/ctxl-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-06 21:45:39",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "caesarnine",
    "github_project": "ctxl",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "ctxl"
}
        