ctxl


| Field | Value |
| --- | --- |
| Name | ctxl |
| Version | 0.0.9 |
| Home page | None |
| Summary | A CLI tool to dump projects into an LLM ideal format. |
| Upload time | 2024-07-03 19:15:56 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | >=3.6 |
| License | Apache-2.0 |
| Keywords | cli, folder, structure |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# ctxl: Contextual

[![PyPI version](https://badge.fury.io/py/ctxl.svg)](https://badge.fury.io/py/ctxl)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

ctxl is a CLI tool designed to transform project directories into a structured XML format suitable for language models and AI analysis. It intelligently extracts file contents and directory structures while respecting gitignore rules and custom filters. A key feature of ctxl is its ability to automatically detect project types (such as Python, JavaScript, or web projects) based on the files present in the directory. This auto-detection enables ctxl to make smart decisions about which files to include or exclude, ensuring that the output is relevant and concise. Users can also override this auto-detection with custom presets if needed.

The tool creates a comprehensive project snapshot that can be easily parsed by LLMs, complete with a customizable task specification. This task specification acts as a prompt, priming the LLM to provide more targeted and relevant assistance with your project.

ctxl was developed through a bootstrapping process, where each version was used to generate context for an LLM in developing the next version.

## Table of Contents
- [Why ctxl?](#why-ctxl)
- [Installation](#installation)
- [Quick Start](#quick-start)
  - [Workflow](#workflow)
- [How It Works](#how-it-works)
- [Usage](#usage)
  - [Basic Usage](#basic-usage)
  - [Command-line Options](#command-line-options)
  - [Presets](#presets)
- [Features](#features)
- [Output Example](#output-example)
- [Project Structure](#project-structure)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)

## Why ctxl?

ctxl streamlines the process of providing project context to LLMs. Instead of manually copying and pasting file contents or explaining your project structure, ctxl automatically generates a comprehensive, structured representation of your project. This allows LLMs to have a more complete understanding of your codebase, leading to more accurate and context-aware assistance.

## Installation

To install ctxl, you can use pip:

```bash
pip install ctxl
```

## Quick Start

After installation, you can quickly generate an XML representation of your project:

```bash
ctxl /path/to/your/project > project_context.xml
```

This command will create an XML file which you can then provide to an LLM for analysis or assistance. The XML output includes file contents, directory structure, and a default task description that guides the LLM in analyzing your project.

### Workflow

This is how I've been using it: essentially as an iterative process.

1. Paste the generated XML directly into your LLM's chat interface (or send it via API/CLI) and let it respond first. I've found the latest Claude models (Sonnet 3.5 as of writing this) to work best.
2. The LLM will respond with a thorough breakdown and summary of the project, which primes it with a better contextual understanding of the project, the frameworks/libraries used, and the overall user/data flow.
3. Chat with it as normal afterwards. You can ask for things like:
    >I'd like to update the frontend to show live progress when the backend is processing documents.

4. The LLM will use the context of your entire project to suggest refactors/updates to all the relevant files needed to fulfill that request.
5. Update those files/sections and see whether it works and whether you like it. If not, give feedback or error messages back to the model and keep iterating.

Future improvements to ctxl will likely automate steps 4 and 5 of this process.


## How It Works

ctxl operates in several steps:
1. It scans the specified directory to detect the project type(s).
2. Based on the detected type(s) or user-specified presets, it determines which files to include or exclude.
3. It reads the contents of included files and constructs a directory structure.
4. All this information is then formatted into an XML structure, along with the specified task.
5. The resulting XML is output to stdout or a specified file.
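
The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not ctxl's actual source: the names `TYPE_SUFFIXES`, `detect_types`, `select_files`, and `build_xml` are hypothetical, the detection rule (a type is detected if any file carries one of its suffixes) is an assumption, and the `<directory_structure>` element and gitignore handling are left out for brevity.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical suffix table; ctxl's real detection rules may differ.
TYPE_SUFFIXES = {
    "python": {".py", ".pyi", ".pyx", ".ipynb"},
    "javascript": {".js", ".jsx", ".mjs", ".cjs"},
    "web": {".html", ".css", ".scss", ".sass", ".less"},
}


def detect_types(root: Path) -> set:
    """Step 1: a type counts as detected if any file has one of its suffixes."""
    found = {p.suffix for p in root.rglob("*") if p.is_file()}
    return {name for name, suffixes in TYPE_SUFFIXES.items() if suffixes & found}


def select_files(root: Path, types: set) -> list:
    """Step 2: keep only files whose suffix belongs to a detected type."""
    allowed = set().union(*(TYPE_SUFFIXES[t] for t in types))
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in allowed)


def build_xml(root: Path, files: list, task: str) -> str:
    """Steps 3-4: embed file contents and the task in a single XML document."""
    root_el = ET.Element("root")
    context = ET.SubElement(root_el, "project_context")
    for path in files:
        rel = path.relative_to(root).as_posix()
        file_el = ET.SubElement(context, "file", path=rel)
        ET.SubElement(file_el, "content").text = path.read_text(encoding="utf-8")
    ET.SubElement(root_el, "task").text = task
    return ET.tostring(root_el, encoding="unicode")
```

Step 5 then amounts to printing the returned string or writing it to a file.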

## Usage

### Basic Usage

To use ctxl, simply run the following command in your terminal:

```bash
ctxl /path/to/your/project
```

By default, this will output the XML representation of your project to stdout. This allows for piping the output into other CLI tools, for example with [LLM](https://github.com/simonw/llm):

```bash
ctxl /path/to/your/project | llm
```

To output to a file:

```bash
ctxl /path/to/your/project > context.xml
```

Or use the `-o` option:

```bash
ctxl /path/to/your/project -o context.xml
```
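
If you would rather drive ctxl from a Python script than from the shell, a thin wrapper around `subprocess` is enough. The sketch below assumes `ctxl` is installed and on your `PATH`; the helper names `ctxl_command` and `run_ctxl` are hypothetical, and only the `-o` and `--task` options documented in this README are used.

```python
import subprocess


def ctxl_command(project, output=None, task=None):
    """Build the argument list for a ctxl invocation (no shell involved)."""
    cmd = ["ctxl", str(project)]
    if output is not None:
        cmd += ["-o", str(output)]
    if task is not None:
        cmd += ["--task", task]
    return cmd


def run_ctxl(project, task=None):
    """Run ctxl and return the XML it prints to stdout.

    Raises subprocess.CalledProcessError if ctxl exits with a non-zero status.
    """
    result = subprocess.run(
        ctxl_command(project, task=task),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the task description contains spaces or quotes.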

### Command-line Options

ctxl offers several command-line options to customize its behavior:

- `-o, --output`: Specify the output file path (default: stdout)
- `--presets`: Choose preset project types to combine (default: auto-detect)
- `--suffixes`: Specify allowed file suffixes (overrides presets)
- `--ignore`: Add additional folders/files to ignore
- `--include-dotfiles`: Include dotfiles and folders in the output
- `--gitignore`: Specify a custom .gitignore file path
- `--task`: Include a custom task description in the output
- `--no-auto-detect`: Disable auto-detection of project types

Example:

```bash
ctxl /path/to/your/project --presets python javascript --output project_context.xml --task "Analyze this project for potential security vulnerabilities"
```
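
To make the shape of these options concrete, here is a minimal `argparse` sketch that accepts the same option names. It is not ctxl's actual parser: the program name `ctxl-sketch`, the argument arities (`nargs="+"` for `--presets`, `--suffixes`, and `--ignore`), and the defaults are assumptions inferred from the example command above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented option names."""
    parser = argparse.ArgumentParser(prog="ctxl-sketch")
    parser.add_argument("path", help="Project directory to dump")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (default: stdout)")
    parser.add_argument("--presets", nargs="+", default=None,
                        help="Preset project types to combine (default: auto-detect)")
    parser.add_argument("--suffixes", nargs="+", default=None,
                        help="Allowed file suffixes (overrides presets)")
    parser.add_argument("--ignore", nargs="+", default=[],
                        help="Additional folders/files to ignore")
    parser.add_argument("--include-dotfiles", action="store_true",
                        help="Include dotfiles and folders")
    parser.add_argument("--gitignore", default=None,
                        help="Custom .gitignore file path")
    parser.add_argument("--task", default=None,
                        help="Custom task description")
    parser.add_argument("--no-auto-detect", action="store_true",
                        help="Disable auto-detection of project types")
    return parser
```

With this sketch, `build_parser().parse_args(["/path", "--presets", "python", "javascript"])` yields a namespace whose `presets` attribute is the list `["python", "javascript"]`.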

### Presets

ctxl includes presets for common project types:

- python: Includes .py, .pyi, .pyx, .ipynb files, ignores common Python-specific directories and files
- javascript: Includes .js, .jsx, .mjs, .cjs files, ignores node_modules and other JS-specific files
- typescript: Includes .ts, .tsx files, similar ignores to JavaScript
- web: Includes .html, .css, .scss, .sass, .less files
- misc: Includes common configuration and documentation file types

The tool can automatically detect project types, or you can specify them manually.
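
One way to picture how presets combine is as a union of suffix sets and ignore sets. The sketch below is illustrative only: the suffixes come from the list above, but the ignore entries (`__pycache__`, `.venv`, `node_modules`) and the names `PRESETS`, `combine_presets`, and `is_included` are assumptions, and the `misc` preset is omitted because its file types are not listed here.

```python
from pathlib import PurePosixPath

# Suffixes follow the preset list above; ignore entries are illustrative guesses.
PRESETS = {
    "python": {"suffixes": {".py", ".pyi", ".pyx", ".ipynb"},
               "ignore": {"__pycache__", ".venv"}},
    "javascript": {"suffixes": {".js", ".jsx", ".mjs", ".cjs"},
                   "ignore": {"node_modules"}},
    "typescript": {"suffixes": {".ts", ".tsx"},
                   "ignore": {"node_modules"}},
    "web": {"suffixes": {".html", ".css", ".scss", ".sass", ".less"},
            "ignore": set()},
}


def combine_presets(names):
    """Union the suffixes and ignore names of every requested preset."""
    suffixes, ignore = set(), set()
    for name in names:
        preset = PRESETS[name]  # raises KeyError for an unknown preset name
        suffixes |= preset["suffixes"]
        ignore |= preset["ignore"]
    return suffixes, ignore


def is_included(path: PurePosixPath, suffixes, ignore) -> bool:
    """A file is kept if its suffix is allowed and no path component is ignored."""
    return path.suffix in suffixes and not any(part in ignore for part in path.parts)
```

Combining `python` and `javascript` this way keeps `src/app.py` but drops anything under `node_modules/`.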

## Features

- Extracts project structure and file contents
- Supports multiple programming languages and project types
- Customizable file inclusion/exclusion
- Respects .gitignore rules
- Generates XML output for easy parsing
- Auto-detects project types
- Allows custom task specifications for LLM priming

## Output Example

Here's an example of what the XML output might look like when run on the ctxl project itself.

To generate this output, you would run:

```bash
ctxl /path/to/ctxl/repository --output ctxl_context.xml
```

The resulting `ctxl_context.xml` would look something like this:

```xml
<root>
  <project_context>
    <file path="src/ctxl/ctxl.py">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <file path="src/ctxl/__init__.py">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <file path="pyproject.toml">
      <content>
        ...truncated for example's sake...
      </content>
    </file>
    <directory_structure>
      <directory path=".">
        <file path="README.md" />
        <file path="pyproject.toml" />
        <directory path="src">
          <directory path="src/ctxl">
            <file path="src/ctxl/ctxl.py" />
            <file path="src/ctxl/__init__.py" />
          </directory>
        </directory>
      </directory>
    </directory_structure>
  </project_context>
  <task>Describe this project in detail. Pay special attention to the structure of the code, the design of the project, any frameworks/UI frameworks used, and the overall structure/workflow. If artifacts are available, then use workflow and sequence diagrams to help describe the project.</task>
</root>
```

The XML output provides a comprehensive view of the ctxl project, including file contents, structure, and a task description. This format allows LLMs to easily parse and understand the project context, enabling them to provide more accurate and relevant assistance.
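
Because the output is plain XML, it can also be inspected with standard tooling before you hand it to an LLM. The following sketch uses Python's built-in `xml.etree.ElementTree` to list the embedded files and read the task. It assumes the layout shown in the example above and that the output is well-formed XML; `summarize_context` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def summarize_context(xml_text: str):
    """Return the paths of files with embedded contents, plus the task text."""
    root = ET.fromstring(xml_text)
    context = root.find("project_context")
    # Only direct <file> children carry contents; the <file> entries nested
    # under <directory_structure> merely list paths and are skipped here.
    paths = [file_el.get("path") for file_el in context.findall("file")]
    task = root.findtext("task", default="")
    return paths, task
```

For example, `paths, task = summarize_context(open("ctxl_context.xml", encoding="utf-8").read())` gives a quick check of what was captured.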

## Project Structure

The ctxl project has the following structure:

```
ctxl/
├── src/
│   └── ctxl/
│       ├── __init__.py
│       └── ctxl.py
├── README.md
└── pyproject.toml
```

The main functionality is implemented in `src/ctxl/ctxl.py`.

## Troubleshooting

- **Issue**: ctxl is not detecting my project type correctly.
  **Solution**: Use the `--presets` option to manually specify the project type(s).

- **Issue**: ctxl is including/excluding files I don't want.
  **Solution**: Use the `--suffixes` and `--ignore` options to customize file selection.

- **Issue**: The XML output is too large for my LLM to process.
  **Solution**: Try using more specific presets or custom ignore patterns to reduce the amount of included content.
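
When the output is too large, it helps to know which files contribute most before choosing what to pass to `--ignore`. The sketch below ranks embedded files by character count. It assumes the XML layout from the Output Example section and well-formed output; `largest_files` is a hypothetical helper, and character count is only a rough proxy for LLM token usage.

```python
import xml.etree.ElementTree as ET


def largest_files(xml_text: str, top: int = 5):
    """Return up to `top` (character_count, path) pairs, largest first."""
    root = ET.fromstring(xml_text)
    sizes = [
        (len(file_el.findtext("content", default="")), file_el.get("path", ""))
        for file_el in root.findall("project_context/file")
    ]
    return sorted(sizes, reverse=True)[:top]
```

The paths at the top of the returned list are the first candidates to exclude.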

## Contributing

Contributions to ctxl are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "ctxl",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "cli, folder, structure",
    "author": null,
    "author_email": "Binal Patel <binalkp91@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9c/b8/eadae4c4ff9ffd750b5ed86dccf2cc94d20a59be691ffb362904dd38500d/ctxl-0.0.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "A CLI tool to dump projects into an LLM ideal format.",
    "version": "0.0.9",
    "project_urls": {
        "Bug Tracker": "https://github.com/caesarnine/ctxl/issues",
        "Documentation": "https://github.com/caesarnine/ctxl#readme",
        "Homepage": "https://github.com/caesarnine/ctxl",
        "Repository": "https://github.com/caesarnine/ctxl.git"
    },
    "split_keywords": [
        "cli",
        " folder",
        " structure"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6bf166003229566fd0ad325a853335337aede40a44ed0041c074ceff181da086",
                "md5": "d21ba1a120769342077c2144d30366be",
                "sha256": "c4ed2a3834e8107f25936bc2458b89637568e91370b8c75e7e4e5233f0e58c72"
            },
            "downloads": -1,
            "filename": "ctxl-0.0.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d21ba1a120769342077c2144d30366be",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 7850,
            "upload_time": "2024-07-03T19:15:55",
            "upload_time_iso_8601": "2024-07-03T19:15:55.213799Z",
            "url": "https://files.pythonhosted.org/packages/6b/f1/66003229566fd0ad325a853335337aede40a44ed0041c074ceff181da086/ctxl-0.0.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9cb8eadae4c4ff9ffd750b5ed86dccf2cc94d20a59be691ffb362904dd38500d",
                "md5": "9b6d29ec04aa89520ea5dac35ddbb4d1",
                "sha256": "5bfdfbc835158d29df48a8b9e1f9ee7cd57382beba7c51aa1231d0573bfadac6"
            },
            "downloads": -1,
            "filename": "ctxl-0.0.9.tar.gz",
            "has_sig": false,
            "md5_digest": "9b6d29ec04aa89520ea5dac35ddbb4d1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 13040,
            "upload_time": "2024-07-03T19:15:56",
            "upload_time_iso_8601": "2024-07-03T19:15:56.222274Z",
            "url": "https://files.pythonhosted.org/packages/9c/b8/eadae4c4ff9ffd750b5ed86dccf2cc94d20a59be691ffb362904dd38500d/ctxl-0.0.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-03 19:15:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "caesarnine",
    "github_project": "ctxl",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "ctxl"
}
        