ccontext


Name: ccontext
Version: 0.3.2
Home page: https://github.com/NicolasArnouts/ccontext
Summary: collect-context: Makes the process of collecting and sending context to an LLM like ChatGPT-4o as easy as possible.
Upload time: 2024-08-19 11:15:56
Author: Nicolas Arnouts
Requires Python: <4,>=3.8
License: MIT
Keywords: context, ccontenxt, collect context, llm, chatgpt
Requirements: tiktoken, colorama, pyperclip, pypdf, pathspec, reportlab, poetry, wcmatch

# ccontext

**ccontext (collect-context)** is a cross-platform utility designed to streamline the process of gathering and sending the context of a directory to large language models (LLMs) like ChatGPT-4o. Our mission is to make collecting and sending context to an LLM as easy as possible.

## Features

- 🌟 **Easy Setup**: Quick installation and configuration.
- 🔧 **Configurable Exclusions and Inclusions**: Flexibly specify which files and directories to include or exclude.
- ✂️ **Tokenization and Chunking**: Automatically handles tokenization and chunking to stay within LLM token limits.
- 🌍 **Cross-Platform Support**: Supports Windows, macOS, and Linux.
- 🗣️ **Verbose Output**: Optional verbose mode for detailed output and debugging.
- 📄 **Markdown and PDF Generation**: Generate detailed Markdown and PDF files of the directory structure and file contents.
- 🌐 **Crawling of (documentation) Sites**: Crawl and gather data from multiple sites using a specified list of URLs.
- 📝 **Prompt Templates** (Upcoming): Create and use custom templates for different types of prompts.


## Example Output
```sh
kiko@lappie:~/.USER_SCRIPTS/ccontext$ ccontext
Using user config file: /home/kiko/.ccontext/config.json
Root Path: /home/kiko/.USER_SCRIPTS/ccontext

📁 ccontext
    📁 .github
        📁 workflows
            📄 751 publish-to-pypi.yml
    📄 1664 .gitignore
    📁 .vscode
        📄 13 settings.json
    📄 9 MANIFEST.in
    📄 1392 README.md
    📁 ccontext
        📄 0 Helvetica.ttf
        📄 0 NotoEmoji-VariableFont_wght.ttf
        📄 0 __init__.py
        📄 18 __main__.py
        📄 452 argument_parser.py
        📄 79 cli.py
        📄 699 clipboard.py
        📄 216 config.json
        📄 231 configurator.py
        📄 168 content_handler.py
        📄 209 file_node.py
        📄 779 file_system.py
        📄 655 file_tree.py
        📄 1089 main.py
        📄 800 md_generator.py
        📄 774 output_handler.py
        📄 1536 pdf_generator.py
        📄 975 tokenizer.py
        📄 414 utils.py
    📄 135 ideas.MD
    📄 56 requirements.txt
    📄 87 run_ccontext.sh
    📄 291 setup.py


Tokens: 14,718/32,000

Output copied to clipboard!
```

## Installation

### Using pip

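ccontext is available on PyPI. The command below installs it with pipx, which keeps command-line tools in isolated environments; `pip install ccontext` works as well: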

```sh
pipx install ccontext
```

### From Source

1. Clone the repository:

    ```sh
    git clone https://github.com/NicolasArnouts/ccontext.git
    cd ccontext
    ```

2. Set up a virtual environment:

    ```sh
    python3 -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
    ```

3. Install dependencies:

    ```sh
    pip install -r requirements.txt
    ```

4. Install the package:

    ```sh
    pip install .
    ```

## Usage

### Basic Usage

1. Run `ccontext` in the current folder with default settings defined in `~/.ccontext/config.json`:

    ```sh
    ccontext
    ```

2. Specify a root path, exclusions, and inclusions:

    ```sh
    ccontext -p /path/to/directory -e ".git|node_modules" -i "important_file.txt|docs"
    ```
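
As a rough illustration of how `|`-separated exclude and include values like those above could be applied, the sketch below splits each value and matches every path component with Python's standard `fnmatch` module. The helper names `split_patterns`, `matches_any`, and `is_excluded` are hypothetical, and ccontext's actual matching (its requirements list `pathspec` and `wcmatch`) may behave differently. In particular, letting an include override an exclude is an assumption made for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def split_patterns(arg: str) -> List[str]:
    """Split a '|'-separated CLI value such as '.git|node_modules'."""
    return [part.strip() for part in arg.split("|") if part.strip()]


def matches_any(path: str, patterns: List[str]) -> bool:
    """Return True if any component of `path` matches any glob pattern."""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


def is_excluded(path: str, excludes: List[str], includes: List[str]) -> bool:
    """Assumption for this sketch: an explicit include wins over an exclude."""
    if matches_any(path, includes):
        return False
    return matches_any(path, excludes)


excludes = split_patterns(".git|node_modules|*.lock")
includes = split_patterns("important_file.txt|docs")

for candidate in ["src/main.py", "node_modules/pkg/index.js", "poetry.lock", "docs/poetry.lock"]:
    status = "excluded" if is_excluded(candidate, excludes, includes) else "kept"
    print(f"{candidate} -> {status}")
```

Matching per path component means a pattern such as `node_modules` excludes everything beneath that folder, which is the behavior most users expect from a folder-level exclusion.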

### Command-Line Arguments
- `-h, --help`: Show help message.
- `-p, --root_path`: The root path to start the directory tree (default: current directory).
- `-e, --excludes`: Additional files or directories to exclude, separated by `|`, e.g., `node_modules|.git`.
- `-i, --includes`: Files or directories to include, separated by `|`, e.g., `important_file.txt|docs`.
- `-m, --max_tokens`: Maximum number of tokens allowed before chunking.
- `-c, --config`: Path to a custom configuration file.
- `-v, --verbose`: Enable verbose output to stdout.
- `-ig, --ignore_gitignore`: Ignore the `.gitignore` file for exclusions.
- `-g, --generate-pdf`: Generate a PDF of the directory tree and file contents.
- `-gm, --generate-md`: Generate a Markdown file of the directory tree and file contents.
- `--crawl`: Crawl the sites listed under `urls_to_crawl` in the configuration file.

### Example

```sh
ccontext -p /home/user/project -e ".git|build" -i "README.md|src"
```
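
For readers who want to script around these flags, here is a minimal `argparse` sketch that mirrors the options listed under Command-Line Arguments. It is not the package's own `argument_parser.py`; the defaults, the help strings, and the `build_parser` name are assumptions chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; defaults here are illustrative guesses."""
    parser = argparse.ArgumentParser(prog="ccontext")
    parser.add_argument("-p", "--root_path", default=".", help="root of the directory tree")
    parser.add_argument("-e", "--excludes", default="", help="'|'-separated names to exclude")
    parser.add_argument("-i", "--includes", default="", help="'|'-separated names to include")
    parser.add_argument("-m", "--max_tokens", type=int, default=None, help="token limit before chunking")
    parser.add_argument("-c", "--config", default=None, help="path to a custom configuration file")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-ig", "--ignore_gitignore", action="store_true")
    parser.add_argument("-g", "--generate-pdf", action="store_true")
    parser.add_argument("-gm", "--generate-md", action="store_true")
    parser.add_argument("--crawl", action="store_true")
    return parser


# Parse the same arguments as the shell example above.
args = build_parser().parse_args(
    ["-p", "/home/user/project", "-e", ".git|build", "-i", "README.md|src"]
)
excludes = args.excludes.split("|") if args.excludes else []
includes = args.includes.split("|") if args.includes else []
print(args.root_path, excludes, includes)
```

Quoting the `-e` and `-i` values in the shell matters because an unquoted `|` would be interpreted as a pipe.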

### Configuration

You can customize the behavior of `ccontext` by creating a configuration file. The default configuration file is `config.json` located in the user's home directory under `.ccontext`. You can also provide a custom configuration file via the `-c` argument.
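
The lookup order described above can be sketched as follows: try the path given with `-c` first, then `~/.ccontext/config.json`, then fall back to built-in defaults. The `load_config` function, the `DEFAULTS` subset, and the fallback when a custom path does not exist are all assumptions for this example, not the package's confirmed behavior.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional

# Illustrative subset of defaults; the real defaults ship with the package.
DEFAULTS: Dict[str, Any] = {"verbose": False, "max_tokens": 32000}


def load_config(custom_path: Optional[str] = None, home: Optional[Path] = None) -> Dict[str, Any]:
    """Return the first config found, merged over DEFAULTS."""
    home = home if home is not None else Path.home()
    candidates = [Path(custom_path)] if custom_path else []
    candidates.append(home / ".ccontext" / "config.json")
    for candidate in candidates:
        if candidate.is_file():
            with candidate.open(encoding="utf-8") as handle:
                return {**DEFAULTS, **json.load(handle)}
    return dict(DEFAULTS)


# Demo against a throwaway home directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    fake_home = Path(tmp)
    (fake_home / ".ccontext").mkdir()
    (fake_home / ".ccontext" / "config.json").write_text('{"max_tokens": 8000}', encoding="utf-8")
    config = load_config(home=fake_home)
    print(config)
```

Note that Python's `json.load` is strict and rejects `//` comments, so a file parsed this way must contain plain JSON.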

### Sample `config.json`

```json
{
  "verbose": false, // prints more data on the screen
  "max_tokens": 32000, // max token size of input prompt / maximum size of the chunks
  "model_type": "gpt-4o", // sets tiktoken.encoding_for_model()
  "buffer_size": 0.05, // a buffer for max_tokens that limits how full the chunks can be
  "excluded_folders_files": [
    ".git",
    "bin",
    "build",
    "node_modules",
    "venv",
    "__pycache__",
    "package-lock.json",
    "ccontext.egg-info",
    "dist",
    "__tests__",
    "coverage",
    ".next",
    "pnpm-lock.yaml",
    "poetry.lock",
    "ccontext-output.pdf",
    "ccontext-output.md",
    ".phpstorm.meta.php",
    "*.min.js",
    "composer.lock",
    "*.lock",
    "vendor",
    "laravel_access.log"
  ],
  "included_folders_files": [],
  "context_prompt": "[[SYSTEM INSTRUCTIONS]] The following output represents a detailed directory structure and file contents from a specified root path. The file tree includes both excluded and included files and directories, clearly marking exclusions. Each file's content is displayed with comprehensive headings and separators to enhance readability and facilitate detailed parsing for extracting hierarchical and content-related insights. If the data represents a codebase, interpret and handle it as such, providing appropriate assistance as a programmer AI assistant. [[END SYSTEM INSTRUCTIONS]]",
  "urls_to_crawl": [
    {
      "url": "https://www.django-rest-framework.org/",
      "match": [
        "https://www.django-rest-framework.org/**"
      ],
      "exclude": [
        "https://www.django-rest-framework.org/community/**"
      ],
      "selector": "",
      "maxPagesToCrawl": 100,
      "outputFileName": "django-rest-framework.org.json",
      "maxTokens": 2000000
    }
  ]
}
```
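
One plausible reading of `max_tokens` and `buffer_size` is that the buffer reserves a fraction of the limit as headroom, so with the sample values each chunk could hold at most 32000 × (1 − 0.05) = 30,400 tokens. The sketch below illustrates that interpretation; `effective_limit` and `chunk_tokens` are hypothetical names, and the whitespace split is a stand-in for a real tokenizer such as the `tiktoken` encoding selected by `model_type`.

```python
from typing import List


def effective_limit(max_tokens: int, buffer_size: float) -> int:
    """Assumed interpretation: keep `buffer_size` (a fraction) of max_tokens free."""
    if not 0 <= buffer_size < 1:
        raise ValueError("buffer_size must be in [0, 1)")
    return round(max_tokens * (1 - buffer_size))


def chunk_tokens(tokens: List[str], limit: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [tokens[i:i + limit] for i in range(0, len(tokens), limit)]


limit = effective_limit(32000, 0.05)
print("effective limit:", limit)

# Whitespace splitting stands in for real tokenization in this sketch.
tokens = "one two three four five six seven".split()
chunks = chunk_tokens(tokens, 3)
print([len(chunk) for chunk in chunks])
```

A real implementation would likely split on file boundaries where possible instead of cutting at an arbitrary token index.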

## Use Cases

- **Codebase Context**: Send the entire codebase as context to an LLM in one go, avoiding the need to copy and paste snippets manually.
- **Document Generation**: Generate detailed Markdown and PDF files of your directory structure and file contents for use in retrieval-augmented generation (RAG).
- **Documentation Crawling**: Crawl a documentation site and use its content as context for an LLM.
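
To make the `match` and `exclude` fields of a `urls_to_crawl` entry more concrete, the sketch below decides whether a URL would be visited, using glob matching from Python's standard `fnmatch` module. The `should_crawl` function is hypothetical, and the crawler's real glob semantics may differ: in `fnmatch`, `*` already matches across `/`, so `**` behaves the same as `*` here.

```python
from fnmatch import fnmatchcase
from typing import List


def should_crawl(url: str, match: List[str], exclude: List[str]) -> bool:
    """Visit a URL only if it matches a `match` glob and no `exclude` glob."""
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(url, pattern) for pattern in match)


# Patterns taken from the sample config.json.
match = ["https://www.django-rest-framework.org/**"]
exclude = ["https://www.django-rest-framework.org/community/**"]

for url in [
    "https://www.django-rest-framework.org/api-guide/views/",
    "https://www.django-rest-framework.org/community/release-notes/",
    "https://example.com/",
]:
    print(url, "->", should_crawl(url, match, exclude))
```

Checking `exclude` before `match` keeps the rule simple: an excluded section is skipped even when a broader `match` pattern covers it.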

## Contributing

We welcome contributions to `ccontext`! Please follow these steps to contribute:

1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Commit your changes and push them to your branch.
4. Submit a pull request with a description of your changes.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Inspired by the need to streamline the process of providing context to LLMs.
- Thanks to the contributors and users who have provided valuable feedback and suggestions.

## Future Ideas

Here are some ideas that might be implemented in future versions of `ccontext`:

- **Document Support**: Incorporate the ability to handle documents such as PDFs and image files in prompts.
- **Binary File Handling**: Introduce mechanisms to manage non-text file types effectively.

---

Feel free to raise issues or contribute to the project. We appreciate your support!

**Nicolas Arnouts**  
[arnouts.software@gmail.com](mailto:arnouts.software@gmail.com)

[GitHub Repository](https://github.com/NicolasArnouts/ccontext)

---

### Badges

[![PyPI version](https://badge.fury.io/py/ccontext.svg)](https://badge.fury.io/py/ccontext)
[![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/NicolasArnouts/ccontext/blob/main/LICENSE)
[![Platform](https://img.shields.io/badge/platform-Windows%20|%20macOS%20|%20Linux-lightgrey.svg)]()


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/NicolasArnouts/ccontext",
    "name": "ccontext",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4,>=3.8",
    "maintainer_email": null,
    "keywords": "context, ccontenxt, collect context, llm, chatgpt",
    "author": "Nicolas Arnouts",
    "author_email": "arnouts.software@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/6b/56/955cf8018e4c3f7fb6d37e13bb3298c1465ccea5a0f40e76a24c449a2b72/ccontext-0.3.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "collect-context: Makes the process of collecting and sending context to an LLM like ChatGPT-4o as easy as possible.",
    "version": "0.3.2",
    "project_urls": {
        "Documentation": "https://github.com/NicolasArnouts/ccontext",
        "Homepage": "https://github.com/NicolasArnouts/ccontext",
        "Issues": "https://github.com/NicolasArnouts/ccontext/issues",
        "Repository": "https://github.com/NicolasArnouts/ccontext"
    },
    "split_keywords": [
        "context",
        " ccontenxt",
        " collect context",
        " llm",
        " chatgpt"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "90eb53c7fbc62fda4f8c222d4713edaf8dd2721554fb22833067baa7ad19e671",
                "md5": "28127b70a0c37ab2c57280ead89a3a48",
                "sha256": "41297ef4d296177cfb9ce5fba11702988a40226d39d9883ed5bd036ae5878b5b"
            },
            "downloads": -1,
            "filename": "ccontext-0.3.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "28127b70a0c37ab2c57280ead89a3a48",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.8",
            "size": 1412837,
            "upload_time": "2024-08-19T11:15:54",
            "upload_time_iso_8601": "2024-08-19T11:15:54.256643Z",
            "url": "https://files.pythonhosted.org/packages/90/eb/53c7fbc62fda4f8c222d4713edaf8dd2721554fb22833067baa7ad19e671/ccontext-0.3.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6b56955cf8018e4c3f7fb6d37e13bb3298c1465ccea5a0f40e76a24c449a2b72",
                "md5": "e028267312518ab586e541b3edb4f17e",
                "sha256": "19cac0f7f407b5d45ba3d4e89327b4cf74aceda277e3d787dc8e8e4b79015a1e"
            },
            "downloads": -1,
            "filename": "ccontext-0.3.2.tar.gz",
            "has_sig": false,
            "md5_digest": "e028267312518ab586e541b3edb4f17e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.8",
            "size": 1409145,
            "upload_time": "2024-08-19T11:15:56",
            "upload_time_iso_8601": "2024-08-19T11:15:56.214055Z",
            "url": "https://files.pythonhosted.org/packages/6b/56/955cf8018e4c3f7fb6d37e13bb3298c1465ccea5a0f40e76a24c449a2b72/ccontext-0.3.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-19 11:15:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "NicolasArnouts",
    "github_project": "ccontext",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "tiktoken",
            "specs": [
                [
                    "==",
                    "0.7.0"
                ]
            ]
        },
        {
            "name": "colorama",
            "specs": [
                [
                    "==",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "pyperclip",
            "specs": [
                [
                    "==",
                    "1.9.0"
                ]
            ]
        },
        {
            "name": "pypdf",
            "specs": [
                [
                    "==",
                    "4.2.0"
                ]
            ]
        },
        {
            "name": "pathspec",
            "specs": [
                [
                    "==",
                    "0.12.1"
                ]
            ]
        },
        {
            "name": "reportlab",
            "specs": [
                [
                    "==",
                    "4.2.0"
                ]
            ]
        },
        {
            "name": "poetry",
            "specs": [
                [
                    "==",
                    "1.8.3"
                ]
            ]
        },
        {
            "name": "wcmatch",
            "specs": [
                [
                    "==",
                    "9.0"
                ]
            ]
        },
        {
            "name": "pypdf",
            "specs": [
                [
                    "==",
                    "4.3.1"
                ]
            ]
        }
    ],
    "lcname": "ccontext"
}
        