repo-to-singlefile


- Name: repo-to-singlefile
- Version: 0.1.0
- Home page: https://github.com/Oni-giri/repo-to-singlefile
- Summary: A tool to convert code repositories into text format for LLM context
- Upload time: 2024-11-14 17:59:03
- Author: Yakitori
- Requires Python: `<4.0,>=3.8`
- License: MIT
- Keywords: repository, text, llm, context, converter

# Repo to Single File

A command-line tool that converts code repositories into text format, making them suitable for use as context in Large Language Models (LLMs). Supports both local repositories and GitHub remote repositories.

## Features

- Convert local Git repositories to text format
- Convert GitHub repositories to text format (public and private)
- Process specific subfolders in monorepos
- Respect `.gitignore` patterns for local repositories
- Skip binary files automatically
- Structured output with clear file demarcation
- Token counting with the OpenAI tokenizer
- Cost estimation for GPT-3.5 and GPT-4

## Installation

```bash
pip install repo-to-singlefile
```

## Usage

### Basic Usage

1. Convert a local repository:
```bash
repo-to-singlefile /path/to/local/repo output.txt
```

2. Convert a public GitHub repository:
```bash
repo-to-singlefile https://github.com/owner/repo output.txt
```

3. Convert a private GitHub repository:
```bash
repo-to-singlefile https://github.com/owner/repo output.txt --github-token YOUR_GITHUB_TOKEN
```

### Monorepo Support

Process only specific subfolders in a repository:

1. Local monorepo:
```bash
repo-to-singlefile /path/to/repo output.txt --subfolder packages/mylib
```

2. GitHub monorepo:
```bash
repo-to-singlefile https://github.com/owner/repo output.txt --subfolder packages/mylib
```
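
If you want to drive the tool from a script, the commands above can be assembled programmatically. The sketch below is a minimal example that uses only the flags documented in this README (`--subfolder` and `--github-token`); the helper name `build_command` is hypothetical and not part of the package.

```python
import shlex
from typing import List, Optional


def build_command(source: str, output: str,
                  subfolder: Optional[str] = None,
                  github_token: Optional[str] = None) -> List[str]:
    """Assemble an argv list using only the flags documented in this README."""
    cmd = ["repo-to-singlefile", source, output]
    if subfolder:
        cmd += ["--subfolder", subfolder]
    if github_token:
        cmd += ["--github-token", github_token]
    return cmd


cmd = build_command("https://github.com/owner/repo", "output.txt",
                    subfolder="packages/mylib")
print(shlex.join(cmd))
# To actually run it (requires the tool to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.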

### Output Format

The generated text file contains the contents of all text files in the repository, with clear headers separating each file:

```
### File: src/main.py ###
[content of main.py]

### File: src/utils.py ###
[content of utils.py]

...
```
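
Because each file is introduced by a `### File: <path> ###` header, the output can be split back into individual files. The following is a rough sketch of that idea, assuming the header always sits on its own line exactly as shown above; `split_bundle` is a hypothetical helper, not something the package provides.

```python
import re
from typing import Dict, List

# Matches a header line such as "### File: src/main.py ###".
HEADER = re.compile(r"^### File: (.+) ###$")


def split_bundle(text: str) -> Dict[str, str]:
    """Map each file path to its content, keyed on the header lines."""
    files: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            files[current] = []
        elif current is not None:
            files[current].append(line)
    # Drop the blank separator lines around each file's content.
    return {path: "\n".join(lines).strip("\n") for path, lines in files.items()}


sample = """### File: src/main.py ###
print("hello")

### File: src/utils.py ###
def add(a, b):
    return a + b
"""
bundle = split_bundle(sample)
print(sorted(bundle))
```

Note that a source file which itself contains a line in the header format would be split incorrectly by this approach.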

After processing, you'll see a summary that includes:
- Total token count
- Total character count
- Estimated costs for GPT-3.5 and GPT-4 usage

Example summary:
```
==================================================
CONVERSION SUMMARY
==================================================
Total tokens: 15,234
Total characters: 45,678

Estimated costs (based on current OpenAI pricing):
GPT-4:
  - Input cost: $0.46
  - Output cost: $0.91
GPT-3.5:
  - Input cost: $0.02
  - Output cost: $0.03
==================================================
```
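
The figures in the example summary are consistent with multiplying the token count by a per-1,000-token rate. The sketch below reproduces them under assumed rates ($0.03/$0.06 for GPT-4 and $0.0015/$0.002 for GPT-3.5, input/output). These rates are inferred from the example, not taken from the tool's source, and other rates could round to the same cents; treat them as placeholders and check current pricing.

```python
# Assumed USD rates per 1,000 tokens; inferred from the example summary,
# not read from the tool itself.
ASSUMED_RATES = {
    "GPT-4": {"input": 0.03, "output": 0.06},
    "GPT-3.5": {"input": 0.0015, "output": 0.002},
}


def estimate_costs(total_tokens: int) -> dict:
    """Return the cost per model and direction, rounded to cents."""
    return {
        model: {kind: round(total_tokens / 1000 * rate, 2)
                for kind, rate in rates.items()}
        for model, rates in ASSUMED_RATES.items()
    }


costs = estimate_costs(15_234)
for model, parts in costs.items():
    print(f"{model}: input ${parts['input']:.2f}, output ${parts['output']:.2f}")
```

Under these assumptions, the "output cost" is the same token count priced at the output rate, that is, what it would cost if a model generated that many tokens.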

## Configuration

The tool automatically:
- Respects `.gitignore` patterns in local repositories
- Skips binary files
- Processes common text file extensions:
  - Python (.py)
  - JavaScript (.js)
  - Java (.java)
  - C++ (.cpp, .h)
  - Web (.html, .css)
  - Documentation (.md)
  - Config files (.yml, .yaml, .json)
  - Shell scripts (.sh)
  - Text files (.txt)
  - XML files (.xml)
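
The README does not describe how binary files are detected. As an illustration only, a filter along these lines could combine the extension list above with a common heuristic (a NUL byte near the start of the file suggests binary content); `looks_binary` and `should_include` are hypothetical names.

```python
from pathlib import Path

# Extensions listed in the Configuration section of this README.
TEXT_EXTENSIONS = {
    ".py", ".js", ".java", ".cpp", ".h", ".html", ".css",
    ".md", ".yml", ".yaml", ".json", ".sh", ".txt", ".xml",
}


def looks_binary(path: Path, sample_size: int = 1024) -> bool:
    """Heuristic (an assumption): a NUL byte in the first bytes means binary."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(sample_size)


def should_include(path: Path) -> bool:
    """Keep files with a listed extension that do not look binary."""
    return path.suffix.lower() in TEXT_EXTENSIONS and not looks_binary(path)
```

For example, `should_include(Path("src/main.py"))` returns `True` for an ordinary Python source file, while an image or a file containing NUL bytes is rejected.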

## GitHub Authentication

For private repositories, you'll need a GitHub personal access token:

1. Generate a token at https://github.com/settings/tokens
2. Use the token with the `--github-token` option:
```bash
repo-to-singlefile https://github.com/owner/private-repo output.txt --github-token YOUR_TOKEN
```
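
For readers curious how a token is typically attached to a GitHub request, here is a minimal sketch using the public GitHub REST "contents" endpoint. Whether the tool uses this endpoint is an assumption; the function name `build_contents_request` and the `GITHUB_TOKEN` environment variable are illustrative choices. The request is only constructed, not sent.

```python
import os
import urllib.request


def build_contents_request(owner: str, repo: str, path: str = "",
                           token: str = "") -> urllib.request.Request:
    """Prepare (but do not send) a GitHub REST 'contents' request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}".rstrip("/")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Private repositories require an authenticated request.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# GITHUB_TOKEN is an assumed variable name; reading the token from the
# environment keeps it out of scripts and shell history.
request = build_contents_request("owner", "private-repo",
                                 token=os.environ.get("GITHUB_TOKEN", ""))
print(request.full_url)
```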

## Error Handling

The tool provides clear error messages for common issues:
- Invalid repository paths or URLs
- Missing subfolders
- Permission denied errors
- Binary file skipping
- Token counting errors

## Development

### Setup Development Environment

1. Clone the repository:
```bash
git clone https://github.com/Oni-giri/repo-to-singlefile.git
cd repo-to-singlefile
```

2. Install dependencies:
```bash
pip install -e .
```

### Running Tests

```bash
pytest
```

## Common Issues

### Permission Denied
If you see this error when accessing a GitHub repository, check your token:
- Public repositories: no token is needed
- Private repositories: the token needs the `repo` scope

### Subfolder Not Found
When specifying a subfolder:
- Ensure the path is relative to the repository root
- Use forward slashes (/) even on Windows
- Check that the subfolder exists in the repository
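
The checks above can be expressed as a small validation routine for a local repository. This is a sketch under the assumption that the subfolder is given relative to the repository root; `resolve_subfolder` is a hypothetical helper and does not reflect the tool's internal code.

```python
from pathlib import Path, PurePosixPath


def resolve_subfolder(repo_root: Path, subfolder: str) -> Path:
    """Return the subfolder's path, or raise ValueError with a hint."""
    # Normalise Windows-style separators to forward slashes.
    relative = PurePosixPath(subfolder.replace("\\", "/"))
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"{subfolder!r} must be relative to the repository root")
    target = repo_root.joinpath(*relative.parts)
    if not target.is_dir():
        raise ValueError(f"{subfolder!r} not found under {repo_root}")
    return target
```

For example, `resolve_subfolder(Path("/path/to/repo"), "packages/mylib")` returns `/path/to/repo/packages/mylib` if that directory exists and raises `ValueError` otherwise.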

### Large Repositories
For very large repositories:
- Consider processing specific subfolders
- Be aware of rate limits when using the GitHub API
- Monitor token costs for large codebases
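
GitHub reports rate-limit status in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers. The sketch below shows one way a script could decide how long to wait before retrying; `seconds_until_reset` is a hypothetical helper, and the headers are passed in as a plain mapping so the example runs without network access.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before retrying; 0 if requests remain."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # The reset time is given in Unix epoch seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


wait = seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"},
    now=1700000000,
)
print(wait)
```

HTTP header names are case-insensitive, so with a real response object you may need to normalise the keys before the lookup.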

## Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a pull request

## License

This project is licensed under the MIT License.

## Contact

- Report bugs through GitHub issues
- Submit feature requests through GitHub issues
- For security issues, please see SECURITY.md
            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Oni-giri/repo-to-singlefile",
    "name": "repo-to-singlefile",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "repository, text, llm, context, converter",
    "author": "Yakitori",
    "author_email": "mers_etanche.0n@icloud.com",
    "download_url": "https://files.pythonhosted.org/packages/b3/c5/18828f1f4e5dfe17d6b8d32bca7d4c782a102b78a363cb129f261794afcf/repo_to_singlefile-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A tool to convert code repositories into text format for LLM context",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/Oni-giri/repo-to-singlefile",
        "Repository": "https://github.com/Oni-giri/repo-to-singlefile"
    },
    "split_keywords": [
        "repository",
        " text",
        " llm",
        " context",
        " converter"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c7f6b18e75bfc5c7d5eea1ed2811d186dfa9b911ca2cd6edde0396142d5c6c8c",
                "md5": "b9529e4388c13808a77743abbf6f595d",
                "sha256": "63653afb18cc2eda7074f5a74032e02a0646e1487e1e751dad0b3013c42d0f5b"
            },
            "downloads": -1,
            "filename": "repo_to_singlefile-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b9529e4388c13808a77743abbf6f595d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 7759,
            "upload_time": "2024-11-14T17:59:02",
            "upload_time_iso_8601": "2024-11-14T17:59:02.120146Z",
            "url": "https://files.pythonhosted.org/packages/c7/f6/b18e75bfc5c7d5eea1ed2811d186dfa9b911ca2cd6edde0396142d5c6c8c/repo_to_singlefile-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b3c518828f1f4e5dfe17d6b8d32bca7d4c782a102b78a363cb129f261794afcf",
                "md5": "ed3168417e81e6df1a65a250d58af3aa",
                "sha256": "61c2166571d2fac1baf1bd250e03a6743ff4f72cb9739f1e587637477872eb7b"
            },
            "downloads": -1,
            "filename": "repo_to_singlefile-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ed3168417e81e6df1a65a250d58af3aa",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 5938,
            "upload_time": "2024-11-14T17:59:03",
            "upload_time_iso_8601": "2024-11-14T17:59:03.884223Z",
            "url": "https://files.pythonhosted.org/packages/b3/c5/18828f1f4e5dfe17d6b8d32bca7d4c782a102b78a363cb129f261794afcf/repo_to_singlefile-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-14 17:59:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Oni-giri",
    "github_project": "repo-to-singlefile",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "repo-to-singlefile"
}
        