# dir2text
A Python library and command-line tool for expressing directory structures and file contents in formats suitable for Large Language Models (LLMs). It combines directory tree visualization with file contents in a memory-efficient, streaming format.
## Features
- Tree-style directory structure visualization
- Complete file contents with proper escaping
- Memory-efficient streaming processing
- Multiple output formats (XML, JSON)
- Easy extensibility for new formats
- Support for exclusion patterns (e.g., .gitignore rules)
- Optional token counting for LLM context management
- Safe handling of large files and directories
## Installation
This project uses [Poetry](https://python-poetry.org/) for dependency management. We recommend using Poetry for the best development experience, but we also provide traditional pip installation.
### Using Poetry (Recommended)
1. First, [install Poetry](https://python-poetry.org/docs/#installation) if you haven't already.
2. Install dir2text:
```bash
poetry add dir2text
```
### Using pip
```bash
pip install dir2text
```
### Optional Features
Install with token counting support (for LLM context management):
```bash
# With Poetry
poetry add "dir2text[token_counting]"
# With pip
pip install "dir2text[token_counting]"
```
**Note:** The `token_counting` feature requires the `tiktoken` package, which needs a Rust compiler (e.g., `rustc`) and Cargo to be available during installation.
## Usage
### Command Line Interface
Basic usage:
```bash
dir2text /path/to/project
# Exclude files matching .gitignore patterns
dir2text -e .gitignore /path/to/project
# Count tokens for LLM context management
dir2text -c /path/to/project
# Generate JSON output and save to file
dir2text --format json -o output.json /path/to/project
# Skip tree or content sections
dir2text -T /path/to/project # Skip tree visualization
dir2text -C /path/to/project # Skip file contents
```
### Python API
Basic usage:
```python
from dir2text import StreamingDir2Text
# Initialize the analyzer
analyzer = StreamingDir2Text("path/to/project")
# Stream the directory tree
for line in analyzer.stream_tree():
    print(line, end='')

# Stream file contents
for chunk in analyzer.stream_contents():
    print(chunk, end='')
# Get metrics
print(f"Processed {analyzer.file_count} files in {analyzer.directory_count} directories")
```
Memory-efficient processing with exclusions and token counting:
```python
from dir2text import StreamingDir2Text
# Initialize with options
analyzer = StreamingDir2Text(
    directory="path/to/project",
    exclude_file=".gitignore",
    output_format="json",
    tokenizer_model="gpt-4"
)

# Process content incrementally
with open("output.json", "w") as f:
    for line in analyzer.stream_tree():
        f.write(line)
    for chunk in analyzer.stream_contents():
        f.write(chunk)
# Print statistics
print(f"Files: {analyzer.file_count}")
print(f"Directories: {analyzer.directory_count}")
print(f"Lines: {analyzer.line_count}")
print(f"Tokens: {analyzer.token_count}")
print(f"Characters: {analyzer.character_count}")
```
Immediate processing (for smaller directories):
```python
from dir2text import Dir2Text
# Process everything immediately
analyzer = Dir2Text("path/to/project")
# Access complete content
print(analyzer.tree_string)
print(analyzer.content_string)
```
## Output Formats
### XML Format
```xml
<file path="relative/path/to/file.py" tokens="150">
def example():
    print("Hello, world!")
</file>
```
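The XML output is a sequence of `<file>` elements like the one above. A minimal sketch of consuming it with the standard library, assuming the elements are wrapped in a synthetic root so the stream parses as a single document (the snippet below is illustrative, not dir2text's own output):

```python
import xml.etree.ElementTree as ET

# A single <file> element in the shape shown above (illustrative).
snippet = '''<file path="relative/path/to/file.py" tokens="150">
def example():
    print("Hello, world!")
</file>'''

# Wrapping the element stream in a synthetic root makes it parseable
# with xml.etree.ElementTree.
root = ET.fromstring(f"<files>{snippet}</files>")
for node in root.findall("file"):
    # Attributes carry the path and token count; node.text holds the
    # file contents with XML escaping undone by the parser.
    print(node.get("path"), node.get("tokens"))
```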
### JSON Format
```json
{
  "path": "relative/path/to/file.py",
  "content": "def example():\n    print(\"Hello, world!\")",
  "tokens": 150
}
```
## Development
### Setup Development Environment
1. Clone the repository:
```bash
git clone https://github.com/rlichtenwalter/dir2text.git
cd dir2text
```
2. Install development dependencies:
```bash
poetry install --with dev
```
3. Install pre-commit hooks:
```bash
poetry run pre-commit install
```
### Running Tests
```bash
# Run specific quality control categories
poetry run tox -e format # Run formatters
poetry run tox -e lint # Run linters
poetry run tox -e test # Run tests
poetry run tox -e coverage # Run test coverage analysis
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create a new branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run the test suite
5. Commit your changes (`git commit -m 'Add some amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- This project uses [anytree](https://github.com/c0fec0de/anytree) for tree data structures
- .gitignore pattern matching uses [pathspec](https://github.com/cpburnz/python-pathspec)
- Token counting functionality is provided by OpenAI's [tiktoken](https://github.com/openai/tiktoken)
## Requirements
- Python 3.9+
- Poetry (recommended) or pip
- Optional: Rust compiler and Cargo (for token counting feature)
## Project Status
This project is actively maintained. Issues and pull requests are welcome.
## FAQ
**Q: Why use streaming processing?**
A: Streaming allows large directories and files to be processed with constant memory usage, making the tool suitable for repositories of any size.
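The constant-memory property comes from generator-style iteration: output is produced in fixed-size pieces rather than built up as one string. A generic illustration of the idea (not dir2text's actual internals):

```python
# Generic sketch of streaming: a generator yields fixed-size chunks,
# so peak memory is bounded by the chunk size, not the file size.
def stream_chunks(path, chunk_size=64 * 1024):
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# A consumer writes each chunk as it arrives instead of buffering:
#     for chunk in stream_chunks("big_file.txt"):
#         out.write(chunk)
```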
**Q: How does dir2text handle symbolic links?**
A: dir2text follows symbolic links to both files and directories. While this enables complete directory traversal, no symlink loop detection is currently implemented. Users should be cautious when processing directories with circular symlinks, as this could lead to infinite recursion. Future versions may add configuration options for symlink handling and loop prevention.
**Q: Can I use this with binary files?**
A: The tool is designed for text files. Binary files should be excluded using the exclusion rules feature.
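One common way to find binary files to exclude is the NUL-byte heuristic: text files essentially never contain a NUL byte in their first block. The helper below is hypothetical (not part of dir2text) and just sketches the idea:

```python
# Hypothetical helper, not part of dir2text: flag files whose first
# block contains a NUL byte, a strong signal of binary content.
def looks_binary(path, probe_size=8192):
    with open(path, "rb") as f:
        return b"\x00" in f.read(probe_size)
```

Files flagged this way can then be listed in an exclusion file alongside the usual .gitignore-style patterns.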
**Q: What models are supported for token counting?**
A: The token counting feature uses OpenAI's tiktoken library with the following primary models and encodings:
- cl100k_base encoding:
- GPT-4 models (gpt-4, gpt-4-32k)
- GPT-3.5-Turbo models (gpt-3.5-turbo)
- p50k_base encoding:
- Text Davinci models (text-davinci-003)
For other language models, using a similar model's tokenizer (like gpt-4) can provide useful approximations of token counts. While the counts may not exactly match your target model's tokenization, they can give a good general estimate. The default model is "gpt-4", which uses cl100k_base encoding and provides a good general-purpose tokenization.
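The model-to-encoding table above can be sketched as a simple lookup with a cl100k_base fallback. This mapping is illustrative only; tiktoken's own `tiktoken.encoding_for_model()` is the authoritative lookup:

```python
# Illustrative mapping mirroring the table above; consult
# tiktoken.encoding_for_model() for the authoritative answer.
MODEL_TO_ENCODING = {
    "gpt-4": "cl100k_base",
    "gpt-4-32k": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-davinci-003": "p50k_base",
}

def encoding_for(model, default="cl100k_base"):
    # Fall back to the general-purpose cl100k_base encoding for
    # models without a dedicated entry.
    return MODEL_TO_ENCODING.get(model, default)
```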
**Q: What happens if I specify a model that doesn't have a dedicated tokenizer?**
A: The library will suggest using a well-supported model like 'gpt-4' or 'text-davinci-003' for token counting. While token counts may not exactly match your target model, they can provide useful approximations for most modern language models.
## Contact
Ryan N. Lichtenwalter - rlichtenwalter@gmail.com
Project Link: [https://github.com/rlichtenwalter/dir2text](https://github.com/rlichtenwalter/dir2text)