txtify


- **Name**: txtify
- **Version**: 0.0.3
- **Summary**: A versatile Python tool to convert documents (PPTX, DOCX, PDF, XLSX) to plain text, ideal for providing context to AI code assistants like GitHub Copilot and Amazon CodeWhisperer.
- **Upload time**: 2025-07-15 00:58:37
- **Requires Python**: >=3.8
- **Keywords**: ai, code generation, context, converter, document, documentation, excel, github copilot, llm, pdf, powerpoint, text, word

# 📄 txtify

**txtify** is a command-line tool and Python library that converts PowerPoint, Word, PDF, and Excel documents into clean, plain text files.  
It's ideal for extracting content for analysis, archiving, or providing crucial context to AI assistants like GitHub Copilot and Amazon CodeWhisperer, allowing them to better understand your project's domain knowledge, requirements, and existing documentation.

---

## 📚 Table of Contents

- [✨ Features](#-features)
- [🤖 Providing Context to AI Code Assistants](#-providing-context-to-ai-code-assistants)
- [🚀 Installation](#-installation)
- [💡 Usage (Command Line Interface)](#-usage-command-line-interface)
  - [Convert a Single File](#convert-a-single-file)
  - [Convert Multiple Files](#convert-multiple-files)
  - [Convert an Entire Directory](#convert-an-entire-directory)
  - [Specify an Output Directory](#specify-an-output-directory)
- [📂 Supported File Formats](#-supported-file-formats)
- [🗄️ Output](#-output)
- [📜 License](#-license)

---

## ✨ Features

✅ **Multi-Format Support**: Converts `.pptx` (PowerPoint), `.docx` (Word), `.pdf` (Portable Document Format), and `.xlsx` (Excel) files.  
✅ **Batch Processing**: Convert multiple files or entire directories at once.  
✅ **Clean Text Output**: Extracts core textual content, making documents easily searchable and readable for both humans and AI.  
✅ **Intuitive CLI**: Simple command-line interface for quick and easy conversions.  
✅ **Preserves Structure**: When converting directories, the original folder structure is replicated in the output.

---

## 🤖 Providing Context to AI Code Assistants

One of the most powerful use cases for **txtify** is to prepare your project's non-code documentation (e.g., design documents, requirement specifications, meeting notes, data dictionaries) for consumption by AI code generation tools like GitHub Copilot, Amazon CodeWhisperer, or similar LLM-based assistants.

### Why this is useful

- **Expand AI's Knowledge Base**: Let the AI "read" and understand domain-specific terminology, project goals, architectural decisions, and detailed requirements that might otherwise be locked away in binary formats.
- **Improve Code Relevance**: The AI can generate more relevant and accurate code suggestions, function names, and comments by leveraging the textual context.
- **Reduce Hallucinations**: With more accurate information, the AI is less likely to "hallucinate" or generate incorrect assumptions.
- **Seamless Integration**: Place the converted `.txt` files in a directory accessible to your IDE; assistants that index workspace files can often pick them up automatically.

### Example Workflow

1. Convert your documentation:
   ```bash
   txtify ./docs_and_requirements/ -o ./ai_context/
   ```

2. Integrate with your project: Place the `ai_context/` folder directly within your main project repository.
3. Let your AI assistant use it: An assistant that indexes files in your workspace can then draw on these plain text files, which can make its code suggestions more context-aware.
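
If the conversion step has to run from a Python script (for example, as part of a documentation refresh task), the CLI shown in step 1 can be called through `subprocess`. The following is a minimal sketch, not part of txtify itself: `build_txtify_command` and `refresh_ai_context` are hypothetical helper names, and the only CLI features assumed are the positional paths and the `-o` option described in this README.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_txtify_command(inputs: Sequence[str], output_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for the documented CLI: txtify <paths...> [-o DIR]."""
    if not inputs:
        raise ValueError("at least one input file or directory is required")
    command = ["txtify", *inputs]
    if output_dir is not None:
        command += ["-o", output_dir]
    return command


def refresh_ai_context(inputs: Sequence[str], output_dir: str) -> bool:
    """Run txtify if it is on PATH; return False when it is not installed."""
    if shutil.which("txtify") is None:
        return False
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(build_txtify_command(inputs, output_dir), check=True)
    return True


if __name__ == "__main__":
    print(build_txtify_command(["./docs_and_requirements/"], "./ai_context/"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.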

---

## 🚀 Installation

You can install **txtify** directly from PyPI using pip:

```bash
pip install txtify
```

---

## 💡 Usage (Command Line Interface)

**txtify** can be used directly from your terminal.

### Convert a Single File

Pass the path to your document as an argument:

```bash
txtify my_project_spec.docx
```

This will create a plain text file named `my_project_spec.txt` inside a new `output/` directory by default.

---

### Convert Multiple Files

Specify several files at once:

```bash
txtify requirements.pdf architecture.pptx data_schema.xlsx
```

This will convert the specified files to `.txt` versions in the `output/` directory.

---

### Convert an Entire Directory

Provide the path to a directory, and **txtify** will scan it (and its subdirectories) for all supported document types:

```bash
txtify project_documentation/
```

All convertible files will be processed. The original directory structure will be mirrored in the `output/` folder.
For example:

```
project_documentation/meetings/q1_notes.pptx
```

becomes:

```
output/project_documentation/meetings/q1_notes.txt
```
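
The path mapping in the example above can be sketched with `pathlib`. This illustrates the behavior described here and is not txtify's actual implementation: the function name `plan_directory_conversion` is hypothetical, and the case-insensitive extension match is an assumption.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Extensions listed under "Supported File Formats".
SUPPORTED_SUFFIXES = {".pptx", ".docx", ".pdf", ".xlsx"}


def plan_directory_conversion(input_dir: Path, output_root: Path) -> Iterator[Tuple[Path, Path]]:
    """Yield (source, destination) pairs for every supported file under input_dir."""
    for source in sorted(input_dir.rglob("*")):
        # Assumption: extensions are matched case-insensitively.
        if source.is_file() and source.suffix.lower() in SUPPORTED_SUFFIXES:
            # Keep the input directory's own name in the output path, as in
            # project_documentation/meetings/q1_notes.pptx
            #   -> output/project_documentation/meetings/q1_notes.txt
            relative = Path(input_dir.name) / source.relative_to(input_dir)
            yield source, (output_root / relative).with_suffix(".txt")


if __name__ == "__main__":
    for src, dst in plan_directory_conversion(Path("project_documentation"), Path("output")):
        print(f"{src} -> {dst}")
```

Unsupported files are skipped, so the output tree contains only the subdirectories that hold at least one convertible document.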

---

### Specify an Output Directory

Use the `-o` or `--output` option to choose a different location for your converted files:

```bash
txtify legacy_reports/ -o contextual_data/
```

This saves all converted text files into the `contextual_data/` directory.
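
For readers who want to see how an interface of this shape is typically declared, here is a hedged `argparse` sketch. It mirrors only what this README documents (one or more input paths, plus `-o`/`--output` defaulting to `output`). The program name `txtify-sketch` and the function names are hypothetical, and txtify's real argument parser may differ.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI with the same shape as the one described in this README."""
    parser = argparse.ArgumentParser(
        prog="txtify-sketch",  # hypothetical name, to avoid confusion with the real tool
        description="Convert documents to plain text (illustrative sketch).",
    )
    parser.add_argument("paths", nargs="+", help="files or directories to convert")
    parser.add_argument(
        "-o", "--output", default="output",
        help="directory for the converted .txt files (default: output)",
    )
    return parser


def parse_cli(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_cli(["legacy_reports/", "-o", "contextual_data/"])
    print(args.paths, args.output)
```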

---

## 📂 Supported File Formats

**txtify** currently supports conversion for the following file types:

* PowerPoint Presentations: `.pptx`
* Word Documents: `.docx`
* PDF Documents: `.pdf`
* Excel Workbooks: `.xlsx`
  *(converted to a CSV-like plain text format, useful for data extraction)*
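
This README does not specify the exact text layout txtify produces for workbooks. As a rough illustration of what "CSV-like" text can look like, the sketch below renders rows of cell values with the standard-library `csv` module. The function name `rows_to_csv_text` and the sample rows are hypothetical; in practice the rows would come from a spreadsheet reader.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv_text(rows: Iterable[Sequence[object]]) -> str:
    """Render rows of cell values as comma-separated plain text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # csv.writer writes None as an empty field and quotes cells
        # that contain commas, so each row stays on one parseable line.
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        ["column", "type"],
        ["id, primary", "int"],
        ["note", None],
    ]
    print(rows_to_csv_text(sample), end="")
```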

---

## 🗄️ Output

Converted files will always have a `.txt` extension.
By default, they are saved to a directory named `output/` in your current working directory.
You can customize this using the `-o` or `--output` option.

If converting an entire directory, the relative path from the input directory is preserved in the output.

---

## 📜 License

**txtify** is distributed under the terms of the MIT License.

---


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "txtify",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "ai, code generation, context, converter, document, documentation, excel, github copilot, llm, pdf, powerpoint, text, word",
    "author": null,
    "author_email": "Ray Rada <rayrada1@gmail.com>, Anthony Furst <anthony.furst@afs.com>",
    "download_url": "https://files.pythonhosted.org/packages/d5/e1/4b132b408aa7f2b4fc5b43e8a60fa063e22b70cdd755d2c69a5811693b33/txtify-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A versatile Python tool to convert documents (PPTX, DOCX, PDF, XLSX) to plain text, ideal for providing context to AI code assistants like GitHub Copilot and Amazon CodeWhisperer.",
    "version": "0.0.3",
    "project_urls": null,
    "split_keywords": [
        "ai",
        " code generation",
        " context",
        " converter",
        " document",
        " documentation",
        " excel",
        " github copilot",
        " llm",
        " pdf",
        " powerpoint",
        " text",
        " word"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "456f27d96ae09d73ba715c4da071db37e065f961ddce5edf1c189ced566c0fb4",
                "md5": "abdbf1b8fa1ca62a3bcf86456ee01316",
                "sha256": "21a16880724514dba9603087fa1dccfef243d43d34968ceedd4981b6cc46ccdd"
            },
            "downloads": -1,
            "filename": "txtify-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "abdbf1b8fa1ca62a3bcf86456ee01316",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 6865,
            "upload_time": "2025-07-15T00:58:36",
            "upload_time_iso_8601": "2025-07-15T00:58:36.099007Z",
            "url": "https://files.pythonhosted.org/packages/45/6f/27d96ae09d73ba715c4da071db37e065f961ddce5edf1c189ced566c0fb4/txtify-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d5e14b132b408aa7f2b4fc5b43e8a60fa063e22b70cdd755d2c69a5811693b33",
                "md5": "483ed3fd6d3f62879ed6613224d71d8f",
                "sha256": "defa40b3fdbba66b7a4e28a0a54a2bb9c691115a5149379045a3fa62623d1bc6"
            },
            "downloads": -1,
            "filename": "txtify-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "483ed3fd6d3f62879ed6613224d71d8f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 5953,
            "upload_time": "2025-07-15T00:58:37",
            "upload_time_iso_8601": "2025-07-15T00:58:37.328307Z",
            "url": "https://files.pythonhosted.org/packages/d5/e1/4b132b408aa7f2b4fc5b43e8a60fa063e22b70cdd755d2c69a5811693b33/txtify-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-15 00:58:37",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "txtify"
}
        