pdf-splitter-cli


- Name: pdf-splitter-cli
- Version: 0.1.1
- Summary: A modern command-line tool to split PDF files into smaller chunks with progress bars and automatic filename generation
- Upload time: 2025-07-17 01:37:12
- Requires Python: >=3.8
- Keywords: pdf, split, cli, command-line, pdf-tools, document-processing

# PDF Splitter CLI

[![PyPI version](https://badge.fury.io/py/pdf-splitter-cli.svg)](https://badge.fury.io/py/pdf-splitter-cli)
[![Python Support](https://img.shields.io/pypi/pyversions/pdf-splitter-cli.svg)](https://pypi.org/project/pdf-splitter-cli/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A modern command-line tool to split large PDF files into smaller chunks with **real-time progress bars** and automatic filename generation.

## ✨ Features

- 📄 **Split PDF files** by specified number of pages per chunk
- 🎯 **Real-time progress bars** showing file creation progress
- 📁 **Smart filename generation** based on original filename
- 🔢 **Sequential numbering** (e.g., `document_1.pdf`, `document_2.pdf`)
- 📂 **Configurable output folders**
- 🖥️ **Modern CLI** with rich help and validation
- 📃 **Individual page splitting** support
- 🎨 **Colorized output** for better user experience
- 🛠️ **Robust error handling** with fallback methods (pdftk, qpdf)
- ⚡ **Memory-efficient** processing for large files
- 🔧 **Cross-platform** (Windows, macOS, Linux)

## 🚀 Installation

```bash
pip install pdf-splitter-cli
```

**Requirements:** Python 3.8+

## 📖 Quick Start

```bash
# Basic usage - split every 5 pages (default)
pdf-splitter document.pdf

# Custom chunk size - split every 10 pages
pdf-splitter document.pdf -p 10

# Custom output folder
pdf-splitter document.pdf -o my_chunks

# Split into individual pages
pdf-splitter document.pdf -p 1

# Disable progress bars (useful for scripts)
pdf-splitter document.pdf --no-progress
```

## 📋 Usage

### Command Structure
```bash
pdf-splitter <input_pdf> [OPTIONS]
```

### Options
- `-p, --pages-per-chunk INTEGER`: Pages per output file (default: 5)
- `-o, --output-folder TEXT`: Output folder (default: "output_chunks")
- `--no-progress`: Disable progress bars
- `--help`: Show help message

### Examples

#### Basic Splitting
```bash
pdf-splitter document.pdf
```
**Output:** `document_1.pdf`, `document_2.pdf`, etc. in `output_chunks/`

#### Custom Page Count
```bash
pdf-splitter document.pdf -p 10
pdf-splitter document.pdf --pages-per-chunk 10
```

#### Custom Output Folder
```bash
pdf-splitter document.pdf -p 3 -o my_output
```

#### Individual Pages
```bash
pdf-splitter report.pdf -p 1
```
**Output:** `report_1.pdf`, `report_2.pdf`, etc. (one page each)
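
The chunking behaviour in the examples above can be pictured as a simple range calculation. The following is a minimal sketch, not the tool's actual code: `chunk_ranges` is a hypothetical helper name, and it assumes the last chunk simply holds whatever pages remain.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_pages: int, pages_per_chunk: int = 5) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) page indexes per chunk; start is inclusive, end is exclusive."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    for start in range(0, total_pages, pages_per_chunk):
        # The final chunk is shorter when total_pages is not a multiple of the chunk size.
        yield start, min(start + pages_per_chunk, total_pages)


if __name__ == "__main__":
    # A 23-page document split with the default of 5 pages per chunk.
    for number, (start, end) in enumerate(chunk_ranges(23), start=1):
        print(f"chunk {number}: pages {start + 1}-{end}")
```

With `-p 1` the same calculation yields one range per page, which matches the individual-page example.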

## 🎯 Progress Bars

The tool shows real-time progress as files are created:

```
Creating PDF files [████████████████████] 100% (8/8 files) 00:00:15
```

- **File-based progress**: Tracks each output file completion
- **ETA display**: Shows estimated time remaining
- **Percentage complete**: Visual progress indicator
- **Disable option**: Use `--no-progress` for scripting
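
Since `--no-progress` is intended for scripting, a wrapper script might assemble the command as in the sketch below. `build_command` and `run_splitter` are hypothetical helper names and not part of the package; only the `-p`, `-o`, and `--no-progress` options come from this README. The README does not document exit codes, so treating a nonzero return code as a failure is an assumption.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(input_pdf: str, pages_per_chunk: Optional[int] = None,
                  output_folder: Optional[str] = None) -> List[str]:
    """Build a pdf-splitter argument list using only the documented options."""
    command = ["pdf-splitter", input_pdf, "--no-progress"]
    if pages_per_chunk is not None:
        command += ["-p", str(pages_per_chunk)]
    if output_folder is not None:
        command += ["-o", output_folder]
    return command


def run_splitter(input_pdf: str, pages_per_chunk: Optional[int] = None,
                 output_folder: Optional[str] = None) -> int:
    """Run the CLI and return its exit code (assumed nonzero on failure)."""
    completed = subprocess.run(build_command(input_pdf, pages_per_chunk, output_folder))
    return completed.returncode


if __name__ == "__main__":
    # Only prints the command; call run_splitter() to actually execute it.
    print(shlex.join(build_command("document.pdf", pages_per_chunk=10, output_folder="my_chunks")))
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.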

## 🛠️ Advanced Features

### Large File Support
- **Memory-efficient processing** for multi-GB files
- **Automatic garbage collection** after each chunk
- **Error recovery** continues processing if individual pages fail
- **File size warnings** for files >100MB
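
One way to get the behaviour listed above with pypdf is to build and write one chunk at a time, then release it before starting the next. This is a rough sketch under stated assumptions, not the tool's implementation: `is_large_file` and `split_pdf` are hypothetical names, the 100 MB threshold is read as 100 MiB, and the warning text is invented for illustration.

```python
import gc
import os
from pathlib import Path

SIZE_WARNING_BYTES = 100 * 1024 * 1024  # the README's ">100MB" threshold, read as MiB here


def is_large_file(path, threshold: int = SIZE_WARNING_BYTES) -> bool:
    """Return True when the file is larger than the warning threshold."""
    return os.path.getsize(path) > threshold


def split_pdf(input_pdf, output_folder="output_chunks", pages_per_chunk=5):
    """Write one chunk at a time so only a single PdfWriter is alive at once."""
    # Imported here so is_large_file() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    source = Path(input_pdf)
    target = Path(output_folder)
    target.mkdir(parents=True, exist_ok=True)
    if is_large_file(source):
        print(f"Warning: {source.name} is larger than 100 MB; splitting may take a while.")

    reader = PdfReader(str(source))
    total_pages = len(reader.pages)
    written = []
    for number, start in enumerate(range(0, total_pages, pages_per_chunk), start=1):
        writer = PdfWriter()
        for index in range(start, min(start + pages_per_chunk, total_pages)):
            writer.add_page(reader.pages[index])
        chunk_path = target / f"{source.stem}_{number}.pdf"
        with open(chunk_path, "wb") as handle:
            writer.write(handle)
        written.append(chunk_path)
        del writer
        gc.collect()  # release the finished chunk before starting the next one
    return written
```

The sketch leaves out the per-page error recovery mentioned above; a fuller version would wrap `add_page` in a `try`/`except` and record the pages it skipped.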

### Fallback Methods
If the primary pypdf-based method fails, the tool automatically tries:
1. **pdftk** (if installed)
2. **qpdf** (if installed)
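
The README does not show how the fallback tools are invoked, so the following is only a sketch of what such a fallback could look like. The helper names are hypothetical; the command lines use the standard page-range syntax of pdftk (`cat ... output`) and qpdf (`--empty --pages ... --`), and `shutil.which` checks whether each tool is installed.

```python
import shutil
from typing import Callable, List, Optional


def pdftk_command(input_pdf: str, first: int, last: int, output_pdf: str) -> List[str]:
    """pdftk: copy pages first..last (1-based, inclusive) into output_pdf."""
    return ["pdftk", input_pdf, "cat", f"{first}-{last}", "output", output_pdf]


def qpdf_command(input_pdf: str, first: int, last: int, output_pdf: str) -> List[str]:
    """qpdf: same page range, added to an initially empty document."""
    return ["qpdf", "--empty", "--pages", input_pdf, f"{first}-{last}", "--", output_pdf]


# Tried in the order given in the list above: pdftk first, then qpdf.
FALLBACKS = [("pdftk", pdftk_command), ("qpdf", qpdf_command)]


def fallback_command(input_pdf: str, first: int, last: int, output_pdf: str,
                     which: Callable[[str], Optional[str]] = shutil.which) -> Optional[List[str]]:
    """Return the command for the first installed fallback tool, or None."""
    for executable, build in FALLBACKS:
        if which(executable) is not None:
            return build(input_pdf, first, last, output_pdf)
    return None
```

The resulting list can be passed to `subprocess.run`; a `None` result means neither tool was found on `PATH`.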

### Error Handling
- **Graceful degradation** for corrupted PDFs
- **Detailed error messages** with suggested solutions
- **Partial processing** continues even if some pages fail

## 📁 Output File Naming

Files are automatically named using the original filename:

| Input | Output |
|-------|--------|
| `document.pdf` | `document_1.pdf`, `document_2.pdf`, ... |
| `report.pdf` | `report_1.pdf`, `report_2.pdf`, ... |
| `/path/to/file.pdf` | `file_1.pdf`, `file_2.pdf`, ... |
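
The naming rule in the table can be reproduced with `pathlib`. This is a minimal sketch with hypothetical function names (`chunk_filename`, `chunk_path`); it assumes the directory part of the input path is dropped and the numbering starts at 1, as the table suggests.

```python
from pathlib import Path


def chunk_filename(input_pdf: str, number: int) -> str:
    """Return the chunk name for a 1-based chunk number, e.g. file_1.pdf."""
    return f"{Path(input_pdf).stem}_{number}.pdf"


def chunk_path(input_pdf: str, number: int, output_folder: str = "output_chunks") -> Path:
    """Place the chunk inside the output folder (default taken from the Options section)."""
    return Path(output_folder) / chunk_filename(input_pdf, number)


if __name__ == "__main__":
    for number in (1, 2):
        print(chunk_path("/path/to/file.pdf", number))
```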

## 🔧 Installation from Source

For development or latest features:

```bash
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
pip install -e .
```

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🤝 Contributing

Contributions welcome! Please feel free to submit a Pull Request.

## 🐛 Issues

Found a bug or have a feature request? Please open an issue on [GitHub](https://github.com/jmxt3/pdf-splitter/issues).

## 📊 Dependencies

- **click**: Modern CLI framework
- **pypdf**: PDF processing library

## 🏷️ Version History

- **0.1.0**: Initial release with progress bars and robust error handling

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pdf-splitter-cli",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Joao Machete <jmachete@gmail.com>",
    "keywords": "pdf, split, cli, command-line, pdf-tools, document-processing",
    "author": null,
    "author_email": "Joao Machete <jmachete@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/40/04/16f2bba20a97aeb7ce9ffef68d34518fe49582cf732af474eba50b4952d1/pdf_splitter_cli-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A modern command-line tool to split PDF files into smaller chunks with progress bars and automatic filename generation",
    "version": "0.1.1",
    "project_urls": {
        "Documentation": "https://github.com/jmxt3/pdf-splitter#readme",
        "Homepage": "https://github.com/jmxt3/pdf-splitter",
        "Issues": "https://github.com/jmxt3/pdf-splitter/issues",
        "Repository": "https://github.com/jmxt3/pdf-splitter.git"
    },
    "split_keywords": [
        "pdf",
        " split",
        " cli",
        " command-line",
        " pdf-tools",
        " document-processing"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0d5da7e47cb8e73467f5a084b57ede03d1bc5df33a2f578b5edf4e477543c7ad",
                "md5": "fdc407a353661b3c58d2428515b58707",
                "sha256": "f24cfe8368900db40e48a0c4a1749f84e6e08f2f8901c7c2901d885dcd0b2ed4"
            },
            "downloads": -1,
            "filename": "pdf_splitter_cli-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fdc407a353661b3c58d2428515b58707",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 8930,
            "upload_time": "2025-07-17T01:37:11",
            "upload_time_iso_8601": "2025-07-17T01:37:11.237528Z",
            "url": "https://files.pythonhosted.org/packages/0d/5d/a7e47cb8e73467f5a084b57ede03d1bc5df33a2f578b5edf4e477543c7ad/pdf_splitter_cli-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "400416f2bba20a97aeb7ce9ffef68d34518fe49582cf732af474eba50b4952d1",
                "md5": "3b6e213297552f44f56194f93c35f525",
                "sha256": "43ca62b3facd80bf7f70b46dc94b86f2b9ea1e07246bcc232e7f945e65edcd48"
            },
            "downloads": -1,
            "filename": "pdf_splitter_cli-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3b6e213297552f44f56194f93c35f525",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 9922,
            "upload_time": "2025-07-17T01:37:12",
            "upload_time_iso_8601": "2025-07-17T01:37:12.868345Z",
            "url": "https://files.pythonhosted.org/packages/40/04/16f2bba20a97aeb7ce9ffef68d34518fe49582cf732af474eba50b4952d1/pdf_splitter_cli-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-17 01:37:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jmxt3",
    "github_project": "pdf-splitter#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pdf-splitter-cli"
}
        