docsumm-ai


Name: docsumm-ai
Version: 0.1.0
Summary: Audience-aware document summarizer for PDF/DOCX/TXT, optimized for context retention, not token count.
Author: Rohit Rajdev
License: MIT
Requires Python: >=3.9
Keywords: ai, llm, documents, docx, pdf, summarization
Upload time: 2025-10-06 16:41:18

# docsumm-ai  
**One-line, opinionated document summarizer for PDFs, Word, or text — optimized for context retention, not token count.**

![CI](https://github.com/RohitRajdev/docsumm-ai/actions/workflows/ci.yml/badge.svg)
![License](https://img.shields.io/badge/License-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.9%20|%203.10%20|%203.11%20|%203.12-blue)
![Version](https://img.shields.io/badge/version-0.1.0-orange)

---

## Why docsumm-ai?

Summarizing long documents shouldn’t mean losing meaning.  
Most tools today **truncate context** just to fit into token limits — resulting in shallow, inaccurate summaries.

`docsumm-ai` was built to fix that.

We designed it for **researchers, analysts, and AI developers** who care about both **fidelity and efficiency**.  
It automatically adapts to document structure so that key insights from text, Word, or PDF files are retained, all with a single line of code.

---

## What Makes It Different

✅ **One-line summarize()** — clean summaries with context retention  
✅ **Handles PDFs, DOCX, TXT** — no format left behind  
✅ **Context-aware chunking** — semantic segmentation, not blind splitting  
✅ **Adaptive compression** — keeps the right level of detail per section  
✅ **CLI + Python API** — works both in scripts and terminal  
✅ **Transparent JSON + Markdown output** — reproducible and human-readable  
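
To illustrate the difference between blind splitting and structure-aware chunking mentioned above, here is a minimal sketch. It is not docsumm-ai's implementation, which is not shown on this page; the function names `blind_split` and `chunk_by_paragraph` are hypothetical, and paragraph boundaries (blank lines) stand in for whatever segmentation the package actually uses.

```python
import re


def blind_split(text: str, max_chars: int) -> list[str]:
    """Fixed-width splitting: may cut sentences and paragraphs in half."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def chunk_by_paragraph(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    A paragraph longer than max_chars is kept intact as its own chunk
    rather than being cut mid-sentence.
    """
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

With a 30-character limit, `blind_split` cuts the second paragraph of a short three-paragraph text in the middle of a word, while `chunk_by_paragraph` returns each paragraph whole.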

---

## Installation

```bash
pip install docsumm-ai
```

## Quickstart

1. Summarize a text file (Python)

```python
from docsumm_ai import summarize

summary = summarize("annual_report.txt", mode="concise")
print(summary)
```

2. Summarize a PDF (CLI)

```bash
docsumm summarize my_report.pdf --mode detailed --out summary.md
```
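
For more than one document, the one-line call can be wrapped in a small loop. The sketch below is an illustration, not part of the package: `summarize_folder` and `SUPPORTED_SUFFIXES` are hypothetical names, the summarizer is passed in as a callable so the helper does not depend on how `summarize` is implemented, and it assumes that `summarize` accepts a file path plus a `mode` keyword as in the example above. Whether the Python API accepts `mode="detailed"` as the CLI does is also an assumption.

```python
from pathlib import Path
from typing import Any, Callable, Dict

# Formats the README lists as supported.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".txt"}


def summarize_folder(
    folder: str,
    summarize_fn: Callable[..., Any],
    mode: str = "concise",
) -> Dict[str, Any]:
    """Apply summarize_fn to every supported document directly inside folder.

    Returns a mapping from file name to whatever summarize_fn returns.
    """
    results: Dict[str, Any] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            results[path.name] = summarize_fn(str(path), mode=mode)
    return results


# Hypothetical usage with the real package installed:
# from docsumm_ai import summarize
# summaries = summarize_folder("reports/", summarize, mode="concise")
```

Passing the summarizer as an argument also makes the helper easy to test with a stand-in function.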

## Output Example

Input:

“The study explores the correlation between urban growth and environmental impact across 32 global cities…”

Output:

“Analyzes 32 cities showing urban expansion drives higher emissions; highlights need for adaptive policies.”

---

## License

MIT License © 2025 Rohit Rajdev
Open for community collaboration and research integration.

## Links

- 🔗 GitHub: https://github.com/RohitRajdev/docsumm-ai
- ✉️ Contact: rohitrajdev.com
- 🧠 Related project: dataprep-ai

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "docsumm-ai",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "AI, LLM, documents, docx, pdf, summarization",
    "author": "Rohit Rajdev",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/8c/07/083b5528828e941f8a382d26860c329fb838f1f6a921485813f54cb82fcb/docsumm_ai-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Audience-aware document summarizer for PDF/DOCX/TXT \u2014 optimized for context retention, not token count.",
    "version": "0.1.0",
    "project_urls": null,
    "split_keywords": [
        "ai",
        " llm",
        " documents",
        " docx",
        " pdf",
        " summarization"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ba281401f16c35a1d9a7cce7341a123a0e30eab7956b52f76e543c50cef32497",
                "md5": "74169186513f608c14722f462be282de",
                "sha256": "5469de08b76e652ec9b5ee26e05f8589bfd4379e0ef5d738ae39c59ff0feeaaa"
            },
            "downloads": -1,
            "filename": "docsumm_ai-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "74169186513f608c14722f462be282de",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 5287,
            "upload_time": "2025-10-06T16:41:17",
            "upload_time_iso_8601": "2025-10-06T16:41:17.330654Z",
            "url": "https://files.pythonhosted.org/packages/ba/28/1401f16c35a1d9a7cce7341a123a0e30eab7956b52f76e543c50cef32497/docsumm_ai-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8c07083b5528828e941f8a382d26860c329fb838f1f6a921485813f54cb82fcb",
                "md5": "57188e5ce8de5d05e3c7ea7345186c29",
                "sha256": "f6c05439695b95057bfd77df176c2398be234d9ef396b7d2b805154e07557104"
            },
            "downloads": -1,
            "filename": "docsumm_ai-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "57188e5ce8de5d05e3c7ea7345186c29",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 6539,
            "upload_time": "2025-10-06T16:41:18",
            "upload_time_iso_8601": "2025-10-06T16:41:18.539074Z",
            "url": "https://files.pythonhosted.org/packages/8c/07/083b5528828e941f8a382d26860c329fb838f1f6a921485813f54cb82fcb/docsumm_ai-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-06 16:41:18",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "docsumm-ai"
}
        