excel-extractor-v1


Name: excel-extractor-v1
Version: 0.1.0
Summary: Extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. CLI and Python API.
Upload time: 2025-08-20 14:42:30
Author: Ameer Techsoft
Requires Python: >=3.8
Keywords: excel, xlwings, openpyxl, formulas, metadata, data validation, hyperlinks, notes
Requirements: No requirements were recorded.

# Excel Extractor (xlwings)

A Python library to extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. It provides a simple Python API and command-line tools.

## Install

- Requirements for the xlwings engine: Windows, Microsoft Excel, Python 3.8+
- Install the engine dependency in your project: `pip install xlwings`
- This repo includes a ready-to-package library under `excel_extractor/`.
- Install from PyPI: `pip install excel-extractor-v1`

### Cross-platform (openpyxl engine)

- On macOS/Linux or Windows without Excel, use the `openpyxl` engine:
  - CLI: add `--engine openpyxl`
  - Programmatic: use `from excel_extractor import OpenpyxlExcelExtractor`
  - Note: display text and live calc values are limited because openpyxl does not evaluate formulas.
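
The engine choice above can be sketched as a small helper. It mirrors the documented default (xlwings on Windows, openpyxl elsewhere). The name `default_engine` is hypothetical and not part of the package, and the helper does not check whether Excel is actually installed.

```python
import platform
from typing import Optional


def default_engine(system: Optional[str] = None) -> str:
    """Return "xlwings" on Windows and "openpyxl" elsewhere.

    Hypothetical helper: it only mirrors the default described in the
    README and does not verify that Excel is present.
    """
    # platform.system() returns "Windows", "Linux", or "Darwin" (macOS).
    name = system if system is not None else platform.system()
    return "xlwings" if name == "Windows" else "openpyxl"


if __name__ == "__main__":
    print(default_engine())
```

On a Windows machine without Excel this helper would still return `"xlwings"`, so a caller in that situation would pass `--engine openpyxl` explicitly.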

## CLI

- Basic (all formulas on active sheet):
```bash
python -m excel_extractor "Workbook.xlsx"  # defaults to xlwings on Windows, openpyxl elsewhere
```
- Force engine:
```bash
python -m excel_extractor "Workbook.xlsx" --engine openpyxl
python -m excel_extractor "Workbook.xlsx" --engine xlwings
```
- Specific worksheet:
```bash
python -m excel_extractor "Workbook.xlsx" --sheet "Sheet1"
```
- Range only:
```bash
python -m excel_extractor "Workbook.xlsx" --range "A1:D10"
```
- Formula dependencies for a cell:
```bash
python -m excel_extractor "Workbook.xlsx" --dependencies "B5"
```
- Full details (formatting, validations, hyperlinks, notes):
```bash
python -m excel_extractor "Workbook.xlsx" --full
```
- Full details for all sheets:
```bash
python -m excel_extractor "Workbook.xlsx" --full --all-sheets
```
- Text output instead of JSON:
```bash
python -m excel_extractor "Workbook.xlsx" --format text
```

- Convert a previously generated `*_full_details.json` into per-sheet CSV/JSON and index:
```bash
python -m excel_extractor.convert_excel_json "Workbook_full_details.json" --out exports --ndjson
```
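
When the CLI is driven from another script, the flags listed above can be assembled into an argument list. The sketch below uses only the flags shown in this section. `build_command` and its parameter names are assumptions for illustration, and the README does not say whether every combination of flags is accepted.

```python
import sys
from typing import List, Optional


def build_command(
    workbook: str,
    engine: Optional[str] = None,
    sheet: Optional[str] = None,
    cell_range: Optional[str] = None,
    dependencies: Optional[str] = None,
    full: bool = False,
    all_sheets: bool = False,
    output_format: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `python -m excel_extractor` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "excel_extractor", workbook]
    if engine:
        cmd += ["--engine", engine]            # "xlwings" or "openpyxl"
    if sheet:
        cmd += ["--sheet", sheet]
    if cell_range:
        cmd += ["--range", cell_range]         # e.g. "A1:D10"
    if dependencies:
        cmd += ["--dependencies", dependencies]  # e.g. "B5"
    if full:
        cmd.append("--full")
    if all_sheets:
        cmd.append("--all-sheets")
    if output_format:
        cmd += ["--format", output_format]     # e.g. "text"
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with workbook paths that contain spaces.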

## Python API

```python
from excel_extractor import ExcelFormulaExtractor, OpenpyxlExcelExtractor

# 1) Windows + Excel (xlwings)
with ExcelFormulaExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")

# 2) Cross-platform (openpyxl)
with OpenpyxlExcelExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")
```
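
Because both classes are used the same way in the example above, the engine can be selected by name at run time. This is a minimal sketch: `extract_sheet` is a hypothetical wrapper, and it assumes both class names are importable from `excel_extractor` as the import line above suggests.

```python
from typing import Any, Dict, Optional

# Class names taken from the README; the mapping itself is illustrative.
_ENGINE_CLASSES = {
    "xlwings": "ExcelFormulaExtractor",
    "openpyxl": "OpenpyxlExcelExtractor",
}


def extract_sheet(
    path: str, sheet: Optional[str] = None, engine: str = "openpyxl"
) -> Dict[str, Any]:
    """Run a full-detail extraction for one sheet with the chosen engine."""
    if engine not in _ENGINE_CLASSES:
        raise ValueError(f"unknown engine: {engine!r}")
    # Imported lazily so this module loads even if the package is absent.
    import excel_extractor

    extractor_cls = getattr(excel_extractor, _ENGINE_CLASSES[engine])
    # The context manager opens and closes the workbook.
    with extractor_cls(path) as extractor:
        return extractor.extract_sheet_full_details(sheet)
```

Validating the engine name before the import keeps the error message clear when a caller mistypes the engine.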

## Public API (summary)

- Class `ExcelFormulaExtractor(excel_file_path: str)` (xlwings)
- Class `OpenpyxlExcelExtractor(excel_file_path: str)` (openpyxl)
  - Context manager: opens/quits workbook automatically
  - `get_worksheet_info(sheet_name: Optional[str]) -> dict`
  - `extract_formulas_from_range(start_cell: str, end_cell: Optional[str]) -> list[dict]`
  - `extract_all_formulas(sheet_name: Optional[str]) -> dict`
  - `extract_sheet_full_details(sheet_name: Optional[str]) -> dict`
  - `extract_workbook_full_details() -> dict`
  - `extract_formula_dependencies(cell_address: str) -> dict`
  - `export_to_json(data: dict, output_file: str) -> bool`
  - `export_to_text(data: dict, output_file: str) -> bool`

- Console scripts (after packaging):
  - `excel-extractor` → same as `python -m excel_extractor`
  - `excel-extractor-convert` → same as `python -m excel_extractor.convert_excel_json`

- Programmatic converter (optional):
  - `excel_extractor.tools.convert_full_details_json(input_json: Path, output_dir: Path, make_ndjson: bool) -> None`
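
To illustrate what a formula dependency is, the sketch below pulls cell and range references out of a formula string with a regular expression. It is not the library's implementation of `extract_formula_dependencies`, whose return structure is not documented here. `find_references` is a hypothetical name, and the pattern is deliberately simple: it ignores named ranges, structured table references, and references that appear inside string literals.

```python
import re
from typing import List

_REF = re.compile(
    r"(?<![A-Za-z0-9_.])"                          # do not start inside a name
    r"(?:(?:'[^']+'|[A-Za-z_][A-Za-z0-9_.]*)!)?"   # optional sheet prefix
    r"\$?[A-Z]{1,3}\$?[0-9]+"                      # cell such as A1 or $B$5
    r"(?::\$?[A-Z]{1,3}\$?[0-9]+)?"                # optional range end
    r"(?![A-Za-z0-9_(])"                           # not a name or a function call
)


def find_references(formula: str) -> List[str]:
    """Return the distinct references in a formula, in order of appearance."""
    seen: List[str] = []
    for match in _REF.finditer(formula):
        ref = match.group(0)
        if ref not in seen:
            seen.append(ref)
    return seen


if __name__ == "__main__":
    print(find_references("=SUM(A1:B2)+Sheet2!C3*$D$4"))
```

The trailing lookahead keeps function names such as `LOG10(` from being mistaken for a cell in column `LOG`.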

## Notes

- Full-detail extraction returns, per cell: value, formula, display text (xlwings only), basic formatting (number format, font, alignment), fill color, hyperlink, note/comment, data validation (including resolved list items when possible), and merge info.
- xlwings automation requires a local Excel installation. To watch Excel while debugging, edit the code path that creates `xw.App(visible=False)` and set `visible=True`.
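
The per-cell fields listed in the first note can be pictured as a record type. The sketch below is an assumption for illustration: the field names, the `CellDetails` class, and the `to_ndjson` helper are hypothetical, font and alignment are omitted for brevity, and the actual JSON layout produced by the library is not documented in this README.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class CellDetails:
    """Hypothetical per-cell record; field names are illustrative only."""

    address: str
    value: Any = None
    formula: Optional[str] = None
    display_text: Optional[str] = None   # only the xlwings engine fills this
    number_format: Optional[str] = None
    fill_color: Optional[str] = None
    hyperlink: Optional[str] = None
    note: Optional[str] = None
    validation: Optional[Dict[str, Any]] = None
    merged_range: Optional[str] = None


def to_ndjson(cells: Iterable[CellDetails]) -> str:
    """Serialize records as newline-delimited JSON, one cell per line."""
    return "\n".join(json.dumps(asdict(cell), sort_keys=True) for cell in cells)


if __name__ == "__main__":
    print(to_ndjson([CellDetails("A1", value=3, formula="=1+2")]))
```

Newline-delimited JSON keeps one record per line, so large workbooks can be processed as a stream without loading a whole document into memory.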

## License

MIT License. See `LICENSE`. 

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "excel-extractor-v1",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "excel, xlwings, openpyxl, formulas, metadata, data validation, hyperlinks, notes",
    "author": "Ameer Techsoft",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/22/18/09f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a/excel_extractor_v1-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. CLI and Python API.",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/AmeerTechsoft/excel-extractor",
        "Issues": "https://github.com/AmeerTechsoft/excel-extractor",
        "Repository": "https://github.com/AmeerTechsoft/excel-extractor"
    },
    "split_keywords": [
        "excel",
        " xlwings",
        " openpyxl",
        " formulas",
        " metadata",
        " data validation",
        " hyperlinks",
        " notes"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "02bf91e3fde53bbe57dcc00553ad59a13a9649c1c2f1472d2ce402dc7b1f0047",
                "md5": "245d863bf2dee2800addcef01cc843a8",
                "sha256": "22d37b8f7ca2c3721770bd602afd87b5ef505ad89f470711da67b6cd6c106da4"
            },
            "downloads": -1,
            "filename": "excel_extractor_v1-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "245d863bf2dee2800addcef01cc843a8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 3981,
            "upload_time": "2025-08-20T14:42:28",
            "upload_time_iso_8601": "2025-08-20T14:42:28.697111Z",
            "url": "https://files.pythonhosted.org/packages/02/bf/91e3fde53bbe57dcc00553ad59a13a9649c1c2f1472d2ce402dc7b1f0047/excel_extractor_v1-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "221809f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a",
                "md5": "c65e625521d94dca1af0eaad3154cff6",
                "sha256": "0f6e21ccdbcc8e4d19e708100b1ef424bffbf818a63df62d64f371241aac65be"
            },
            "downloads": -1,
            "filename": "excel_extractor_v1-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "c65e625521d94dca1af0eaad3154cff6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 3835,
            "upload_time": "2025-08-20T14:42:30",
            "upload_time_iso_8601": "2025-08-20T14:42:30.395751Z",
            "url": "https://files.pythonhosted.org/packages/22/18/09f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a/excel_extractor_v1-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-20 14:42:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AmeerTechsoft",
    "github_project": "excel-extractor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "excel-extractor-v1"
}
        