# Excel Extractor (xlwings)
A Python library that extracts Excel formulas, data validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. It provides a simple Python API and command-line tools.
## Install
- Requirements (xlwings engine): Windows, Microsoft Excel, Python 3.8+
- Install the dependency in your project: `pip install xlwings`
- This repo includes a ready-to-package library under `excel_extractor/`.
- Install the published package from PyPI: `pip install excel-extractor-v1`
### Cross-platform (openpyxl engine)
- On macOS/Linux or Windows without Excel, use the `openpyxl` engine:
  - CLI: add `--engine openpyxl`
  - Programmatic: use `from excel_extractor import OpenpyxlExcelExtractor`
  - Note: display text and live calc values are limited because openpyxl does not evaluate formulas.
## CLI
- Basic (all formulas on active sheet):
```bash
python -m excel_extractor "Workbook.xlsx" # defaults to xlwings on Windows, openpyxl elsewhere
```
- Force engine:
```bash
python -m excel_extractor "Workbook.xlsx" --engine openpyxl
python -m excel_extractor "Workbook.xlsx" --engine xlwings
```
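The default described above (xlwings on Windows, openpyxl elsewhere, unless `--engine` is given) can be sketched roughly as follows. This is only an illustration of the rule, not the package's actual code; `pick_engine` and its `system` parameter are hypothetical names.

```python
import platform
from typing import Optional

VALID_ENGINES = ("xlwings", "openpyxl")


def pick_engine(requested: Optional[str] = None, system: Optional[str] = None) -> str:
    """Return the engine name to use (hypothetical helper, not part of the package)."""
    if requested is not None:
        # An explicit choice (like --engine) always wins, if it is a known engine.
        if requested not in VALID_ENGINES:
            raise ValueError(f"unknown engine: {requested!r}")
        return requested
    # No explicit choice: fall back to the platform-based default.
    system = system or platform.system()
    return "xlwings" if system == "Windows" else "openpyxl"
```

The `system` argument exists only so the rule can be exercised without switching machines, for example `pick_engine(system="Linux")`.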
- Specific worksheet:
```bash
python -m excel_extractor "Workbook.xlsx" --sheet "Sheet1"
```
- Range only:
```bash
python -m excel_extractor "Workbook.xlsx" --range "A1:D10"
```
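When calling the Python API instead of the CLI, a range string such as `"A1:D10"` has to be split into the `start_cell` and `end_cell` arguments that `extract_formulas_from_range` takes. A minimal sketch of that split, assuming plain A1-style addresses without sheet prefixes (`split_range` is a hypothetical helper, not something the package is known to export):

```python
import re
from typing import Optional, Tuple

# One A1-style cell address, optionally with absolute markers ($A$1).
_CELL = re.compile(r"^\$?[A-Za-z]{1,3}\$?[0-9]+$")


def split_range(range_text: str) -> Tuple[str, Optional[str]]:
    """Split "A1:D10" into ("A1", "D10"); a single cell gives (cell, None)."""
    parts = range_text.strip().split(":")
    if len(parts) > 2 or not all(_CELL.match(part) for part in parts):
        raise ValueError(f"not an A1-style cell or range: {range_text!r}")
    # Normalise to upper case without the absolute markers.
    cells = [part.replace("$", "").upper() for part in parts]
    return cells[0], (cells[1] if len(cells) == 2 else None)
```

The resulting pair can then be passed on as `extractor.extract_formulas_from_range(start, end)`.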
- Formula dependencies for a cell:
```bash
python -m excel_extractor "Workbook.xlsx" --dependencies "B5"
```
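To give an idea of what a dependency lookup involves, the sketch below pulls cell and range references out of a formula string with a regular expression. It is an independent illustration, not how `--dependencies` is implemented in this package; `find_references` is a hypothetical name, and the pattern only handles upper-case A1-style references (no named ranges, structured table references, or R1C1 notation).

```python
import re
from typing import List

_REF = re.compile(
    r"(?<![A-Za-z0-9_.])"                          # not in the middle of a name
    r"(?:(?:'[^']+'|[A-Za-z_][A-Za-z0-9_.]*)!)?"   # optional sheet prefix
    r"\$?[A-Z]{1,3}\$?[0-9]+"                      # first cell, e.g. A1 or $C$7
    r"(?::\$?[A-Z]{1,3}\$?[0-9]+)?"                # optional ":end" cell
    r"(?![A-Za-z0-9_(])"                           # skips function names like LOG10(
)


def find_references(formula: str) -> List[str]:
    """Return the distinct references in a formula, in order of appearance."""
    # Blank out string literals so text such as "Z9" is not mistaken for a cell.
    without_strings = re.sub(r'"[^"]*"', '""', formula)
    found: List[str] = []
    for match in _REF.finditer(without_strings):
        reference = match.group(0)
        if reference not in found:
            found.append(reference)
    return found
```

For a formula such as `=SUM(A1:A3)+Sheet2!B5`, this returns the range and the sheet-qualified cell as separate entries.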
- Full details (formatting, validations, hyperlinks, notes):
```bash
python -m excel_extractor "Workbook.xlsx" --full
```
- Full details for all sheets:
```bash
python -m excel_extractor "Workbook.xlsx" --full --all-sheets
```
- Text output instead of JSON:
```bash
python -m excel_extractor "Workbook.xlsx" --format text
```
- Convert a previously generated `*_full_details.json` into per-sheet CSV/JSON files plus an index:
```bash
python -m excel_extractor.convert_excel_json "Workbook_full_details.json" --out exports --ndjson
```
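As a rough picture of what the `--ndjson` option produces, the sketch below writes one newline-delimited JSON file per sheet. The input layout used here (a top-level `"sheets"` mapping whose values hold a `"cells"` list) is an assumption made for illustration; the real `*_full_details.json` structure may differ, and `write_ndjson_per_sheet` is a hypothetical function, not the package's converter.

```python
import json
from pathlib import Path
from typing import Dict


def write_ndjson_per_sheet(full_details: dict, output_dir: Path) -> Dict[str, Path]:
    """Write <sheet>.ndjson files and return a sheet-name -> path index."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written: Dict[str, Path] = {}
    # Assumed layout: {"sheets": {"Sheet1": {"cells": [ {...}, ... ]}}}
    for sheet_name, sheet in full_details.get("sheets", {}).items():
        target = output_dir / f"{sheet_name}.ndjson"
        with target.open("w", encoding="utf-8") as handle:
            for cell in sheet.get("cells", []):
                # One JSON object per line is what makes the file NDJSON.
                handle.write(json.dumps(cell, ensure_ascii=False) + "\n")
        written[sheet_name] = target
    return written
```

Files in this form can be streamed line by line, which helps when a sheet has many cells.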
## Python API
```python
from excel_extractor import ExcelFormulaExtractor, OpenpyxlExcelExtractor

# 1) Windows + Excel (xlwings)
with ExcelFormulaExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")

# 2) Cross-platform (openpyxl)
with OpenpyxlExcelExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")
```
## Public API (summary)
- Class `ExcelFormulaExtractor(excel_file_path: str)` (xlwings)
- Class `OpenpyxlExcelExtractor(excel_file_path: str)` (openpyxl)
  - Context manager: opens/quits workbook automatically
  - `get_worksheet_info(sheet_name: Optional[str]) -> dict`
  - `extract_formulas_from_range(start_cell: str, end_cell: Optional[str]) -> list[dict]`
  - `extract_all_formulas(sheet_name: Optional[str]) -> dict`
  - `extract_sheet_full_details(sheet_name: Optional[str]) -> dict`
  - `extract_workbook_full_details() -> dict`
  - `extract_formula_dependencies(cell_address: str) -> dict`
  - `export_to_json(data: dict, output_file: str) -> bool`
  - `export_to_text(data: dict, output_file: str) -> bool`
- Console scripts (after packaging):
  - `excel-extractor` → same as `python -m excel_extractor`
  - `excel-extractor-convert` → same as `python -m excel_extractor.convert_excel_json`
- Programmatic converter (optional):
  - `excel_extractor.tools.convert_full_details_json(input_json: Path, output_dir: Path, make_ndjson: bool) -> None`
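
Because both extractor classes are listed with the same method names, calling code can be written once against either engine. The helper below is a small sketch under that assumption: it relies only on `extract_sheet_full_details` and `export_to_json` as summarized above, and `export_sheets` plus the output file naming are hypothetical.

```python
from typing import Any, List


def export_sheets(extractor: Any, sheet_names: List[str], output_prefix: str) -> List[str]:
    """Export full details per sheet; return the files reported as written."""
    written: List[str] = []
    for name in sheet_names:
        data = extractor.extract_sheet_full_details(name)
        output_file = f"{output_prefix}_{name}.json"
        # export_to_json is documented to return a bool, taken here as success.
        if extractor.export_to_json(data, output_file):
            written.append(output_file)
    return written


# Intended use with either engine (not executed here):
# with OpenpyxlExcelExtractor("Workbook.xlsx") as extractor:
#     export_sheets(extractor, ["Sheet1"], "Workbook")
```

Passing the extractor in as an argument keeps the engine choice in one place and lets the helper be exercised with a stand-in object.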
## Notes
- Full-detail extraction returns, per cell: value, formula, display text (xlwings only), basic formatting (number format, font, alignment), fill color, hyperlink, note/comment, data validation (including resolved list items when possible), and merge info.
- xlwings automation requires a local Excel installation. To watch Excel while debugging, change the call that creates `xw.App(visible=False)` so that it passes `visible=True`.
## License
MIT License. See `LICENSE`.
## Raw data
{
"_id": null,
"home_page": null,
"name": "excel-extractor-v1",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "excel, xlwings, openpyxl, formulas, metadata, data validation, hyperlinks, notes",
"author": "Ameer Techsoft",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/22/18/09f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a/excel_extractor_v1-0.1.0.tar.gz",
"platform": null,
"description": "# Excel Extractor (xlwings)\r\n\r\nA Python library to extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings. Provides a simple Python API and command-line tools.\r\n\r\n## Install\r\n\r\n- Requirements: Windows, Microsoft Excel, Python 3.8+\r\n- Install deps in your project: `pip install xlwings`\r\n- This repo includes a ready-to-package library under `excel_extractor/`.\r\n- After publishing: `pip install excel-extractor-v1`\r\n\r\n### Cross-platform (openpyxl engine)\r\n\r\n- On macOS/Linux or Windows without Excel, use the `openpyxl` engine:\r\n - CLI: add `--engine openpyxl`\r\n - Programmatic: use `from excel_extractor import OpenpyxlExcelExtractor`\r\n - Note: display text and live calc values are limited because openpyxl does not evaluate formulas.\r\n\r\n## CLI\r\n\r\n- Basic (all formulas on active sheet):\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" # defaults to xlwings on Windows, openpyxl elsewhere\r\n```\r\n- Force engine:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --engine openpyxl\r\npython -m excel_extractor \"Workbook.xlsx\" --engine xlwings\r\n```\r\n- Specific worksheet:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --sheet \"Sheet1\"\r\n```\r\n- Range only:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --range \"A1:D10\"\r\n```\r\n- Formula dependencies for a cell:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --dependencies \"B5\"\r\n```\r\n- Full details (formatting, validations, hyperlinks, notes):\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --full\r\n```\r\n- Full details for all sheets:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --full --all-sheets\r\n```\r\n- Text output instead of JSON:\r\n```bash\r\npython -m excel_extractor \"Workbook.xlsx\" --format text\r\n```\r\n\r\n- Convert a previously generated `*_full_details.json` into per-sheet CSV/JSON and index:\r\n```bash\r\npython -m 
excel_extractor.convert_excel_json \"Workbook_full_details.json\" --out exports --ndjson\r\n```\r\n\r\n## Python API\r\n\r\n```python\r\nfrom excel_extractor import ExcelFormulaExtractor, OpenpyxlExcelExtractor\r\n\r\n# 1) Windows + Excel (xlwings)\r\nwith ExcelFormulaExtractor(\"Workbook.xlsx\") as extractor:\r\n data = extractor.extract_sheet_full_details(\"Sheet1\")\r\n\r\n# 2) Cross-platform (openpyxl)\r\nwith OpenpyxlExcelExtractor(\"Workbook.xlsx\") as extractor:\r\n data = extractor.extract_sheet_full_details(\"Sheet1\")\r\n```\r\n\r\n## Public API (summary)\r\n\r\n- Class `ExcelFormulaExtractor(excel_file_path: str)` (xlwings)\r\n- Class `OpenpyxlExcelExtractor(excel_file_path: str)` (openpyxl)\r\n - Context manager: opens/quits workbook automatically\r\n - `get_worksheet_info(sheet_name: Optional[str]) -> dict`\r\n - `extract_formulas_from_range(start_cell: str, end_cell: Optional[str]) -> list[dict]`\r\n - `extract_all_formulas(sheet_name: Optional[str]) -> dict`\r\n - `extract_sheet_full_details(sheet_name: Optional[str]) -> dict`\r\n - `extract_workbook_full_details() -> dict`\r\n - `extract_formula_dependencies(cell_address: str) -> dict`\r\n - `export_to_json(data: dict, output_file: str) -> bool`\r\n - `export_to_text(data: dict, output_file: str) -> bool`\r\n\r\n- Console scripts (after packaging):\r\n - `excel-extractor` \u2192 same as `python -m excel_extractor`\r\n - `excel-extractor-convert` \u2192 same as `python -m excel_extractor.convert_excel_json`\r\n\r\n- Programmatic converter (optional):\r\n - `excel_extractor.tools.convert_full_details_json(input_json: Path, output_dir: Path, make_ndjson: bool) -> None`\r\n\r\n## Notes\r\n\r\n- Full-detail extraction returns, per cell: value, formula, display text (xlwings only), basic formatting (number format, font, alignment), fill color, hyperlink, note/comment, data validation (including resolved list items when possible), and merge info.\r\n- xlwings automation requires local Excel. 
Set the app visible for debugging by editing the code path that creates `xw.App(visible=False)`.\r\n\r\n## License\r\n\r\nMIT License. See `LICENSE`. \r\n",
"bugtrack_url": null,
"license": null,
"summary": "Extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. CLI and Python API.",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/AmeerTechsoft/excel-extractor",
"Issues": "https://github.com/AmeerTechsoft/excel-extractor",
"Repository": "https://github.com/AmeerTechsoft/excel-extractor"
},
"split_keywords": [
"excel",
" xlwings",
" openpyxl",
" formulas",
" metadata",
" data validation",
" hyperlinks",
" notes"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "02bf91e3fde53bbe57dcc00553ad59a13a9649c1c2f1472d2ce402dc7b1f0047",
"md5": "245d863bf2dee2800addcef01cc843a8",
"sha256": "22d37b8f7ca2c3721770bd602afd87b5ef505ad89f470711da67b6cd6c106da4"
},
"downloads": -1,
"filename": "excel_extractor_v1-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "245d863bf2dee2800addcef01cc843a8",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 3981,
"upload_time": "2025-08-20T14:42:28",
"upload_time_iso_8601": "2025-08-20T14:42:28.697111Z",
"url": "https://files.pythonhosted.org/packages/02/bf/91e3fde53bbe57dcc00553ad59a13a9649c1c2f1472d2ce402dc7b1f0047/excel_extractor_v1-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "221809f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a",
"md5": "c65e625521d94dca1af0eaad3154cff6",
"sha256": "0f6e21ccdbcc8e4d19e708100b1ef424bffbf818a63df62d64f371241aac65be"
},
"downloads": -1,
"filename": "excel_extractor_v1-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "c65e625521d94dca1af0eaad3154cff6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 3835,
"upload_time": "2025-08-20T14:42:30",
"upload_time_iso_8601": "2025-08-20T14:42:30.395751Z",
"url": "https://files.pythonhosted.org/packages/22/18/09f8a84e94a57b802158b78a5a1c80e3ff6e58acddb9e7d6be348effaa6a/excel_extractor_v1-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-20 14:42:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AmeerTechsoft",
"github_project": "excel-extractor",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "excel-extractor-v1"
}