# Bible XML Parser

A Python package for parsing Bible texts in various XML formats (USFX, OSIS, ZEFANIA). This package provides both direct parsing and database-backed approaches for handling Bible data in your Python applications.
## Features
- 📖 Parse Bible texts in multiple formats (USFX, OSIS, ZEFANIA)
- 🔍 Automatic format detection
- 🚀 Memory-efficient streaming XML parsing using defusedxml
- 🗄️ SQLite database caching for improved performance
- 🔎 Full-text search functionality (FTS5)
- 🔒 Secure XML parsing (protected against XXE attacks)
- 📝 Type hints throughout for better IDE support
- 🐍 Python 3.8+ support
## Installation
```bash
pip install bible-xml-parser
```
### Development Installation
```bash
git clone https://github.com/Omarzintan/bible_parser_python.git
cd bible_parser_python
pip install -e ".[dev]"
```
## Quick Start
### Direct Parsing Approach
Parse a Bible file directly without database caching:
```python
from bible_parser import BibleParser

# Parse from file (format auto-detected)
parser = BibleParser('path/to/bible.xml')

# Or parse from a string with an explicit format
with open('bible.xml', encoding='utf-8') as f:
    xml_content = f.read()
parser = BibleParser.from_string(xml_content, format='USFX')

# Iterate over books
for book in parser.books:
    print(f"{book.title} ({book.id})")
    print(f"  Chapters: {len(book.chapters)}")
    print(f"  Verses: {len(book.verses)}")

# Or iterate over verses directly
for verse in parser.verses:
    print(f"{verse.book_id} {verse.chapter_num}:{verse.num} - {verse.text}")
```
### Database Approach (Recommended for Production)
For repeated lookups, cache the parsed text in a SQLite database so the XML is parsed only once:
```python
from bible_parser import BibleRepository

# Create repository
repo = BibleRepository(xml_path='path/to/bible.xml', format='USFX')

# Initialize database (only needed once)
repo.initialize('my_bible.db')

# Get all books
books = repo.get_books()
for book in books:
    print(f"{book.title} ({book.id})")

# Get verses from a specific chapter
verses = repo.get_verses('gen', 1)  # Genesis chapter 1
for verse in verses:
    print(f"{verse.num}. {verse.text}")

# Get a specific verse
verse = repo.get_verse('jhn', 3, 16)  # John 3:16
if verse:
    print(verse.text)

# Search for verses containing specific text
results = repo.search_verses('love')
print(f"Found {len(results)} verses containing 'love'")

# Don't forget to close
repo.close()
```
### Using Context Manager
```python
from bible_parser import BibleRepository

with BibleRepository(xml_path='bible.xml') as repo:
    repo.initialize('my_bible.db')

    # Use the repository
    verses = repo.get_verses('mat', 5)  # Matthew chapter 5
    for verse in verses:
        print(f"{verse.num}. {verse.text}")

    # Search
    results = repo.search_verses('faith hope love')
    for verse in results:
        print(f"{verse.book_id} {verse.chapter_num}:{verse.num}")

# Database automatically closed
```
## Supported Formats
### USFX (Unified Standard Format XML)
```xml
<usfx>
  <book id="gen">
    <c id="1"/>
    <v id="1">In the beginning...</v>
  </book>
</usfx>
```
### OSIS (Open Scripture Information Standard)
```xml
<osis>
  <osisText>
    <div type="book" osisID="Gen">
      <verse osisID="Gen.1.1">In the beginning...</verse>
    </div>
  </osisText>
</osis>
```
### Zefania XML
```xml
<XMLBIBLE>
  <BIBLEBOOK bnumber="1" bname="Genesis">
    <CHAPTER cnumber="1">
      <VERS vnumber="1">In the beginning...</VERS>
    </CHAPTER>
  </BIBLEBOOK>
</XMLBIBLE>
```
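
The three root elements shown above (`usfx`, `osis`, `XMLBIBLE`) are enough to tell the formats apart. The following is a minimal sketch of how format detection could work, not the package's actual implementation. The function name `detect_format` and the mapping `ROOT_TAG_TO_FORMAT` are hypothetical, and the sketch uses the standard library's `xml.etree.ElementTree` for brevity, whereas the package itself parses with `defusedxml`.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from lower-cased root element name to format label.
ROOT_TAG_TO_FORMAT = {
    "usfx": "USFX",
    "osis": "OSIS",
    "xmlbible": "ZEFANIA",
}


def detect_format(xml_text: str) -> str:
    """Guess the Bible XML format from the document's root element."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    # The first "start" event always belongs to the root element.
    _event, root = next(ET.iterparse(source, events=("start",)))
    # Drop a namespace prefix such as "{http://...}osis" if present.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    detected = ROOT_TAG_TO_FORMAT.get(local_name)
    if detected is None:
        raise ValueError(f"Unrecognized root element: {root.tag}")
    return detected


print(detect_format('<usfx><book id="gen"/></usfx>'))
```

Only the root tag is inspected, so the check stays cheap even for large files. Untrusted input should go through a hardened parser rather than the plain standard-library one used here.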
## API Reference
### BibleParser
Main parser class with automatic format detection.
**Methods and properties:**
- `__init__(source, format=None)` - Initialize parser
- `from_string(xml_content, format=None)` - Create from XML string
- `books` - Property that yields Book objects
- `verses` - Property that yields Verse objects
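
To illustrate what a property that "yields" verses can look like under the hood, here is a rough sketch of a streaming generator over Zefania XML. It is an assumption about the general technique, not the package's code: `iter_zefania_verses` is a hypothetical helper, it returns plain tuples instead of `Verse` objects, and it uses the standard library's `iterparse` where the package uses `defusedxml`.

```python
import io
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple


def iter_zefania_verses(source) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (book_name, chapter_num, verse_num, text) from Zefania XML."""
    book_name = ""
    chapter_num = 0
    for event, element in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Attributes are already available when an element opens.
            if element.tag == "BIBLEBOOK":
                book_name = element.get("bname", "")
            elif element.tag == "CHAPTER":
                chapter_num = int(element.get("cnumber", "0"))
        elif element.tag == "VERS":
            # The verse text is complete only at the closing tag.
            text = "".join(element.itertext()).strip()
            yield book_name, chapter_num, int(element.get("vnumber", "0")), text
        elif element.tag == "CHAPTER":
            # Release the verses of a finished chapter to keep memory flat.
            element.clear()


sample = b"""<XMLBIBLE>
  <BIBLEBOOK bnumber="1" bname="Genesis">
    <CHAPTER cnumber="1">
      <VERS vnumber="1">In the beginning...</VERS>
    </CHAPTER>
  </BIBLEBOOK>
</XMLBIBLE>"""

for book_name, chapter, verse, text in iter_zefania_verses(io.BytesIO(sample)):
    print(f"{book_name} {chapter}:{verse} - {text}")
```

Because each chapter is cleared once it has been consumed, memory use depends on the size of a chapter rather than the size of the whole file.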
### BibleRepository
Database-backed repository for efficient Bible data access.
**Methods:**
- `__init__(xml_path=None, xml_string=None, format=None)` - Initialize repository
- `initialize(database_name)` - Create/open database
- `get_books()` - Get all books
- `get_verses(book_id, chapter_num)` - Get verses from a chapter
- `get_verse(book_id, chapter_num, verse_num)` - Get a specific verse
- `get_chapter_count(book_id)` - Get number of chapters in a book
- `search_verses(query, limit=100)` - Full-text search
- `close()` - Close database connection
### Data Models
**Verse:**
- `num` (int) - Verse number
- `chapter_num` (int) - Chapter number
- `text` (str) - Verse text
- `book_id` (str) - Book identifier
**Chapter:**
- `num` (int) - Chapter number
- `verses` (List[Verse]) - List of verses
**Book:**
- `id` (str) - Book identifier (e.g., 'gen', 'mat')
- `num` (int) - Book number
- `title` (str) - Book title (e.g., 'Genesis', 'Matthew')
- `chapters` (List[Chapter]) - List of chapters
- `verses` (List[Verse]) - Flat list of all verses
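
The relationships between these models can be sketched with dataclasses. This is an illustration built only from the field lists above, not the package's source. In particular, implementing `Book.verses` as a property that flattens the chapters is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Verse:
    num: int
    chapter_num: int
    text: str
    book_id: str


@dataclass
class Chapter:
    num: int
    verses: List[Verse] = field(default_factory=list)


@dataclass
class Book:
    id: str
    num: int
    title: str
    chapters: List[Chapter] = field(default_factory=list)

    @property
    def verses(self) -> List[Verse]:
        """Flat list of every verse, in chapter order (assumed behavior)."""
        return [verse for chapter in self.chapters for verse in chapter.verses]


genesis = Book(
    id="gen",
    num=1,
    title="Genesis",
    chapters=[Chapter(num=1, verses=[Verse(1, 1, "In the beginning...", "gen")])],
)
print(f"{genesis.title}: {len(genesis.chapters)} chapter(s), {len(genesis.verses)} verse(s)")
```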
## Performance Considerations
### Direct Parsing
**Pros:**
- Simple implementation
- No database setup required
- Always uses the latest source files
**Cons:**
- CPU and memory intensive
- Slower for repeated access, because the XML is parsed again on each run
### Database Approach
**Pros:**
- Much faster access once data is loaded
- Lower memory usage during queries
- Efficient full-text search with FTS5
- Works offline without re-parsing
**Cons:**
- Initial setup time
- Requires disk space
- Additional complexity
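
For readers unfamiliar with FTS5, the sketch below shows the general shape of a full-text index over verses using only the standard `sqlite3` module. The table name `verses_fts`, its columns, and the helper functions are hypothetical and are not taken from the package's schema. The example also assumes the local SQLite build includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3

# Sample rows: (book_id, chapter_num, verse_num, text), KJV wording.
SAMPLE_ROWS = [
    ("gen", 1, 1, "In the beginning God created the heaven and the earth."),
    ("jhn", 1, 1, "In the beginning was the Word, and the Word was with God, "
                  "and the Word was God."),
]


def build_index(rows):
    """Create an in-memory FTS5 table and load the given verses into it."""
    conn = sqlite3.connect(":memory:")
    # Only the text column is tokenized; the reference columns are stored as-is.
    conn.execute(
        "CREATE VIRTUAL TABLE verses_fts USING fts5("
        "book_id UNINDEXED, chapter_num UNINDEXED, verse_num UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO verses_fts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, query, limit=100):
    """Return matching rows, best match first."""
    cursor = conn.execute(
        "SELECT book_id, chapter_num, verse_num, text FROM verses_fts "
        "WHERE verses_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cursor.fetchall()


conn = build_index(SAMPLE_ROWS)
for book_id, chapter_num, verse_num, _text in search(conn, "beginning created"):
    print(f"{book_id} {chapter_num}:{verse_num}")
conn.close()
```

In FTS5 query syntax, space-separated terms are combined with an implicit AND, so a query such as `'faith hope love'` matches only verses that contain all three words.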
## Security
This package uses `defusedxml` for secure XML parsing, protecting against:
- **XXE (XML External Entity) attacks** - Prevents reading local files or making network requests
- **Billion Laughs attack** - Prevents exponential entity expansion
- **Quadratic blowup** - Prevents memory exhaustion
All database queries use parameterized statements to prevent SQL injection.
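
As a brief illustration of why parameterized statements matter, the sketch below passes user input through `?` placeholders so that SQLite treats it purely as data. The `verses` table and the `get_verse_text` helper are hypothetical and only mirror the column names used elsewhere in this README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE verses ("
    "book_id TEXT, chapter_num INTEGER, verse_num INTEGER, text TEXT)"
)
conn.execute(
    "INSERT INTO verses VALUES (?, ?, ?, ?)",
    ("gen", 1, 1, "In the beginning..."),
)


def get_verse_text(conn, book_id, chapter_num, verse_num):
    """Look up one verse; every value is bound, never spliced into the SQL."""
    row = conn.execute(
        "SELECT text FROM verses "
        "WHERE book_id = ? AND chapter_num = ? AND verse_num = ?",
        (book_id, chapter_num, verse_num),
    ).fetchone()
    return row[0] if row else None


# An injection attempt is compared literally against book_id and matches nothing.
malicious_book_id = "gen' OR '1'='1"
print(get_verse_text(conn, "gen", 1, 1))
print(get_verse_text(conn, malicious_book_id, 1, 1))
```

Had the query been built with string formatting instead, the second call would have changed the `WHERE` clause itself.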
## Examples
See the `examples/` directory for complete working examples:
- `direct_parsing.py` - Direct parsing example
- `database_approach.py` - Database caching example
- `search_example.py` - Full-text search example
## Testing
Run tests with pytest:
```bash
pytest
```
With coverage:
```bash
pytest --cov=bible_parser --cov-report=term-missing
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Inspired by the Ruby [bible_parser](https://github.com/seven1m/bible_parser) library
- Flutter [bible_parser_flutter](https://github.com/Omarzintan/bible_parser_flutter) implementation
- Bible XML files from the [open-bibles](https://github.com/seven1m/open-bibles) repository
## Changelog
See [CHANGELOG.md](CHANGELOG.md) for version history.
## Support
- 📫 Issues: [GitHub Issues](https://github.com/Omarzintan/bible_parser_python/issues)
- 📖 Documentation: [GitHub Wiki](https://github.com/Omarzintan/bible_parser_python/wiki)