# RDFMap - Semantic Model Data Mapper
Convert tabular and structured data (CSV, Excel, JSON, XML) into RDF triples aligned with OWL ontologies using intelligent SKOS-based semantic mapping.
## ✨ Features
### 📊 **Multi-Format Data Sources**
- **CSV/TSV**: Standard delimited files with configurable separators
- **Excel (XLSX)**: Multi-sheet workbooks with automatic type detection
- **JSON**: Complex nested structures with array expansion
- **XML**: Structured documents with namespace support
### 🧠 **Intelligent Semantic Mapping**
- **SKOS-Based Matching**: Automatic column-to-property alignment using SKOS labels
- **Ontology Imports**: Modular ontology architecture with `--import` flag
- **Semantic Alignment Reports**: Confidence scoring and mapping quality metrics
- **OWL2 Best Practices**: NamedIndividual declarations and standards compliance
### 🛠 **Advanced Processing**
- **IRI Templating**: Deterministic, idempotent IRI construction
- **Data Transformation**: Type casting, normalization, value transforms
- **Array Expansion**: Complex nested JSON array processing
- **Object Linking**: Cross-sheet joins and multi-valued cell unpacking
### 📋 **Enterprise Features**
- **Multiple Output Formats**: Turtle, RDF/XML, JSON-LD, N-Triples
- **SHACL Validation**: Validate generated RDF against ontology shapes
- **Batch Processing**: Handle 100k+ row datasets efficiently
- **Error Reporting**: Comprehensive validation and processing reports
## 🚀 Installation
### Requirements
- Python 3.11+ (recommended: Python 3.13)
### Install from PyPI
```bash
pip install semantic-rdf-mapper
```
### Development Installation
```bash
# Clone the repository
git clone https://github.com/rxcthefirst/SemanticModelDataMapper.git
cd SemanticModelDataMapper
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e ".[dev]"
```
## Quick Start
### 1. Run the Mortgage Example
```bash
# Convert mortgage loans data to RDF with validation
rdfmap convert \
  --ontology examples/mortgage/ontology/mortgage.ttl \
  --mapping examples/mortgage/config/mortgage_mapping.yaml \
  --format ttl \
  --output output/mortgage.ttl \
  --validate \
  --report output/validation_report.json

# Dry run with first 10 rows
rdfmap convert \
  --mapping examples/mortgage/config/mortgage_mapping.yaml \
  --limit 10 \
  --validate \
  --dry-run

# 🆕 Or auto-generate mapping from ontology + spreadsheet
rdfmap generate \
  --ontology examples/mortgage/ontology/mortgage.ttl \
  --spreadsheet examples/mortgage/data/loans.csv \
  --output auto_mapping.yaml \
  --export-schema
```
### 2. Understanding the Mortgage Example
The example converts loan data with this structure:
**Input CSV** (`examples/mortgage/data/loans.csv`):
```csv
LoanID,BorrowerID,BorrowerName,PropertyID,PropertyAddress,Principal,InterestRate,OriginationDate
L-1001,B-9001,Alex Morgan,P-7001,12 Oak St,250000,0.0525,2023-06-15
```
**Mapping Config** (`examples/mortgage/config/mortgage_mapping.yaml`):
- Maps `LoanID` → `ex:loanNumber`
- Creates linked resources for Borrower and Property
- Applies proper XSD datatypes
- Constructs IRIs using templates
**Output RDF** (Turtle):
```turtle
<https://data.example.com/loan/L-1001> a ex:MortgageLoan ;
    ex:loanNumber "L-1001"^^xsd:string ;
    ex:principalAmount "250000"^^xsd:decimal ;
    ex:hasBorrower <https://data.example.com/borrower/B-9001> ;
    ex:collateralProperty <https://data.example.com/property/P-7001> .
```
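To make the row-to-triples step concrete, here is a minimal standard-library sketch that reproduces the output above for the sample row. It is an illustration only, not RDFMap's implementation: `row_to_triples` and `BASE_IRI` are hypothetical names, and the predicates are hard-coded rather than read from a mapping file.

```python
import csv
import io

# Sample data copied from the example CSV above.
CSV_TEXT = (
    "LoanID,BorrowerID,BorrowerName,PropertyID,PropertyAddress,"
    "Principal,InterestRate,OriginationDate\n"
    "L-1001,B-9001,Alex Morgan,P-7001,12 Oak St,250000,0.0525,2023-06-15\n"
)

BASE_IRI = "https://data.example.com/"  # matches the IRIs in the output above


def row_to_triples(row: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, object) strings for one CSV row."""
    loan = f"<{BASE_IRI}loan/{row['LoanID']}>"
    borrower = f"<{BASE_IRI}borrower/{row['BorrowerID']}>"
    prop = f"<{BASE_IRI}property/{row['PropertyID']}>"
    return [
        (loan, "a", "ex:MortgageLoan"),
        (loan, "ex:loanNumber", f'"{row["LoanID"]}"^^xsd:string'),
        (loan, "ex:principalAmount", f'"{row["Principal"]}"^^xsd:decimal'),
        (loan, "ex:hasBorrower", borrower),
        (loan, "ex:collateralProperty", prop),
    ]


triples = [
    triple
    for row in csv.DictReader(io.StringIO(CSV_TEXT))
    for triple in row_to_triples(row)
]

for subject, predicate, obj in triples:
    print(subject, predicate, obj, ".")
```

Because every IRI is derived only from identifier columns, running the sketch twice on the same row yields the same subjects and objects.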
## Configuration Reference
### Mapping File Structure
```yaml
# Namespace declarations
namespaces:
  ex: https://example.com/mortgage#
  xsd: http://www.w3.org/2001/XMLSchema#

# Default settings
defaults:
  base_iri: https://data.example.com/
  language: en  # Optional default language tag

# Sheet/file mappings
sheets:
  - name: loans
    source: loans.csv  # Relative to mapping file or absolute

    # Main resource for each row
    row_resource:
      class: ex:MortgageLoan
      iri_template: "{base_iri}loan/{LoanID}"

    # Column mappings
    columns:
      LoanID:
        as: ex:loanNumber
        datatype: xsd:string
        required: true

      Principal:
        as: ex:principalAmount
        datatype: xsd:decimal
        transform: to_decimal  # Built-in transform
        default: 0  # Optional default value

      Notes:
        as: rdfs:comment
        datatype: xsd:string
        language: en  # Language tag for literal

    # Linked objects (object properties)
    objects:
      borrower:
        predicate: ex:hasBorrower
        class: ex:Borrower
        iri_template: "{base_iri}borrower/{BorrowerID}"
        properties:
          - column: BorrowerName
            as: ex:borrowerName
            datatype: xsd:string

# Validation configuration
validation:
  shacl:
    enabled: true
    shapes_file: shapes/mortgage_shapes.ttl

# Processing options
options:
  delimiter: ","
  header: true
  on_error: "report"  # "report" or "fail-fast"
  skip_empty_values: true
```
### Built-in Transforms
- `to_decimal`: Convert to decimal number
- `to_integer`: Convert to integer
- `to_date`: Parse date (ISO format)
- `to_datetime`: Parse datetime with timezone support
- `to_boolean`: Convert to boolean
- `uppercase`: Convert string to uppercase
- `lowercase`: Convert string to lowercase
- `strip`: Trim whitespace
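
As a rough illustration of what transforms like these typically do, the sketch below implements three of them with the standard library. The function names mirror the list above, but the bodies are assumptions: the accepted boolean spellings and the whitespace handling are illustrative choices, not RDFMap's documented behaviour.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def to_decimal(value: str) -> Decimal:
    """Parse a decimal number, ignoring surrounding whitespace."""
    try:
        return Decimal(value.strip())
    except InvalidOperation as exc:
        raise ValueError(f"cannot convert {value!r} to xsd:decimal") from exc


def to_boolean(value: str) -> bool:
    """Map common textual spellings to a boolean (assumed spellings)."""
    text = value.strip().lower()
    if text in {"true", "yes", "1"}:
        return True
    if text in {"false", "no", "0"}:
        return False
    raise ValueError(f"cannot convert {value!r} to xsd:boolean")


def to_date(value: str) -> date:
    """Parse an ISO 8601 calendar date such as 2023-06-15."""
    return date.fromisoformat(value.strip())


print(to_decimal(" 250000 "), to_boolean("Yes"), to_date("2023-06-15"))
```

Raising `ValueError` with the offending value in the message keeps conversion failures easy to attribute to a specific cell.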
### IRI Templates
Use Python-style string formatting with column names:
- `{base_iri}loan/{LoanID}` → `https://data.example.com/loan/L-1001`
- `{base_iri}{EntityType}/{ID}` → Combine multiple columns
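
The expansion can be sketched with Python's own `str.format_map`. `render_iri` is a hypothetical helper, and percent-encoding the column values is an assumption made here so that values containing spaces still produce usable IRIs; RDFMap may treat such values differently.

```python
from urllib.parse import quote


def render_iri(template: str, row: dict[str, str], base_iri: str) -> str:
    """Fill an IRI template from one row's column values."""
    # Assumption: values are trimmed and percent-encoded before substitution.
    values = {name: quote(value.strip(), safe="") for name, value in row.items()}
    return template.format_map({"base_iri": base_iri, **values})


row = {"LoanID": "L-1001", "PropertyAddress": "12 Oak St"}
base = "https://data.example.com/"

print(render_iri("{base_iri}loan/{LoanID}", row, base))
print(render_iri("{base_iri}address/{PropertyAddress}", row, base))
```

The function has no hidden state, so the same row and template always give the same IRI, which is what makes repeated conversions idempotent.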
## CLI Reference
### Commands
#### `convert`
Convert spreadsheet data to RDF.
```bash
rdfmap convert [OPTIONS]
```
**Options:**
- `--ontology PATH`: Path to ontology file (supports TTL, RDF/XML, JSON-LD, N-Triples, etc.)
- `--mapping PATH`: Path to mapping configuration (YAML/JSON) [required]
- `--format, -f TEXT`: Output format: ttl, xml, jsonld, nt (default: ttl)
- `--output, -o FILE`: Output file path
- `--validate`: Run SHACL validation after conversion
- `--report PATH`: Write validation report to file (JSON)
- `--limit N`: Process only first N rows (for testing)
- `--dry-run`: Parse and validate without writing output
- `--verbose, -v`: Enable detailed logging
- `--log PATH`: Write log to file
**Examples:**
```bash
# Basic conversion to Turtle
rdfmap convert --mapping config.yaml --format ttl --output output.ttl

# With ontology validation and SHACL validation
rdfmap convert \
  --mapping config.yaml \
  --ontology ontology.ttl \
  --format jsonld \
  --output output.jsonld \
  --validate \
  --report validation.json

# Test with limited rows
rdfmap convert --mapping config.yaml --limit 100 --dry-run --verbose
```
#### `generate`
**NEW**: Automatically generate mapping configuration from ontology and spreadsheet.
```bash
rdfmap generate [OPTIONS]
```
**Options:**
- `--ontology, -ont PATH`: Path to ontology file (TTL, RDF/XML, etc.) [required]
- `--spreadsheet, -s PATH`: Path to spreadsheet file (CSV/XLSX) [required]
- `--output, -o PATH`: Output path for generated mapping config [required]
- `--base-iri, -b TEXT`: Base IRI for resources (default: http://example.org/)
- `--class, -c TEXT`: Target ontology class (auto-detects if omitted)
- `--format, -f TEXT`: Output format: yaml or json (default: yaml)
- `--analyze-only`: Show analysis without generating mapping
- `--export-schema`: Export JSON Schema for validation
- `--verbose, -v`: Enable detailed logging
**Examples:**
```bash
# Auto-generate mapping configuration
rdfmap generate \
  --ontology ontology.ttl \
  --spreadsheet data.csv \
  --output mapping.yaml

# Specify target class and export JSON Schema
rdfmap generate \
  -ont ontology.ttl \
  -s data.csv \
  -o mapping.yaml \
  --class MortgageLoan \
  --export-schema

# Analyze only (no generation)
rdfmap generate \
  --ontology ontology.ttl \
  --spreadsheet data.csv \
  --output mapping.yaml \
  --analyze-only
```
**What it does:**
- Analyzes ontology classes and properties
- Examines spreadsheet columns and data types
- Intelligently matches columns to properties
- Suggests appropriate XSD datatypes
- Generates IRI templates from identifier columns
- Detects relationships for linked objects
- Exports JSON Schema for validation
See [docs/MAPPING_GENERATOR.md](docs/MAPPING_GENERATOR.md) for details.
#### `validate`
Validate existing RDF file against shapes.
```bash
rdfmap validate --rdf PATH --shapes PATH [--report PATH]
```
#### `info`
Display information about mapping configuration.
```bash
rdfmap info --mapping PATH
```
## Architecture
```
rdfmap/
├── parsers/      # CSV/XLSX data source parsers
├── models/       # Pydantic schemas for mapping config
├── transforms/   # Data transformation functions
├── iri/          # IRI templating and generation
├── emitter/      # RDF graph construction with rdflib
├── validator/    # SHACL validation integration
└── cli/          # Command-line interface
```
### Key Design Principles
1. **Configuration-Driven**: All mappings declarative in YAML/JSON
2. **Modular**: Clear separation between parsing, transformation, and emission
3. **Deterministic**: Same input always produces same IRIs (idempotency)
4. **Extensible**: Easy to add new transforms, datatypes, or ontology patterns
5. **Robust**: Comprehensive error handling with row-level tracking
## Extending the Application
### Adding Custom Transforms
Edit `rdfmap/transforms/functions.py`:
```python
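from typing import Any

@register_transform("custom_transform")  # assumed to be defined in this module
def custom_transform(value: Any, **kwargs: Any) -> Any:
    """Your custom transformation logic."""
    transformed_value = value  # placeholder: replace with your own logic
    return transformed_value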
```
### Supporting New Ontology Patterns
1. Update mapping schema in `rdfmap/models/mapping.py` if needed
2. Implement pattern handler in `rdfmap/emitter/graph_builder.py`
3. Add test cases in `tests/test_patterns.py`
### Adding New Output Formats
Extend `rdfmap/emitter/serializer.py`:
```python
from pathlib import Path
from rdflib import Graph

def serialize(graph: Graph, format: str, output_path: Path) -> None:
    if format == "your_format":
        # Custom serialization logic
        pass
```
## Testing
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=rdfmap --cov-report=html
# Run specific test file
pytest tests/test_transforms.py
# Run mortgage example test
pytest tests/test_mortgage_example.py -v
```
## Error Handling
The application provides detailed error reporting:
### Row-Level Errors
```json
{
  "row": 42,
  "error": "Invalid datatype for column 'Principal': cannot convert 'N/A' to xsd:decimal",
  "severity": "error"
}
```
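
The difference between the `report` and `fail-fast` values of `on_error` can be sketched as follows. `convert_rows` is a hypothetical function, and counting rows from 1 is an assumption; the error record simply mirrors the shape shown above.

```python
from decimal import Decimal, InvalidOperation


def convert_rows(
    rows: list[dict[str, str]], on_error: str = "report"
) -> tuple[list[Decimal], list[dict]]:
    """Convert the Principal column, collecting or raising on bad values."""
    values: list[Decimal] = []
    errors: list[dict] = []
    for number, row in enumerate(rows, start=1):  # assumption: rows count from 1
        raw = row["Principal"]
        try:
            values.append(Decimal(raw))
        except InvalidOperation:
            message = (
                "Invalid datatype for column 'Principal': "
                f"cannot convert {raw!r} to xsd:decimal"
            )
            if on_error == "fail-fast":
                raise ValueError(f"row {number}: {message}") from None
            errors.append({"row": number, "error": message, "severity": "error"})
    return values, errors


sample = [{"Principal": "250000"}, {"Principal": "N/A"}]
values, errors = convert_rows(sample)
print(values)
print(errors)
```

In `report` mode the good rows are still converted and every bad row is listed, whereas `fail-fast` stops at the first problem, which is usually preferable in automated pipelines.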
### Validation Reports
```json
{
  "conforms": false,
  "results": [
    {
      "focusNode": "https://data.example.com/loan/L-1001",
      "resultPath": "ex:principalAmount",
      "resultMessage": "Value must be greater than 0"
    }
  ]
}
```
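
A report in this shape is easy to post-process. The sketch below counts violations per property path using only the standard library; it assumes the report has exactly the keys shown above (`conforms`, `results`, `resultPath`), and `violations_by_path` is a hypothetical helper.

```python
import json
from collections import Counter

# Report copied from the example above.
REPORT_JSON = """
{
  "conforms": false,
  "results": [
    {
      "focusNode": "https://data.example.com/loan/L-1001",
      "resultPath": "ex:principalAmount",
      "resultMessage": "Value must be greater than 0"
    }
  ]
}
"""


def violations_by_path(report: dict) -> Counter:
    """Count violations per property path; empty when the data conforms."""
    if report.get("conforms", False):
        return Counter()
    return Counter(result["resultPath"] for result in report.get("results", []))


report = json.loads(REPORT_JSON)
for path, count in violations_by_path(report).most_common():
    print(f"{path}: {count} violation(s)")
```

Grouping by path tends to surface systematic mapping problems (one property failing on every row) more quickly than reading the results one by one.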
## Performance Tips
1. **Large Files**: The application automatically streams data for files >10MB
2. **Chunking**: Process in batches using `--limit` and multiple runs
3. **Validation**: Skip validation during development (`--validate` only for final runs)
4. **Dry Runs**: Test mappings with `--limit 100 --dry-run` before full processing
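
The idea behind streaming and batching can be illustrated independently of the CLI. The sketch below reads rows lazily and groups them into fixed-size batches so that only one batch is held in memory at a time; it is a general pattern under stated assumptions, not a description of RDFMap's internals, and `iter_batches` is a hypothetical helper.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(rows: Iterable, size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


# csv.DictReader is itself lazy, so rows are parsed only as they are consumed.
data = io.StringIO("LoanID,Principal\nL-1,100\nL-2,200\nL-3,300\n")
for batch in iter_batches(csv.DictReader(data), size=2):
    print([row["LoanID"] for row in batch])
```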
## Troubleshooting
### "Column not found" errors
- Check CSV column names match mapping config exactly (case-sensitive)
- Verify CSV delimiter matches config (`delimiter: ","`)
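
A quick pre-flight check along these lines can be scripted. The sketch below compares the column names used in a mapping against the CSV header; `missing_columns` is a hypothetical helper and the inputs are made up for illustration.

```python
import csv
import io


def missing_columns(
    csv_text: str, mapped_columns: list[str], delimiter: str = ","
) -> list[str]:
    """Return mapped column names that do not appear in the CSV header."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = set(next(reader, []))
    # The comparison is exact, so differences in case count as missing.
    return [name for name in mapped_columns if name not in header]


text = "LoanID,Principal\nL-1001,250000\n"
print(missing_columns(text, ["LoanID", "principal"]))
print(missing_columns(text, ["LoanID", "Principal"], delimiter=";"))
```

The second call shows the delimiter problem: with the wrong separator the whole header is read as a single column, so every mapped column is reported as missing.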
### Invalid IRIs
- Ensure IRI template variables match column names exactly
- Check that base_iri ends with `/` or `#`
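
Both checks can be automated with `string.Formatter`, which exposes the placeholder names in a template. `template_problems` is a hypothetical helper written for this illustration; it treats `base_iri` as the only placeholder that is not a column.

```python
from string import Formatter


def template_problems(template: str, columns: set[str], base_iri: str) -> list[str]:
    """List likely problems with an IRI template before running a conversion."""
    problems: list[str] = []
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    fields = [field for _, field, _, _ in Formatter().parse(template) if field]
    for field in fields:
        if field != "base_iri" and field not in columns:
            problems.append(f"unknown placeholder {{{field}}}")
    if "base_iri" in fields and not base_iri.endswith(("/", "#")):
        problems.append("base_iri should end with '/' or '#'")
    return problems


columns = {"LoanID", "BorrowerID"}
print(template_problems("{base_iri}loan/{LoanId}", columns, "https://data.example.com"))
print(template_problems("{base_iri}loan/{LoanID}", columns, "https://data.example.com/"))
```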
### Datatype conversion errors
- Review data for unexpected values (nulls, text in numeric fields)
- Use `transform` to normalize before typing
- Set `skip_empty_values: true` to ignore nulls
### SHACL validation failures
- Review validation report for specific violations
- Ensure ontology and shapes are compatible
- Check that required properties are mapped
## Contributing
Contributions welcome! Please:
1. Follow PEP 8 style guidelines
2. Add unit tests for new features
3. Update documentation
4. Run `pytest` and `mypy` before submitting
## License
MIT License - See LICENSE file for details
## Support
For issues, questions, or feature requests, please open an issue on the project repository.
## Acknowledgments
Built with:
- [rdflib](https://rdflib.readthedocs.io/) - RDF processing
- [pandas](https://pandas.pydata.org/) - Data manipulation
- [pydantic](https://docs.pydantic.dev/) - Data validation
- [pyshacl](https://github.com/RDFLib/pySHACL) - SHACL validation
- [typer](https://typer.tiangolo.com/) - CLI framework
Raw data
{
"_id": null,
"home_page": null,
"name": "semantic-rdf-mapper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "RDFMap Team <rxcthefirst@gmail.com>",
"keywords": "rdf, ontology, semantic-web, knowledge-graph, owl, shacl, linked-data, data-conversion, skos, owl2, ttl, json-ld, csv-to-rdf, excel-to-rdf, json-to-rdf, xml-to-rdf, semantic-mapping, ontology-alignment, data-integration",
"author": "Enterprise Data Engineering",
"author_email": "RDFMap Contributors <rxcthefirst@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/95/8e/9247a5dd568474f4b5b40b38d922a54cf2f8280098cd136ca447f9449454/semantic_rdf_mapper-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Convert tabular data (CSV, Excel, JSON, XML) to RDF triples aligned with OWL ontologies using SKOS-based semantic mapping",
"version": "0.1.0",
"project_urls": {
"Changelog": "https://github.com/rxcthefirst/SemanticModelDataMapper/blob/main/CHANGELOG.md",
"Documentation": "https://github.com/rxcthefirst/SemanticModelDataMapper#readme",
"Homepage": "https://github.com/rxcthefirst/SemanticModelDataMapper",
"Issues": "https://github.com/rxcthefirst/SemanticModelDataMapper/issues",
"Repository": "https://github.com/rxcthefirst/SemanticModelDataMapper"
},
"split_keywords": [
"rdf",
" ontology",
" semantic-web",
" knowledge-graph",
" owl",
" shacl",
" linked-data",
" data-conversion",
" skos",
" owl2",
" ttl",
" json-ld",
" csv-to-rdf",
" excel-to-rdf",
" json-to-rdf",
" xml-to-rdf",
" semantic-mapping",
" ontology-alignment",
" data-integration"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a442984dbd04928c384c6b23be72ab28d49b4d30bea807ba295d4f5dbd765a78",
"md5": "522cbc7c1ce90b96d45565d66b0def23",
"sha256": "d6f4fc10155aa33c1202e039c277db75e86c30012d55118e0fb4cdcda18582f9"
},
"downloads": -1,
"filename": "semantic_rdf_mapper-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "522cbc7c1ce90b96d45565d66b0def23",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 76282,
"upload_time": "2025-11-03T06:19:06",
"upload_time_iso_8601": "2025-11-03T06:19:06.070598Z",
"url": "https://files.pythonhosted.org/packages/a4/42/984dbd04928c384c6b23be72ab28d49b4d30bea807ba295d4f5dbd765a78/semantic_rdf_mapper-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "958e9247a5dd568474f4b5b40b38d922a54cf2f8280098cd136ca447f9449454",
"md5": "324e39d04d1adb46e839e8f8117bccda",
"sha256": "991c0bd8e53fe04ac8013723426c89cc147f814ed0c26df36f7ee571ea3adf97"
},
"downloads": -1,
"filename": "semantic_rdf_mapper-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "324e39d04d1adb46e839e8f8117bccda",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 231109,
"upload_time": "2025-11-03T06:19:07",
"upload_time_iso_8601": "2025-11-03T06:19:07.708376Z",
"url": "https://files.pythonhosted.org/packages/95/8e/9247a5dd568474f4b5b40b38d922a54cf2f8280098cd136ca447f9449454/semantic_rdf_mapper-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-03 06:19:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rxcthefirst",
"github_project": "SemanticModelDataMapper",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "rdflib",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "pandas",
"specs": [
[
">=",
"2.1.0"
]
]
},
{
"name": "openpyxl",
"specs": [
[
">=",
"3.1.0"
]
]
},
{
"name": "pydantic",
"specs": [
[
">=",
"2.5.0"
]
]
},
{
"name": "pydantic-settings",
"specs": [
[
">=",
"2.1.0"
]
]
},
{
"name": "pyshacl",
"specs": [
[
">=",
"0.25.0"
]
]
},
{
"name": "typer",
"specs": [
[
">=",
"0.9.0"
]
]
},
{
"name": "PyYAML",
"specs": [
[
">=",
"6.0.1"
]
]
},
{
"name": "python-dateutil",
"specs": [
[
">=",
"2.8.2"
]
]
},
{
"name": "rich",
"specs": [
[
">=",
"13.7.0"
]
]
},
{
"name": "click",
"specs": [
[
">=",
"8.1.7"
]
]
},
{
"name": "pytest",
"specs": [
[
">=",
"7.4.3"
]
]
},
{
"name": "pytest-cov",
"specs": [
[
">=",
"4.1.0"
]
]
},
{
"name": "mypy",
"specs": [
[
">=",
"1.7.1"
]
]
},
{
"name": "black",
"specs": [
[
">=",
"23.12.0"
]
]
},
{
"name": "ruff",
"specs": [
[
">=",
"0.1.8"
]
]
},
{
"name": "types-PyYAML",
"specs": [
[
">=",
"6.0.12"
]
]
},
{
"name": "types-python-dateutil",
"specs": [
[
">=",
"2.8.19"
]
]
},
{
"name": "pandas-stubs",
"specs": [
[
">=",
"2.1.1"
]
]
}
],
"lcname": "semantic-rdf-mapper"
}