# Adel-Lite: Automated Data Elements Linking - Lite
**Adel-Lite** is a Python library for automated schema generation, data profiling, and relationship discovery across pandas DataFrames. It helps you understand the structure of your data and the relationships between tables with minimal effort.
## Features
- 🔍 **Schema Generation**: Automatic structural schema detection
- 📊 **Data Profiling**: Comprehensive statistics and semantic type inference
- 🔗 **Relationship Mapping**: Primary/foreign key detection using heuristics
- ⚡ **Constraint Discovery**: Intra-row constraint detection (GT, EQ)
- 📈 **Visualization**: Schema graphs with Graphviz
- 📤 **Multi-format Export**: JSON, YAML, SQL DDL, Avro
- 🛠️ **CLI Support**: Command-line interface for batch processing
## Installation
```bash
pip install adel-lite
```
### Development installation
```bash
git clone https://github.com/Parthnuwal7/adel-lite.git
cd adel-lite
pip install -e .
```
## Quick Start
### Basic Usage
```python
import pandas as pd
from adel_lite import schema, profile, map_relationships, build_meta
```
### Load your data
```python
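customers = pd.DataFrame({
    'customer_id': [1, 2, 3],  # illustrative IDs
    'name': ['Alice', 'Bob', 'Charlie'],
    'email': ['alice@test.com', 'bob@test.com', 'charlie@test.com']
})

orders = pd.DataFrame({
    'order_id': [101, 102, 103],  # illustrative IDs
    'customer_id': [1, 2, 1],     # illustrative values referencing customers.customer_id
    'amount': [100.0, 150.0, 75.0]
})
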
df_list = [customers, orders]
table_names = ['customers', 'orders']
```
### Generate comprehensive analysis
```python
schema_result = schema(df_list, table_names)
profile_result = profile(df_list, table_names)
relationships_result = map_relationships(df_list, table_names)
```
### Build final meta structure
```python
import json

meta = build_meta(schema_result, profile_result, relationships_result)
print(json.dumps(meta, indent=2))
```
### Command-Line Usage
#### Analyze CSV files
```bash
adel-lite --input data/*.csv --output schema.json
```
#### Generate visualization
```bash
adel-lite --input *.csv --visualize --output schema.json
```
#### Export as SQL DDL
```bash
adel-lite --input data/*.csv --format ddl --output schema.sql
```
#### Skip constraint detection for faster processing
```bash
adel-lite --input *.csv --no-constraints --output schema.json
```
## Core Functions
### 1. Schema Generation
```python
from adel_lite import schema
# Generate structural schema
result = schema(df_list, table_names)
```
**Returns:**
- Table names and column information
- Data types (pandas + high-level)
- Nullable flags and positions
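
To make the shape of such a result concrete, here is a minimal pandas-only sketch. It is not Adel-Lite's implementation: `structural_schema` is a hypothetical helper, the output keys are assumptions, and "nullable" is simplified to "contains at least one null in the observed data".

```python
import pandas as pd


def structural_schema(df: pd.DataFrame, table_name: str) -> dict:
    """Hypothetical helper: column names, pandas dtypes, nullability, positions."""
    columns = []
    for position, (name, series) in enumerate(df.items()):
        columns.append({
            "name": name,
            "pandas_dtype": str(series.dtype),
            # Assumption: a column counts as nullable if any null is observed.
            "nullable": bool(series.isna().any()),
            "position": position,
        })
    return {"table": table_name, "columns": columns}


orders = pd.DataFrame({"order_id": [101, 102, 103], "amount": [100.0, None, 75.0]})
print(structural_schema(orders, "orders"))
```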
### 2. Data Profiling
```python
from adel_lite import profile
# Generate comprehensive profiles
result = profile(df_list, table_names)
```
**Returns:**
- Statistical summaries (min, max, mean, etc.)
- Uniqueness and null ratios
- Semantic type inference (id, datetime, categorical, etc.)
- Primary key candidates
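
The uniqueness and null ratios listed above can be approximated with plain pandas. The sketch below is an illustration under assumptions, not the library's code: `column_ratios` and the `pk_candidate` rule (fully populated and fully unique) are hypothetical.

```python
import pandas as pd


def column_ratios(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column null ratio, uniqueness ratio, PK flag."""
    n_rows = max(len(df), 1)  # avoid division by zero on empty frames
    stats = pd.DataFrame({
        "null_ratio": df.isna().mean(),
        "unique_ratio": df.nunique(dropna=True) / n_rows,
    })
    # Assumed rule: a primary key candidate has no nulls and no repeated values.
    stats["pk_candidate"] = (stats["null_ratio"] == 0) & (stats["unique_ratio"] == 1.0)
    return stats


customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "city": ["Paris", "Paris", None],
})
stats = column_ratios(customers)
print(stats)
```

Note that `name` also passes this simple rule on three rows, which is one reason a profiler would combine it with semantic type inference.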
### 3. Relationship Mapping
```python
from adel_lite import map_relationships
# Detect relationships
result = map_relationships(df_list, table_names, fk_threshold=0.8)
```
**Returns:**
- Primary key detection
- Foreign key relationships with confidence scores
- Composite key candidates
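
One common heuristic for foreign key detection is value containment: the share of child values that also occur in a candidate parent column. The sketch below shows that idea only; whether `fk_threshold` in Adel-Lite is applied to exactly this score is an assumption, and `fk_containment` is a hypothetical helper.

```python
import pandas as pd


def fk_containment(child: pd.Series, parent: pd.Series) -> float:
    """Fraction of non-null child values that also appear in the parent column."""
    values = child.dropna()
    if values.empty:
        return 0.0
    return float(values.isin(parent.dropna().unique()).mean())


customers = pd.DataFrame({"customer_id": [1, 2, 3]})
orders = pd.DataFrame({"customer_id": [1, 2, 2, 3, 3, 9]})  # 9 has no parent row

score = fk_containment(orders["customer_id"], customers["customer_id"])
is_fk_candidate = score >= 0.8  # assumed to play the role of fk_threshold=0.8
print(round(score, 3), is_fk_candidate)
```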
### 4. Constraint Detection
```python
from adel_lite import detect_constraints
# Find intra-row constraints
result = detect_constraints(df_list, table_names, threshold=0.95)
```
**Returns:**
- GT constraints: `A > B`
- EQ constraints: `A + B = C`
- Confidence scores
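
A confidence score for an intra-row constraint can be read as the fraction of rows that satisfy it. The following sketch assumes that interpretation; `gt_confidence` and `sum_confidence` are hypothetical helpers, not part of the Adel-Lite API.

```python
import pandas as pd


def gt_confidence(df: pd.DataFrame, left: str, right: str) -> float:
    """Share of rows (both values present) where df[left] > df[right]."""
    pair = df[[left, right]].dropna()
    if pair.empty:
        return 0.0
    return float((pair[left] > pair[right]).mean())


def sum_confidence(df: pd.DataFrame, a: str, b: str, total: str, tol: float = 1e-9) -> float:
    """Share of rows (all values present) where df[a] + df[b] equals df[total]."""
    rows = df[[a, b, total]].dropna()
    if rows.empty:
        return 0.0
    return float(((rows[a] + rows[b] - rows[total]).abs() <= tol).mean())


invoices = pd.DataFrame({
    "net": [100.0, 50.0, 20.0, 10.0],
    "tax": [20.0, 10.0, 4.0, 2.0],
    "gross": [120.0, 60.0, 24.0, 11.0],  # last row breaks net + tax = gross
})

gt_score = gt_confidence(invoices, "gross", "net")
eq_score = sum_confidence(invoices, "net", "tax", "gross")
print(gt_score, eq_score)
```

With a threshold of 0.95, the GT constraint `gross > net` would be kept in this example, while the EQ constraint would be rejected because one row in four violates it.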
### 5. Visualization
```python
from adel_lite import visualize
# Generate schema graph
path = visualize(schema_result, relationships_result, format='png')
```
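
For readers unfamiliar with Graphviz, a schema graph boils down to a DOT description with one node per table and one edge per relationship. The sketch below builds such a DOT string by hand using only the standard library; `relationships_to_dot` is a hypothetical helper, and the relationship keys follow the example output shown in this README rather than any confirmed internal format.

```python
def relationships_to_dot(relationships: list) -> str:
    """Hypothetical helper: render foreign key relationships as Graphviz DOT text."""
    lines = ["digraph schema {", "  node [shape=box];"]
    for rel in relationships:
        label = f"{rel['foreign_column']} -> {rel['referenced_column']}"
        lines.append(
            f'  "{rel["foreign_table"]}" -> "{rel["referenced_table"]}" [label="{label}"];'
        )
    lines.append("}")
    return "\n".join(lines)


dot_source = relationships_to_dot([
    {
        "foreign_table": "orders",
        "foreign_column": "customer_id",
        "referenced_table": "customers",
        "referenced_column": "customer_id",
    }
])
print(dot_source)
```

The resulting text can be rendered with the `dot` command-line tool if Graphviz is installed.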
### 6. Export
```python
from adel_lite import export_schema
# Export to different formats
json_content = export_schema(meta, format='json')
yaml_content = export_schema(meta, format='yaml')
ddl_content = export_schema(meta, format='ddl')
```
## Example Output
```json
{
"metadata": {
"generated_at": "2025-09-10T12:42:00",
"generator": "adel-lite",
"version": "0.1.0"
},
"tables": [
{
"name": "customers",
"primary_key": "customer_id",
"fields": [
{
"name": "customer_id",
"dtype": "integer",
"semantic_type": "id",
"subtype": "primary",
"nullable": false
}
]
}
],
"relationships": [
{
"type": "foreign_key",
"foreign_table": "orders",
"foreign_column": "customer_id",
"referenced_table": "customers",
"referenced_column": "customer_id",
"confidence": 0.92
}
]
}
```
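
Because the meta structure is plain JSON-compatible data, it can be post-processed with ordinary Python. The sketch below lists foreign keys above a confidence cutoff; `describe_foreign_keys` and `min_confidence` are hypothetical names, and the dictionary is a trimmed copy of the example output above.

```python
meta = {
    "tables": [{"name": "customers", "primary_key": "customer_id"}],
    "relationships": [
        {
            "type": "foreign_key",
            "foreign_table": "orders",
            "foreign_column": "customer_id",
            "referenced_table": "customers",
            "referenced_column": "customer_id",
            "confidence": 0.92,
        }
    ],
}


def describe_foreign_keys(meta: dict, min_confidence: float = 0.0) -> list:
    """Hypothetical helper: one readable line per sufficiently confident foreign key."""
    lines = []
    for rel in meta.get("relationships", []):
        if rel.get("type") != "foreign_key" or rel["confidence"] < min_confidence:
            continue
        lines.append(
            f"{rel['foreign_table']}.{rel['foreign_column']} -> "
            f"{rel['referenced_table']}.{rel['referenced_column']} "
            f"({rel['confidence']:.2f})"
        )
    return lines


for line in describe_foreign_keys(meta, min_confidence=0.9):
    print(line)  # orders.customer_id -> customers.customer_id (0.92)
```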
## Advanced Usage
### Custom Thresholds
Adjust detection thresholds:
```python
from adel_lite import map_relationships, detect_constraints

relationships = map_relationships(
    df_list, table_names,
    fk_threshold=0.9,  # Stricter FK detection
    name_similarity_threshold=0.8
)

constraints = detect_constraints(
    df_list, table_names,
    threshold=0.98  # Very strict constraints
)
```
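
To get a feel for what a name-similarity cutoff of 0.8 admits, the sketch below scores column-name pairs with the standard library's `difflib`. This is a stand-in chosen to keep the example dependency-free: Adel-Lite lists fuzzywuzzy among its requirements, so its actual scores will likely differ, and `name_similarity` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two column names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


pairs = [
    ("customer_id", "Customer_ID"),
    ("customer_id", "cust_id"),
    ("customer_id", "amount"),
]
for left, right in pairs:
    score = name_similarity(left, right)
    print(f"{left} vs {right}: {score:.2f} (passes 0.8: {score >= 0.8})")
```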
### Sampling and Inspection
```python
from adel_lite import sample, sample_by_condition

# Get sample data for inspection
samples = sample(df_list, table_names, n=10, method='random')

# Conditional sampling
samples = sample_by_condition(
    df_list,
    ['age > 25', 'amount > 100'],
    table_names
)
```
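
The condition strings above resemble pandas query expressions. As a point of comparison, here is how random and conditional sampling look in plain pandas; this is an illustration of the underlying idea, and it is an assumption that Adel-Lite evaluates conditions in a similar way.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "amount": [100.0, 150.0, 75.0, 220.0],
})

# Random sample of two rows; random_state makes the draw reproducible.
random_rows = orders.sample(n=2, random_state=0)

# Conditional sample: keep only rows matching the expression.
large_orders = orders.query("amount > 100")

print(random_rows)
print(large_orders)
```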
## Configuration
### CLI Configuration
Full configuration example:
```bash
adel-lite \
  --input data/*.csv \
  --output analysis.json \
  --format json \
  --visualize \
  --viz-format svg \
  --sample 5 \
  --constraint-threshold 0.9 \
  --fk-threshold 0.8 \
  --verbose
```
### Logging
```python
import logging
# Enable debug logging
logging.getLogger('adel_lite').setLevel(logging.DEBUG)
```
## Performance Tips
1. **Skip constraints** for large datasets: `--no-constraints`
2. **Limit sampling** for inspection: `--sample 100`
3. **Use appropriate thresholds** based on data quality
4. **Process in batches** for very large datasets
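
As an illustration of the batching tip, simple statistics such as null ratios can be accumulated chunk by chunk with pandas instead of loading a whole file at once. This sketch uses only pandas (`read_csv` with `chunksize`) and an in-memory CSV; it does not show an Adel-Lite feature.

```python
import io

import pandas as pd

# In-memory stand-in for a large CSV file.
csv_data = io.StringIO("order_id,amount\n101,100.0\n102,\n103,75.0\n104,220.0\n")

null_counts = None
row_count = 0
for chunk in pd.read_csv(csv_data, chunksize=2):
    counts = chunk.isna().sum()
    # Add this chunk's per-column null counts to the running totals.
    null_counts = counts if null_counts is None else null_counts + counts
    row_count += len(chunk)

null_ratios = null_counts / row_count
print(null_ratios)
```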
## Requirements
- Python 3.8+
- pandas >= 1.3.0
- numpy >= 1.21.0
- pyyaml >= 6.0
- networkx >= 2.6
- matplotlib >= 3.5.0
- graphviz >= 0.20.0
- fuzzywuzzy >= 0.18.0
- python-levenshtein >= 0.12.0
## Contributing
1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make changes and add tests
4. Run tests: `pytest`
5. Submit a pull request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Roadmap
- [ ] Support for more data sources (databases, APIs)
- [ ] Advanced constraint types (LIKE patterns, regex)
- [ ] Machine learning-based relationship detection
- [ ] Interactive web interface
- [ ] Integration with data catalogs
## Support
- 📖 [Documentation](https://github.com/Parthnuwal7/adel-lite)
- 🐛 [Issue Tracker](https://github.com/Parthnuwal7/adel-lite)
- 💬 [Discussions](https://github.com/Parthnuwal7/adel-lite.git)
---
Made with ❤️ for the data community by Parth Nuwal