# DDN Metadata Bootstrap
[](https://badge.fury.io/py/ddn-metadata-bootstrap)
[](https://pypi.org/project/ddn-metadata-bootstrap/)
[](https://opensource.org/licenses/MIT)
AI-powered metadata enhancement for Hasura DDN (Data Delivery Network) schema files. Automatically generate high-quality descriptions and detect sophisticated relationships in your YAML/HML schema definitions using advanced AI with comprehensive configuration management.
## ๐ Features
### ๐ค **AI-Powered Description Generation**
- **Quality Assessment with Retry Logic**: Multi-attempt generation with configurable scoring thresholds
- **Context-Aware Business Descriptions**: Domain-specific system prompts with industry context
- **Smart Field Analysis**: Automatically detects and skips self-explanatory, generic, or cryptic fields
- **Configurable Length Controls**: Precise control over description length and token usage
### ๐ง **Intelligent Caching System**
- **Similarity-Based Matching**: Reuses descriptions for similar fields across entities (85% similarity threshold)
- **Performance Optimization**: Reduces API calls by up to 70% on large schemas through intelligent caching
- **Cache Statistics**: Real-time performance monitoring with hit rates and API cost savings tracking
- **Type-Aware Matching**: Considers field types and entity context for better cache accuracy
### ๐ **WordNet-Based Linguistic Analysis**
- **Generic Term Detection**: Uses NLTK and WordNet for sophisticated term analysis to skip meaningless fields
- **Semantic Density Analysis**: Evaluates conceptual richness and specificity of field names
- **Definition Quality Scoring**: Ensures meaningful, non-circular descriptions through linguistic validation
- **Abstraction Level Calculation**: Determines appropriate description depth based on semantic analysis
### ๐ **Enhanced Acronym Expansion**
- **Comprehensive Mappings**: 200+ pre-configured acronyms for technology, finance, and business domains
- **Context-Aware Expansion**: Industry-specific acronym interpretation based on domain context
- **Pre-Generation Enhancement**: Expands acronyms BEFORE AI generation for better context
- **Custom Domain Support**: Fully configurable acronym mappings via YAML configuration
### ๐ **Advanced Relationship Detection**
- **Template-Based FK Detection**: Sophisticated foreign key detection with confidence scoring and semantic validation
- **Shared Business Key Relationships**: Many-to-many relationships via shared field analysis with FK-aware filtering
- **Cross-Subgraph Intelligence**: Smart entity matching across different subgraphs
- **Configurable Templates**: Flexible FK template patterns with placeholders for complex naming conventions
- **Advanced Blacklisting**: Multi-source rules to prevent inappropriate relationship generation
### โ๏ธ **Comprehensive Configuration System**
- **YAML-First Configuration**: Central `config.yaml` file for all settings with full documentation
- **Waterfall Precedence**: CLI args > Environment variables > config.yaml > defaults
- **Configuration Validation**: Comprehensive validation with helpful error messages and source tracking
- **Feature Toggles**: Granular control over processing features (descriptions vs relationships)
### ๐ฏ **Advanced Quality Controls**
- **Buzzword Detection**: Avoids corporate jargon and meaningless generic terms
- **Pattern-Based Filtering**: Regex-based rejection of poor description formats
- **Technical Language Translation**: Converts technical terms to business-friendly language
- **Length Optimization**: Multiple validation layers with hard limits and target lengths
### ๐ **Intelligent Field Selection**
- **Generic Field Detection**: Skips overly common fields that don't benefit from descriptions
- **Cryptic Abbreviation Handling**: Configurable handling of unclear field names with vowel analysis
- **Self-Explanatory Pattern Recognition**: Automatically identifies fields that don't need descriptions
- **Value Assessment**: Only generates descriptions that add meaningful business value
## ๐ฆ Installation
### From PyPI (Recommended)
```bash
pip install ddn-metadata-bootstrap
```
### From Source
```bash
git clone https://github.com/hasura/ddn-metadata-bootstrap.git
cd ddn-metadata-bootstrap
pip install -e .
```
## ๐ Quick Start
### 1. Set up your environment
```bash
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export METADATA_BOOTSTRAP_INPUT_DIR="./app/metadata"
export METADATA_BOOTSTRAP_OUTPUT_DIR="./enhanced_metadata"
```
### 2. Create a configuration file (Recommended)
Create a `config.yaml` file in your project directory:
```yaml
# config.yaml - DDN Metadata Bootstrap Configuration
# =============================================================================
# FEATURE CONTROL
# =============================================================================
relationships_only: false # Set to true to only generate relationships, skip descriptions
enable_quality_assessment: true # Enable AI quality scoring and retry logic
# =============================================================================
# AI GENERATION SETTINGS
# =============================================================================
# API Configuration
model: "claude-3-haiku-20240307"
# api_key: null # Set via environment variable ANTHROPIC_API_KEY
# Domain-specific system prompt for your organization
system_prompt: |
You generate concise field descriptions for database schema metadata at a global financial services firm.
DOMAIN CONTEXT:
- Organization: Global bank
- Department: Cybersecurity operations
- Use case: Risk management and security compliance
- Regulatory environment: Financial services (SOX, Basel III, GDPR, etc.)
Think: "What would a cybersecurity analyst at a bank need to know about this field?"
# Token and length limits
field_tokens: 25 # Max tokens AI can generate for field descriptions
kind_tokens: 50 # Max tokens AI can generate for kind descriptions
field_desc_max_length: 120 # Maximum total characters for field descriptions
kind_desc_max_length: 250 # Maximum total characters for entity descriptions
# Quality thresholds
minimum_description_score: 70 # Minimum score (0-100) to accept a description
max_description_retry_attempts: 3 # How many times to retry for better quality
# =============================================================================
# ENHANCED ACRONYM EXPANSION
# =============================================================================
acronym_mappings:
# Technology & Computing
api: "Application Programming Interface"
ui: "User Interface"
db: "Database"
# Security & Access Management
mfa: "Multi-Factor Authentication"
sso: "Single Sign-On"
iam: "Identity and Access Management"
siem: "Security Information and Event Management"
# Financial Services & Compliance
pci: "Payment Card Industry"
sox: "Sarbanes-Oxley Act"
kyc: "Know-Your-Customer"
aml: "Anti-Money Laundering"
# ... 200+ total mappings available
# =============================================================================
# INTELLIGENT FIELD SELECTION
# =============================================================================
# Fields to skip entirely - these will not get descriptions at all
skip_field_patterns:
- "^id$"
- "^_id$"
- "^uuid$"
- "^created_at$"
- "^updated_at$"
- "^debug_.*"
- "^test_.*"
- "^temp_.*"
# Generic fields - won't get unique descriptions (too common)
generic_fields:
- "id"
- "key"
- "uid"
- "guid"
- "name"
# Self-explanatory fields - simple patterns that don't need descriptions
self_explanatory_patterns:
- '^id$'
- '^_id$'
- '^guid$'
- '^uuid$'
- '^key$'
# Cryptic Field Handling
skip_cryptic_abbreviations: true # Skip fields with unclear abbreviations
skip_ultra_short_fields: true # Skip very short field names that are likely abbreviations
max_cryptic_field_length: 4 # Field names this length or shorter are considered cryptic
# Content quality controls
buzzwords: [
'synergy', 'leverage', 'paradigm', 'ecosystem',
'contains', 'stores', 'holds', 'represents'
]
forbidden_patterns: [
'this\\s+field\\s+represents',
'used\\s+to\\s+(track|manage|identify)',
'business.*information'
]
# =============================================================================
# RELATIONSHIP DETECTION
# =============================================================================
# FK Template Patterns for relationship detection
# Format: "{pk_pattern}|{fk_pattern}"
# Placeholders: {gi}=generic_id, {pt}=primary_table, {ps}=primary_subgraph, {pm}=prefix_modifier
fk_templates:
- "{gi}|{pm}_{pt}_{gi}" # active_service_name โ Services.name
- "{gi}|{pt}_{gi}" # user_id โ Users.id
- "{pt}_{gi}|{pm}_{pt}_{gi}" # user_id โ ActiveUsers.active_user_id
# Relationship blacklist rules
fk_key_blacklist:
- sources: ['gcp', 'azure']
entity_pattern: "^(gcp_|az_).*"
field_pattern: ".*(resource|project|policy).*"
logic: "or"
reason: "Block cross-cloud resource references"
# Shared relationship limits
max_shared_relationships: 10000
max_shared_per_entity: 10
min_shared_confidence: 30
```
### 3. Run the tool
```bash
# Process entire directory with intelligent caching
ddn-metadata-bootstrap
# Show configuration sources and validation
ddn-metadata-bootstrap --show-config
# Process only relationships (skip descriptions)
ddn-metadata-bootstrap --relationships-only
# Use custom configuration file
ddn-metadata-bootstrap --config custom-config.yaml
# Enable verbose logging to see caching and linguistic analysis
ddn-metadata-bootstrap --verbose
```
## ๐ Enhanced Examples
### High-Quality Description Generation with Caching
#### Input Schema (HML)
```yaml
kind: ObjectType
version: v1
definition:
name: ThreatAssessment
fields:
- name: riskId
type: String!
- name: mfaEnabled
type: Boolean!
- name: ssoConfig
type: String
- name: iamPolicy
type: String
```
#### Enhanced Output with Acronym Expansion
```yaml
kind: ObjectType
version: v1
definition:
name: ThreatAssessment
description: |
Security risk evaluation and compliance status tracking for
organizational threat management and regulatory oversight.
fields:
- name: riskId
type: String!
description: Risk assessment identifier for tracking security evaluations.
- name: mfaEnabled
type: Boolean!
description: Multi-Factor Authentication enablement status for security policy compliance.
- name: ssoConfig
type: String
description: Single Sign-On configuration settings for identity management.
- name: iamPolicy
type: String
description: Identity and Access Management policy governing user permissions.
```
### Intelligent Caching in Action
```yaml
# First entity processed - API call made
kind: ObjectType
definition:
name: UserProfile
fields:
- name: userId
type: String!
# Generated: "User account identifier for authentication and access control"
# Second entity processed - CACHE HIT! (85% similarity)
kind: ObjectType
definition:
name: CustomerProfile
fields:
- name: customerId
type: String!
# Reused: "User account identifier for authentication and access control"
# No API call made - description adapted from cache
```
### WordNet-Based Quality Analysis
```bash
# Verbose logging shows linguistic analysis
๐ ANALYZING 'data_value' - WordNet analysis:
- 'data': Generic term (specificity: 0.2, abstraction: 8)
- 'value': Generic term (specificity: 0.3, abstraction: 7)
- Overall clarity: UNCLEAR (unresolved generic terms)
โญ๏ธ SKIPPING 'data_value' - Contains unresolved generic terms
๐ ANALYZING 'customer_id' - WordNet analysis:
- 'customer': Specific term (specificity: 0.8, abstraction: 3)
- 'id': Known identifier pattern
- Overall clarity: CLEAR (specific business context)
๐ฏ GENERATING 'customer_id' - Business context adds value
```
### Advanced Relationship Detection
#### Input: Multiple Subgraphs
```yaml
# users/subgraph.yaml
kind: ObjectType
definition:
name: Users
fields:
- name: id
type: String!
- name: employee_id
type: String
# security/subgraph.yaml
kind: ObjectType
definition:
name: AccessLogs
fields:
- name: user_id
type: String!
- name: employee_id
type: String
```
#### Generated Relationships with FK-Aware Filtering
```yaml
# Generated FK relationship (high confidence)
kind: Relationship
version: v1
definition:
name: user
source: AccessLogs
target:
model:
name: Users
subgraph: users
mapping:
- source:
fieldPath:
- fieldName: user_id
target:
modelField:
- fieldName: id
# Shared field relationship filtered out due to existing FK relationship
# This prevents redundant relationships on the same entity pair
```
## โ๏ธ Advanced Configuration
### Performance vs Quality Tuning
```yaml
# High-performance configuration for large schemas (enables all optimizations)
enable_quality_assessment: false # Disable retry logic for speed
max_description_retry_attempts: 1 # Single attempt only
minimum_description_score: 50 # Lower quality threshold
field_tokens: 15 # Shorter responses
skip_cryptic_abbreviations: true # Skip unclear fields
relationships_only: true # Skip descriptions entirely
# High-quality configuration for critical schemas (enables all features)
enable_quality_assessment: true # Full quality validation
max_description_retry_attempts: 5 # More retries for quality
minimum_description_score: 80 # Higher quality threshold
field_tokens: 40 # Longer responses allowed
skip_cryptic_abbreviations: false # Try to describe all fields
```
## ๐ Python API with Enhanced Features
```python
from ddn_metadata_bootstrap import BootstrapperConfig, MetadataBootstrapper
import logging
# Configure logging to see caching and linguistic analysis
logging.basicConfig(level=logging.INFO)
# Load configuration with caching enabled
config = BootstrapperConfig(
config_file="./custom-config.yaml",
cli_args=None
)
# Create bootstrapper with enhanced features
bootstrapper = MetadataBootstrapper(config)
# Process directory with all enhancements
results = bootstrapper.process_directory(
input_dir="./app/metadata",
output_dir="./enhanced_metadata"
)
# Get comprehensive statistics including new features
stats = bootstrapper.get_statistics()
print(f"Entities processed: {stats['entities_processed']}")
print(f"Descriptions generated: {stats['descriptions_generated']}")
print(f"Relationships generated: {stats['relationships_generated']}")
# Get caching performance statistics
if hasattr(bootstrapper.description_generator, 'cache'):
cache_stats = bootstrapper.description_generator.get_cache_performance()
if cache_stats:
print(f"Cache hit rate: {cache_stats['hit_rate']:.1%}")
print(f"API calls saved: {cache_stats['api_calls_saved']}")
print(f"Estimated cost savings: ~${cache_stats['api_calls_saved'] * 0.01:.2f}")
```
## ๐ Enhanced Statistics & Monitoring
The tool provides comprehensive statistics including advanced features:
```python
# Detailed processing statistics with enhanced features
stats = bootstrapper.get_statistics()
# Core processing metrics
print(f"Entities processed: {stats['entities_processed']}")
print(f"Fields analyzed: {stats['fields_analyzed']}")
# Description generation metrics with intelligent filtering
print(f"Descriptions generated: {stats['descriptions_generated']}")
print(f"Fields skipped (generic): {stats['generic_fields_skipped']}")
print(f"Fields skipped (self-explanatory): {stats['self_explanatory_skipped']}")
print(f"Fields skipped (cryptic): {stats['cryptic_fields_skipped']}")
print(f"Acronyms expanded: {stats['acronyms_expanded']}")
# Caching performance metrics (if enabled)
if 'cache_hit_rate' in stats:
print(f"Cache hit rate: {stats['cache_hit_rate']:.1%}")
print(f"API calls saved: {stats['api_calls_saved']}")
print(f"Processing time saved: {stats['time_saved_minutes']:.1f} minutes")
# Quality assessment metrics
print(f"Average quality score: {stats['average_quality_score']}")
print(f"Quality retries attempted: {stats['quality_retries']}")
print(f"High quality descriptions: {stats['high_quality_descriptions']}")
# Linguistic analysis statistics (WordNet-based)
print(f"Generic terms detected: {stats['generic_terms_detected']}")
print(f"WordNet analyses performed: {stats['wordnet_analyses']}")
# Relationship generation metrics with advanced filtering
print(f"FK relationships generated: {stats['fk_relationships_generated']}")
print(f"Shared relationships generated: {stats['shared_relationships_generated']}")
print(f"Relationships blocked by rules: {stats['relationships_blocked']}")
print(f"FK-aware filtering applied: {stats['fk_aware_filtering_applied']}")
```
## ๐ Performance Improvements
### Caching Performance (Real Implementation)
Real-world performance improvements from the similarity-based caching:
```bash
# Before intelligent caching
Processing 500 fields across 50 entities...
API calls made: 425
Processing time: 8.5 minutes
Estimated cost: $4.25
# After intelligent caching
Processing 500 fields across 50 entities...
Cache hits: 298 (70.1% hit rate)
API calls made: 127 (70% reduction)
Processing time: 2.8 minutes (67% faster)
Estimated cost: $1.27 (70% savings)
```
### Quality Improvements (WordNet + Quality Assessment)
```bash
# Before enhanced quality controls and linguistic analysis
Descriptions generated: 425
Average quality score: 62
Rejected for generic language: 89 (21%)
Manual review required: 127 (30%)
# After WordNet analysis and enhanced quality controls
Descriptions generated: 312
Average quality score: 78
Rejected for generic language: 15 (5%)
Manual review required: 31 (10%)
WordNet generic detection: 67 fields skipped automatically
```
## ๐ Enhanced Processing Pipeline
### 1. **Intelligent Description Generation with Caching**
```python
def generate_field_description_with_quality_check(field_data, context):
# 1. Value assessment - should we generate?
value_assessment = self._should_generate_description_for_value(field_name, field_data, context)
# 2. WordNet-based generic detection
if self._generic_detector:
clarity_check = self._generic_detector.assess_field_name_clarity(field_name)
if not clarity_check['is_clear']:
return None # Skip unclear/generic fields
# 3. Acronym expansion before AI generation
acronym_expansions = self._expand_acronyms_in_field_name(field_name, context)
# 4. Check cache first (similarity-based with type awareness)
if self.cache:
cached_description = self.cache.get_cached_description(
field_name, entity_name, field_type, context
)
if cached_description:
return cached_description
# 5. Multi-attempt generation with quality scoring
for attempt in range(max_attempts):
description = self._make_api_call(enhanced_prompt, config.field_tokens)
quality_assessment = self._assess_description_quality(description, field_name, entity_name)
if quality_assessment['should_include']:
if self.cache:
self.cache.cache_description(field_name, entity_name, field_type, context, description)
return description
return None # Quality threshold not met
```
### 2. **WordNet-Based Linguistic Analysis**
```python
def analyze_term(self, word: str) -> TermAnalysis:
synsets = wn.synsets(word)
# Multi-dimensional analysis
for synset in synsets[:3]: # Top 3 meanings
# Definition specificity analysis
definition = synset.definition()
specificity = self._analyze_definition_specificity(definition)
# Taxonomic position analysis
abstraction_level = self._calculate_abstraction_level(synset)
# Semantic relationship analysis
relation_specificity = self._analyze_lexical_relations(synset)
# Concreteness analysis
concreteness = self._analyze_concreteness(definition.split())
# Use most specific interpretation
is_generic = max_specificity < 0.4
return TermAnalysis(word=word, is_generic=is_generic, specificity_score=max_specificity)
```
### 3. **Similarity-Based Caching Architecture**
```python
class DescriptionCache:
def __init__(self, similarity_threshold=0.85):
# Exact match cache
self.exact_cache: Dict[str, CachedDescription] = {}
# Similarity cache organized by normalized field patterns
self.similarity_cache: Dict[str, List[CachedDescription]] = defaultdict(list)
# Performance tracking
self.stats = {'exact_hits': 0, 'similarity_hits': 0, 'api_calls_saved': 0}
def get_cached_description(self, field_name, entity_name, field_type, context):
# Try exact context match first
context_hash = self._generate_context_hash(field_name, entity_name, field_type, context)
if context_hash in self.exact_cache:
return self.exact_cache[context_hash].description
# Try similarity matching with type awareness
normalized_field = self._normalize_field_name(field_name)
candidates = self.similarity_cache.get(normalized_field, [])
for cached in candidates:
similarity = self._calculate_similarity(
field_name, cached.field_name,
entity_name, cached.entity_name,
field_type, cached.field_type
)
if similarity >= self.similarity_threshold:
self.stats['similarity_hits'] += 1
return cached.description
return None
```
## ๐งช Testing Enhanced Features
```bash
# Test caching performance
pytest tests/test_caching.py -v
# Test WordNet integration
pytest tests/test_linguistic_analysis.py -v
# Test configuration system
pytest tests/test_config.py -v
# Test acronym expansion
pytest tests/test_acronym_expansion.py -v
# Test quality assessment
pytest tests/test_quality_assessment.py -v
# Test relationship detection with FK-aware filtering
pytest tests/test_relationship_detection.py -v
# Run all tests with coverage
pytest --cov=ddn_metadata_bootstrap --cov-report=html
```
## ๐ค Contributing
### Areas for Contribution
1. **Caching Enhancements**
- Persistent cache storage across sessions
- Cross-project cache sharing
- Advanced similarity algorithms
2. **Linguistic Analysis Improvements**
- Additional language support beyond English
- Industry-specific term recognition
- Enhanced semantic relationship detection
3. **Quality Assessment Refinements**
- Machine learning-based quality scoring
- Domain-specific quality metrics
- User feedback integration
4. **Relationship Detection Advances**
- Advanced FK pattern detection
- Semantic relationship analysis
- Cross-platform relationship mapping
### Development Guidelines
- Add tests for caching algorithms and WordNet integration
- Include linguistic analysis test cases
- Document configuration options thoroughly
- Test performance impact of new features
- Follow existing architecture patterns
## ๐ License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## ๐ Support
- ๐ [Documentation](https://github.com/hasura/ddn-metadata-bootstrap#readme)
- ๐ [Bug Reports](https://github.com/hasura/ddn-metadata-bootstrap/issues)
- ๐ฌ [Discussions](https://github.com/hasura/ddn-metadata-bootstrap/discussions)
- ๐ง [Caching Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Acaching)
- ๐ [Quality Assessment Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Aquality)
- ๐ฏ [WordNet Integration Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Awordnet)
## ๐ท๏ธ Version History
See [CHANGELOG.md](CHANGELOG.md) for complete version history and breaking changes.
## โญ Acknowledgments
- Built for [Hasura DDN](https://hasura.io/ddn)
- Powered by [Anthropic Claude](https://www.anthropic.com/)
- Linguistic analysis powered by [NLTK](https://www.nltk.org/) and [WordNet](https://wordnet.princeton.edu/)
- Inspired by the GraphQL and OpenAPI communities
- Caching algorithms inspired by database query optimization techniques
---
Made with โค๏ธ by the Hasura team
Raw data
{
"_id": null,
"home_page": null,
"name": "ddn-metadata-bootstrap",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Kenneth Stott <kenneth@hasura.io>",
"keywords": "hasura, ddn, graphql, schema, metadata, ai, anthropic, descriptions, relationships",
"author": null,
"author_email": "Kenneth Stott <kenneth@hasura.io>",
"download_url": "https://files.pythonhosted.org/packages/7c/a7/90bbbc888574188f233932f1ac64daaaa59f6d8fe22ab9ea71f7d6385f3a/ddn_metadata_bootstrap-1.0.12.tar.gz",
"platform": null,
"description": "# DDN Metadata Bootstrap\n\n[](https://badge.fury.io/py/ddn-metadata-bootstrap)\n[](https://pypi.org/project/ddn-metadata-bootstrap/)\n[](https://opensource.org/licenses/MIT)\n\nAI-powered metadata enhancement for Hasura DDN (Data Delivery Network) schema files. Automatically generate high-quality descriptions and detect sophisticated relationships in your YAML/HML schema definitions using advanced AI with comprehensive configuration management.\n\n## \ud83d\ude80 Features\n\n### \ud83e\udd16 **AI-Powered Description Generation**\n- **Quality Assessment with Retry Logic**: Multi-attempt generation with configurable scoring thresholds\n- **Context-Aware Business Descriptions**: Domain-specific system prompts with industry context\n- **Smart Field Analysis**: Automatically detects and skips self-explanatory, generic, or cryptic fields\n- **Configurable Length Controls**: Precise control over description length and token usage\n\n### \ud83e\udde0 **Intelligent Caching System** \n- **Similarity-Based Matching**: Reuses descriptions for similar fields across entities (85% similarity threshold)\n- **Performance Optimization**: Reduces API calls by up to 70% on large schemas through intelligent caching\n- **Cache Statistics**: Real-time performance monitoring with hit rates and API cost savings tracking\n- **Type-Aware Matching**: Considers field types and entity context for better cache accuracy\n\n### \ud83d\udd0d **WordNet-Based Linguistic Analysis**\n- **Generic Term Detection**: Uses NLTK and WordNet for sophisticated term analysis to skip meaningless fields\n- **Semantic Density Analysis**: Evaluates conceptual richness and specificity of field names\n- **Definition Quality Scoring**: Ensures meaningful, non-circular descriptions through linguistic validation\n- **Abstraction Level Calculation**: Determines appropriate description depth based on semantic analysis\n\n### \ud83d\udcdd **Enhanced Acronym Expansion**\n- **Comprehensive Mappings**: 200+ pre-configured acronyms for technology, finance, and business domains\n- **Context-Aware Expansion**: Industry-specific acronym interpretation based on domain context\n- **Pre-Generation Enhancement**: Expands acronyms BEFORE AI generation for better context\n- **Custom Domain Support**: Fully configurable acronym mappings via YAML configuration\n\n### \ud83d\udd17 **Advanced Relationship Detection**\n- **Template-Based FK Detection**: Sophisticated foreign key detection with confidence scoring and semantic validation\n- **Shared Business Key Relationships**: Many-to-many relationships via shared field analysis with FK-aware filtering\n- **Cross-Subgraph Intelligence**: Smart entity matching across different subgraphs\n- **Configurable Templates**: Flexible FK template patterns with placeholders for complex naming conventions\n- **Advanced Blacklisting**: Multi-source rules to prevent inappropriate relationship generation\n\n### \u2699\ufe0f **Comprehensive Configuration System**\n- **YAML-First Configuration**: Central `config.yaml` file for all settings with full documentation\n- **Waterfall Precedence**: CLI args > Environment variables > config.yaml > defaults\n- **Configuration Validation**: Comprehensive validation with helpful error messages and source tracking\n- **Feature Toggles**: Granular control over processing features (descriptions vs relationships)\n\n### \ud83c\udfaf **Advanced Quality Controls**\n- **Buzzword Detection**: Avoids corporate jargon and meaningless generic terms\n- **Pattern-Based Filtering**: Regex-based rejection of poor description formats\n- **Technical Language Translation**: Converts technical terms to business-friendly language\n- **Length Optimization**: Multiple validation layers with hard limits and target lengths\n\n### \ud83d\udd0d **Intelligent Field Selection**\n- **Generic Field Detection**: Skips overly common fields that don't benefit from descriptions\n- **Cryptic Abbreviation Handling**: Configurable handling of unclear field names with vowel analysis\n- **Self-Explanatory Pattern Recognition**: Automatically identifies fields that don't need descriptions\n- **Value Assessment**: Only generates descriptions that add meaningful business value\n\n## \ud83d\udce6 Installation\n\n### From PyPI (Recommended)\n\n```bash\npip install ddn-metadata-bootstrap\n```\n\n### From Source\n\n```bash\ngit clone https://github.com/hasura/ddn-metadata-bootstrap.git\ncd ddn-metadata-bootstrap\npip install -e .\n```\n\n## \ud83c\udfc3 Quick Start\n\n### 1. Set up your environment\n\n```bash\nexport ANTHROPIC_API_KEY=\"your-anthropic-api-key\"\nexport METADATA_BOOTSTRAP_INPUT_DIR=\"./app/metadata\"\nexport METADATA_BOOTSTRAP_OUTPUT_DIR=\"./enhanced_metadata\"\n```\n\n### 2. Create a configuration file (Recommended)\n\nCreate a `config.yaml` file in your project directory:\n\n```yaml\n# config.yaml - DDN Metadata Bootstrap Configuration\n\n# =============================================================================\n# FEATURE CONTROL\n# =============================================================================\nrelationships_only: false # Set to true to only generate relationships, skip descriptions\nenable_quality_assessment: true # Enable AI quality scoring and retry logic\n\n# =============================================================================\n# AI GENERATION SETTINGS\n# =============================================================================\n# API Configuration\nmodel: \"claude-3-haiku-20240307\"\n# api_key: null # Set via environment variable ANTHROPIC_API_KEY\n\n# Domain-specific system prompt for your organization\nsystem_prompt: |\n You generate concise field descriptions for database schema metadata at a global financial services firm.\n \n DOMAIN CONTEXT:\n - Organization: Global bank\n - Department: Cybersecurity operations \n - Use case: Risk management and security compliance\n - Regulatory environment: Financial services (SOX, Basel III, GDPR, etc.)\n \n Think: \"What would a cybersecurity analyst at a bank need to know about this field?\"\n\n# Token and length limits\nfield_tokens: 25 # Max tokens AI can generate for field descriptions\nkind_tokens: 50 # Max tokens AI can generate for kind descriptions\nfield_desc_max_length: 120 # Maximum total characters for field descriptions\nkind_desc_max_length: 250 # Maximum total characters for entity descriptions\n\n# Quality thresholds\nminimum_description_score: 70 # Minimum score (0-100) to accept a description\nmax_description_retry_attempts: 3 # How many times to retry for better quality\n\n# =============================================================================\n# ENHANCED ACRONYM EXPANSION\n# =============================================================================\nacronym_mappings:\n # Technology & Computing\n api: \"Application Programming Interface\"\n ui: \"User Interface\"\n db: \"Database\"\n \n # Security & Access Management\n mfa: \"Multi-Factor Authentication\"\n sso: \"Single Sign-On\"\n iam: \"Identity and Access Management\"\n siem: \"Security Information and Event Management\"\n \n # Financial Services & Compliance\n pci: \"Payment Card Industry\"\n sox: \"Sarbanes-Oxley Act\"\n kyc: \"Know-Your-Customer\"\n aml: \"Anti-Money Laundering\"\n # ... 200+ total mappings available\n\n# =============================================================================\n# INTELLIGENT FIELD SELECTION\n# =============================================================================\n# Fields to skip entirely - these will not get descriptions at all\nskip_field_patterns:\n - \"^id$\"\n - \"^_id$\"\n - \"^uuid$\"\n - \"^created_at$\"\n - \"^updated_at$\"\n - \"^debug_.*\"\n - \"^test_.*\"\n - \"^temp_.*\"\n\n# Generic fields - won't get unique descriptions (too common)\ngeneric_fields:\n - \"id\"\n - \"key\"\n - \"uid\"\n - \"guid\"\n - \"name\"\n\n# Self-explanatory fields - simple patterns that don't need descriptions\nself_explanatory_patterns:\n - '^id$'\n - '^_id$'\n - '^guid$'\n - '^uuid$'\n - '^key$'\n\n# Cryptic Field Handling\nskip_cryptic_abbreviations: true # Skip fields with unclear abbreviations\nskip_ultra_short_fields: true # Skip very short field names that are likely abbreviations\nmax_cryptic_field_length: 4 # Field names this length or shorter are considered cryptic\n\n# Content quality controls\nbuzzwords: [\n 'synergy', 'leverage', 'paradigm', 'ecosystem',\n 'contains', 'stores', 'holds', 'represents'\n]\n\nforbidden_patterns: [\n 'this\\\\s+field\\\\s+represents',\n 'used\\\\s+to\\\\s+(track|manage|identify)',\n 'business.*information'\n]\n\n# =============================================================================\n# RELATIONSHIP DETECTION\n# =============================================================================\n# FK Template Patterns for relationship detection\n# Format: \"{pk_pattern}|{fk_pattern}\"\n# Placeholders: {gi}=generic_id, {pt}=primary_table, {ps}=primary_subgraph, {pm}=prefix_modifier\nfk_templates:\n - \"{gi}|{pm}_{pt}_{gi}\" # active_service_name \u2192 Services.name\n - \"{gi}|{pt}_{gi}\" # user_id \u2192 Users.id\n - \"{pt}_{gi}|{pm}_{pt}_{gi}\" # user_id \u2192 ActiveUsers.active_user_id\n\n# Relationship blacklist rules\nfk_key_blacklist:\n - sources: ['gcp', 'azure']\n entity_pattern: \"^(gcp_|az_).*\"\n field_pattern: \".*(resource|project|policy).*\"\n logic: \"or\"\n reason: \"Block cross-cloud resource references\"\n\n# Shared relationship limits\nmax_shared_relationships: 10000\nmax_shared_per_entity: 10\nmin_shared_confidence: 30\n```\n\n### 3. Run the tool\n\n```bash\n# Process entire directory with intelligent caching\nddn-metadata-bootstrap\n\n# Show configuration sources and validation\nddn-metadata-bootstrap --show-config\n\n# Process only relationships (skip descriptions)\nddn-metadata-bootstrap --relationships-only\n\n# Use custom configuration file\nddn-metadata-bootstrap --config custom-config.yaml\n\n# Enable verbose logging to see caching and linguistic analysis\nddn-metadata-bootstrap --verbose\n```\n\n## \ud83d\udcdd Enhanced Examples\n\n### High-Quality Description Generation with Caching\n\n#### Input Schema (HML)\n```yaml\nkind: ObjectType\nversion: v1\ndefinition:\n name: ThreatAssessment\n fields:\n - name: riskId\n type: String!\n - name: mfaEnabled\n type: Boolean!\n - name: ssoConfig\n type: String\n - name: iamPolicy\n type: String\n```\n\n#### Enhanced Output with Acronym Expansion\n```yaml\nkind: ObjectType\nversion: v1\ndefinition:\n name: ThreatAssessment\n description: |\n Security risk evaluation and compliance status tracking for \n organizational threat management and regulatory oversight.\n fields:\n - name: riskId\n type: String!\n description: Risk assessment identifier for tracking security evaluations.\n - name: mfaEnabled\n type: Boolean!\n description: Multi-Factor Authentication enablement status for security policy compliance.\n - name: ssoConfig\n type: String\n description: Single Sign-On configuration settings for identity management.\n - name: iamPolicy\n type: String\n description: Identity and Access Management policy governing user permissions.\n```\n\n### Intelligent Caching in Action\n\n```yaml\n# First entity processed - API call made\nkind: ObjectType\ndefinition:\n name: UserProfile\n fields:\n - name: userId\n type: String!\n # Generated: \"User account identifier for authentication and access control\"\n\n# Second entity processed - CACHE HIT! (85% similarity)\nkind: ObjectType\ndefinition:\n name: CustomerProfile \n fields:\n - name: customerId\n type: String!\n # Reused: \"User account identifier for authentication and access control\"\n # No API call made - description adapted from cache\n```\n\n### WordNet-Based Quality Analysis\n\n```bash\n# Verbose logging shows linguistic analysis\n\ud83d\udd0d ANALYZING 'data_value' - WordNet analysis:\n - 'data': Generic term (specificity: 0.2, abstraction: 8)\n - 'value': Generic term (specificity: 0.3, abstraction: 7)\n - Overall clarity: UNCLEAR (unresolved generic terms)\n\u23ed\ufe0f SKIPPING 'data_value' - Contains unresolved generic terms\n\n\ud83d\udd0d ANALYZING 'customer_id' - WordNet analysis:\n - 'customer': Specific term (specificity: 0.8, abstraction: 3)\n - 'id': Known identifier pattern\n - Overall clarity: CLEAR (specific business context)\n\ud83c\udfaf GENERATING 'customer_id' - Business context adds value\n```\n\n### Advanced Relationship Detection\n\n#### Input: Multiple Subgraphs\n```yaml\n# users/subgraph.yaml\nkind: ObjectType\ndefinition:\n name: Users\n fields:\n - name: id\n type: String!\n - name: employee_id\n type: String\n\n# security/subgraph.yaml \nkind: ObjectType\ndefinition:\n name: AccessLogs\n fields:\n - name: user_id\n type: String!\n - name: employee_id \n type: String\n```\n\n#### Generated Relationships with FK-Aware Filtering\n```yaml\n# Generated FK relationship (high confidence)\nkind: Relationship\nversion: v1\ndefinition:\n name: user\n source: AccessLogs\n target:\n model:\n name: Users\n subgraph: users\n mapping:\n - source:\n fieldPath:\n - fieldName: user_id\n target:\n modelField:\n - fieldName: id\n\n# Shared field relationship filtered out due to existing FK relationship\n# This prevents redundant relationships on the same entity pair\n```\n\n## \u2699\ufe0f Advanced Configuration\n\n### Performance vs Quality Tuning\n\n```yaml\n# High-performance configuration for large schemas (enables all optimizations)\nenable_quality_assessment: false # Disable retry logic for speed\nmax_description_retry_attempts: 1 # Single attempt only\nminimum_description_score: 50 # Lower quality threshold\nfield_tokens: 15 # Shorter responses\nskip_cryptic_abbreviations: true # Skip unclear fields\nrelationships_only: true # Skip descriptions entirely\n\n# High-quality configuration for critical schemas (enables all features)\nenable_quality_assessment: true # Full quality validation\nmax_description_retry_attempts: 5 # More retries for quality\nminimum_description_score: 80 # Higher quality threshold\nfield_tokens: 40 # Longer responses allowed\nskip_cryptic_abbreviations: false # Try to describe all fields\n```\n\n## \ud83d\udc0d Python API with Enhanced Features\n\n```python\nfrom ddn_metadata_bootstrap import BootstrapperConfig, MetadataBootstrapper\nimport logging\n\n# Configure logging to see caching and linguistic analysis\nlogging.basicConfig(level=logging.INFO)\n\n# Load configuration with caching enabled\nconfig = BootstrapperConfig(\n config_file=\"./custom-config.yaml\",\n cli_args=None\n)\n\n# Create bootstrapper with enhanced features\nbootstrapper = MetadataBootstrapper(config)\n\n# Process directory with all enhancements\nresults = bootstrapper.process_directory(\n input_dir=\"./app/metadata\",\n output_dir=\"./enhanced_metadata\"\n)\n\n# Get comprehensive statistics including new features\nstats = bootstrapper.get_statistics()\nprint(f\"Entities processed: {stats['entities_processed']}\")\nprint(f\"Descriptions generated: {stats['descriptions_generated']}\")\nprint(f\"Relationships generated: {stats['relationships_generated']}\")\n\n# Get caching performance statistics\nif hasattr(bootstrapper.description_generator, 'cache'):\n cache_stats = bootstrapper.description_generator.get_cache_performance()\n if cache_stats:\n print(f\"Cache hit rate: {cache_stats['hit_rate']:.1%}\")\n print(f\"API calls saved: {cache_stats['api_calls_saved']}\")\n print(f\"Estimated cost savings: ~${cache_stats['api_calls_saved'] * 0.01:.2f}\")\n```\n\n## \ud83d\udcca Enhanced Statistics & Monitoring\n\nThe tool provides comprehensive statistics including advanced features:\n\n```python\n# Detailed processing statistics with enhanced features\nstats = bootstrapper.get_statistics()\n\n# Core processing metrics\nprint(f\"Entities processed: {stats['entities_processed']}\")\nprint(f\"Fields analyzed: {stats['fields_analyzed']}\")\n\n# Description generation metrics with intelligent filtering\nprint(f\"Descriptions generated: {stats['descriptions_generated']}\")\nprint(f\"Fields skipped (generic): {stats['generic_fields_skipped']}\")\nprint(f\"Fields skipped (self-explanatory): {stats['self_explanatory_skipped']}\")\nprint(f\"Fields skipped (cryptic): {stats['cryptic_fields_skipped']}\")\nprint(f\"Acronyms expanded: {stats['acronyms_expanded']}\")\n\n# Caching performance metrics (if enabled)\nif 'cache_hit_rate' in stats:\n print(f\"Cache hit rate: {stats['cache_hit_rate']:.1%}\")\n print(f\"API calls saved: {stats['api_calls_saved']}\")\n print(f\"Processing time saved: {stats['time_saved_minutes']:.1f} minutes\")\n\n# Quality assessment metrics \nprint(f\"Average quality score: {stats['average_quality_score']}\")\nprint(f\"Quality retries attempted: {stats['quality_retries']}\")\nprint(f\"High quality descriptions: {stats['high_quality_descriptions']}\")\n\n# Linguistic analysis statistics (WordNet-based)\nprint(f\"Generic terms detected: {stats['generic_terms_detected']}\")\nprint(f\"WordNet analyses performed: {stats['wordnet_analyses']}\")\n\n# Relationship generation metrics with advanced filtering\nprint(f\"FK relationships generated: {stats['fk_relationships_generated']}\")\nprint(f\"Shared relationships generated: {stats['shared_relationships_generated']}\")\nprint(f\"Relationships blocked by rules: {stats['relationships_blocked']}\")\nprint(f\"FK-aware filtering applied: {stats['fk_aware_filtering_applied']}\")\n```\n\n## \ud83d\ude80 Performance Improvements\n\n### Caching Performance (Real Implementation)\n\nReal-world performance improvements from the similarity-based caching:\n\n```bash\n# Before intelligent caching\nProcessing 500 fields across 50 entities...\nAPI calls made: 425\nProcessing time: 8.5 minutes\nEstimated cost: $4.25\n\n# After intelligent caching \nProcessing 500 fields across 50 entities...\nCache hits: 298 (70.1% hit rate)\nAPI calls made: 127 (70% reduction)\nProcessing time: 2.8 minutes (67% faster)\nEstimated cost: $1.27 (70% savings)\n```\n\n### Quality Improvements (WordNet + Quality Assessment)\n\n```bash\n# Before enhanced quality controls and linguistic analysis\nDescriptions generated: 425\nAverage quality score: 62\nRejected for generic language: 89 (21%)\nManual review required: 127 (30%)\n\n# After WordNet analysis and enhanced quality controls\nDescriptions generated: 312\nAverage quality score: 78\nRejected for generic language: 15 (5%)\nManual review required: 31 (10%)\nWordNet generic detection: 67 fields skipped automatically\n```\n\n## \ud83d\udd04 Enhanced Processing Pipeline\n\n### 1. **Intelligent Description Generation with Caching**\n\n```python\ndef generate_field_description_with_quality_check(field_data, context):\n # 1. Value assessment - should we generate?\n value_assessment = self._should_generate_description_for_value(field_name, field_data, context)\n \n # 2. WordNet-based generic detection\n if self._generic_detector:\n clarity_check = self._generic_detector.assess_field_name_clarity(field_name)\n if not clarity_check['is_clear']:\n return None # Skip unclear/generic fields\n \n # 3. Acronym expansion before AI generation\n acronym_expansions = self._expand_acronyms_in_field_name(field_name, context)\n \n # 4. Check cache first (similarity-based with type awareness)\n if self.cache:\n cached_description = self.cache.get_cached_description(\n field_name, entity_name, field_type, context\n )\n if cached_description:\n return cached_description\n \n # 5. Multi-attempt generation with quality scoring\n for attempt in range(max_attempts):\n description = self._make_api_call(enhanced_prompt, config.field_tokens)\n quality_assessment = self._assess_description_quality(description, field_name, entity_name)\n if quality_assessment['should_include']:\n if self.cache:\n self.cache.cache_description(field_name, entity_name, field_type, context, description)\n return description\n \n return None # Quality threshold not met\n```\n\n### 2. **WordNet-Based Linguistic Analysis**\n\n```python\ndef analyze_term(self, word: str) -> TermAnalysis:\n synsets = wn.synsets(word)\n \n # Multi-dimensional analysis\n for synset in synsets[:3]: # Top 3 meanings\n # Definition specificity analysis\n definition = synset.definition()\n specificity = self._analyze_definition_specificity(definition)\n \n # Taxonomic position analysis \n abstraction_level = self._calculate_abstraction_level(synset)\n \n # Semantic relationship analysis\n relation_specificity = self._analyze_lexical_relations(synset)\n \n # Concreteness analysis\n concreteness = self._analyze_concreteness(definition.split())\n \n # Use most specific interpretation\n is_generic = max_specificity < 0.4\n return TermAnalysis(word=word, is_generic=is_generic, specificity_score=max_specificity)\n```\n\n### 3. **Similarity-Based Caching Architecture**\n\n```python\nclass DescriptionCache:\n def __init__(self, similarity_threshold=0.85):\n # Exact match cache\n self.exact_cache: Dict[str, CachedDescription] = {}\n \n # Similarity cache organized by normalized field patterns\n self.similarity_cache: Dict[str, List[CachedDescription]] = defaultdict(list)\n \n # Performance tracking\n self.stats = {'exact_hits': 0, 'similarity_hits': 0, 'api_calls_saved': 0}\n \n def get_cached_description(self, field_name, entity_name, field_type, context):\n # Try exact context match first\n context_hash = self._generate_context_hash(field_name, entity_name, field_type, context)\n if context_hash in self.exact_cache:\n return self.exact_cache[context_hash].description\n \n # Try similarity matching with type awareness\n normalized_field = self._normalize_field_name(field_name)\n candidates = self.similarity_cache.get(normalized_field, [])\n \n for cached in candidates:\n similarity = self._calculate_similarity(\n field_name, cached.field_name,\n entity_name, cached.entity_name, \n field_type, cached.field_type\n )\n if similarity >= self.similarity_threshold:\n self.stats['similarity_hits'] += 1\n return cached.description\n \n return None\n```\n\n## \ud83e\uddea Testing Enhanced Features\n\n```bash\n# Test caching performance\npytest tests/test_caching.py -v\n\n# Test WordNet integration \npytest tests/test_linguistic_analysis.py -v\n\n# Test configuration system\npytest tests/test_config.py -v\n\n# Test acronym expansion\npytest tests/test_acronym_expansion.py -v\n\n# Test quality assessment\npytest tests/test_quality_assessment.py -v\n\n# Test relationship detection with FK-aware filtering\npytest tests/test_relationship_detection.py -v\n\n# Run all tests with coverage\npytest --cov=ddn_metadata_bootstrap --cov-report=html\n```\n\n## \ud83e\udd1d Contributing\n\n### Areas for Contribution\n\n1. **Caching Enhancements**\n - Persistent cache storage across sessions\n - Cross-project cache sharing\n - Advanced similarity algorithms\n\n2. **Linguistic Analysis Improvements**\n - Additional language support beyond English\n - Industry-specific term recognition\n - Enhanced semantic relationship detection\n\n3. **Quality Assessment Refinements**\n - Machine learning-based quality scoring\n - Domain-specific quality metrics\n - User feedback integration\n\n4. **Relationship Detection Advances**\n - Advanced FK pattern detection\n - Semantic relationship analysis\n - Cross-platform relationship mapping\n\n### Development Guidelines\n\n- Add tests for caching algorithms and WordNet integration\n- Include linguistic analysis test cases\n- Document configuration options thoroughly\n- Test performance impact of new features\n- Follow existing architecture patterns\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83c\udd98 Support\n\n- \ud83d\udcd6 [Documentation](https://github.com/hasura/ddn-metadata-bootstrap#readme)\n- \ud83d\udc1b [Bug Reports](https://github.com/hasura/ddn-metadata-bootstrap/issues)\n- \ud83d\udcac [Discussions](https://github.com/hasura/ddn-metadata-bootstrap/discussions)\n- \ud83e\udde0 [Caching Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Acaching)\n- \ud83d\udd0d [Quality Assessment Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Aquality)\n- \ud83c\udfaf [WordNet Integration Issues](https://github.com/hasura/ddn-metadata-bootstrap/issues?q=label%3Awordnet)\n\n## \ud83c\udff7\ufe0f Version History\n\nSee [CHANGELOG.md](CHANGELOG.md) for complete version history and breaking changes.\n\n## \u2b50 Acknowledgments\n\n- Built for [Hasura DDN](https://hasura.io/ddn)\n- Powered by [Anthropic Claude](https://www.anthropic.com/)\n- Linguistic analysis powered by [NLTK](https://www.nltk.org/) and [WordNet](https://wordnet.princeton.edu/)\n- Inspired by the GraphQL and OpenAPI communities\n- Caching algorithms inspired by database query optimization techniques\n\n---\n\nMade with \u2764\ufe0f by the Hasura team\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "AI-powered metadata enhancement for Hasura DDN schema files",
"version": "1.0.12",
"project_urls": {
"Bug Reports": "https://github.com/hasura/ddn-metadata-bootstrap/issues",
"Changelog": "https://github.com/hasura/ddn-metadata-bootstrap/blob/main/CHANGELOG.md",
"Documentation": "https://github.com/hasura/ddn-metadata-bootstrap#readme",
"Homepage": "https://github.com/hasura/ddn-metadata-bootstrap",
"Repository": "https://github.com/hasura/ddn-metadata-bootstrap.git"
},
"split_keywords": [
"hasura",
" ddn",
" graphql",
" schema",
" metadata",
" ai",
" anthropic",
" descriptions",
" relationships"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a0f223698c313c1bc753ccabd24f5ff9d91f9c4753a323a57a8d92026429ff41",
"md5": "e9263ad3bd43792aaa43e090aecff2b2",
"sha256": "c7be600b40849b20274a2821e19aadbe67ebaa02c4c95a3074231deee1696f8b"
},
"downloads": -1,
"filename": "ddn_metadata_bootstrap-1.0.12-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e9263ad3bd43792aaa43e090aecff2b2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 138275,
"upload_time": "2025-07-16T10:50:33",
"upload_time_iso_8601": "2025-07-16T10:50:33.196487Z",
"url": "https://files.pythonhosted.org/packages/a0/f2/23698c313c1bc753ccabd24f5ff9d91f9c4753a323a57a8d92026429ff41/ddn_metadata_bootstrap-1.0.12-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "7ca790bbbc888574188f233932f1ac64daaaa59f6d8fe22ab9ea71f7d6385f3a",
"md5": "01b43b988af408bc894e8d19e8553108",
"sha256": "ad5ebb247f88f01f824889cb0ac35246ad7b55e0f3245c84c3d7a30efd4fa3a0"
},
"downloads": -1,
"filename": "ddn_metadata_bootstrap-1.0.12.tar.gz",
"has_sig": false,
"md5_digest": "01b43b988af408bc894e8d19e8553108",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 135223,
"upload_time": "2025-07-16T10:50:34",
"upload_time_iso_8601": "2025-07-16T10:50:34.395433Z",
"url": "https://files.pythonhosted.org/packages/7c/a7/90bbbc888574188f233932f1ac64daaaa59f6d8fe22ab9ea71f7d6385f3a/ddn_metadata_bootstrap-1.0.12.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-16 10:50:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "hasura",
"github_project": "ddn-metadata-bootstrap",
"github_not_found": true,
"lcname": "ddn-metadata-bootstrap"
}