# PyJSONKit

- **Package**: pyjsonkit
- **Version**: 0.1.0
- **Requires Python**: >=3.8
- **Uploaded**: 2025-08-15 17:12:21
- **Keywords**: json, ai, ml, data-processing, validation, parsing, extraction, cleaning, transformation, schema-generation

A comprehensive Python toolkit for JSON processing with advanced AI-focused features for modern data workflows.

## 🚀 Features

### Core JSON Operations
- Easy JSON file manipulation and validation
- Pretty printing and formatting
- Robust parsing with error handling
- Simple API for common JSON operations

### 🤖 AI-Focused Features
- **AI JSON Processing**: Extract and fix JSON from AI model responses
- **Advanced Data Extraction**: JSONPath-like queries and entity extraction
- **Intelligent Data Cleaning**: Remove AI artifacts and sanitize data
- **Smart Data Transformation**: Reshape data for ML/AI workflows
- **Schema Generation**: Auto-generate schemas from AI data samples

## Installation

```bash
pip install pyjsonkit
```

## Quick Start

### Basic JSON Operations
```python
from pyjsonkit import JSONHandler

# Create a JSON handler
handler = JSONHandler("data.json")

# Get a value
value = handler.get("key")

# Set a value
handler.set("key", "value")

# Validate JSON
is_valid = handler.validate()
```

### AI-Focused Features
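```python
from pyjsonkit import AIJSONProcessor, JSONExtractor, JSONCleaner

# Extract JSON from AI responses.
# The code fence is built at runtime so that this example does not nest
# triple backticks inside the README's own code block.
fence = "`" * 3
ai_response = f"""
Here's the data you requested:
{fence}json
{{"name": "John", "age": 30}}
{fence}
"""
data = AIJSONProcessor.extract_json_from_text(ai_response)

# Clean AI-generated artifacts
messy_data = {"name": "[AI_GENERATED] John Doe", "note": "[TODO: verify]"}
clean_data = JSONCleaner.clean_ai_artifacts(messy_data)

# Extract data with complex queries
# (sample input that actually contains the "users" key the path refers to)
users_data = {"users": [{"name": "John"}, {"name": "Jane"}]}
result = JSONExtractor.extract_by_path(users_data, "users[*].name")
```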

## 📚 Comprehensive Feature Set

### AI JSON Processor
- Extract JSON from mixed AI responses (markdown, code blocks, plain text)
- Fix common AI JSON errors (quotes, booleans, trailing commas)
- Batch process multiple AI responses with error handling
- Extract structured data from natural language text
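
To make the first two bullets concrete, here is a minimal standard-library sketch of pulling JSON out of a mixed response and retrying after stripping trailing commas. It is not pyjsonkit's implementation; the name `extract_json` and both regular expressions are assumptions made for illustration.

```python
import json
import re

FENCE = "`" * 3  # built at runtime to avoid nesting fences in this README

# A fenced block, optionally tagged "json"; the group captures its body.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
# A comma followed only by whitespace and a closing bracket or brace.
TRAILING_COMMA = re.compile(r",\s*([}\]])")


def extract_json(text):
    """Return the first parseable JSON value found in text, or None."""
    candidates = FENCED_BLOCK.findall(text)
    # Fall back to the outermost {...} span when no fenced block parses.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        # Try the raw text first, then a copy with trailing commas removed.
        for attempt in (candidate, TRAILING_COMMA.sub(r"\1", candidate)):
            try:
                return json.loads(attempt)
            except json.JSONDecodeError:
                continue
    return None
```

The trailing-comma pattern is deliberately naive: it does not know about string literals, so a value such as `",}"` inside a string would also be rewritten. A production-grade fixer would need a tokenizer rather than a regular expression.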

### Advanced Data Extraction
- JSONPath-like data extraction with complex path queries
- Multi-path extraction in single operations
- AI entity extraction (emails, phones, URLs, etc.)
- Nested array extraction with configurable depth
- Schema analysis and statistics
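
As an illustration of what a JSONPath-like query such as `users[*].name` involves, the following sketch resolves dotted paths with optional `[*]` or `[index]` suffixes using only the standard library. The function name `extract_by_path_sketch` and the supported syntax are assumptions; pyjsonkit's own `extract_by_path` may accept a different grammar and return a different shape.

```python
import re

# One path segment: a key name optionally followed by [*] or [index].
SEGMENT = re.compile(r"^([^\[\]]+)(?:\[(\*|\d+)\])?$")


def extract_by_path_sketch(data, path):
    """Return a list of all values matching a dotted path such as 'users[*].name'."""
    current = [data]
    for part in path.split("."):
        match = SEGMENT.match(part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        key, index = match.groups()
        next_values = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue  # a missing key yields no match rather than an error
            value = node[key]
            if index is None:
                next_values.append(value)
            elif not isinstance(value, list):
                continue  # an index or wildcard only applies to lists
            elif index == "*":
                next_values.extend(value)
            elif int(index) < len(value):
                next_values.append(value[int(index)])
        current = next_values
    return current
```

Returning a list in every case keeps the wildcard and single-index forms consistent: a path that matches nothing yields an empty list instead of raising.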

### Intelligent Data Cleaning
- Remove AI-generated artifacts and markers
- Sanitize sensitive data for AI processing
- Normalize strings and remove extra whitespace
- Deduplicate arrays and remove null/empty values
- Clean malformed data structures
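
The whitespace, deduplication, and null-removal bullets can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the library's cleaner; `clean` and `is_empty` are hypothetical names, and the choice to keep `0` and `False` is an assumption about what "empty" should mean.

```python
import json


def is_empty(value):
    """True for None, empty strings, and empty containers; 0 and False are kept."""
    return value is None or value == "" or value == [] or value == {}


def clean(value):
    """Recursively normalize strings, drop null/empty values, and dedupe lists."""
    if isinstance(value, str):
        return " ".join(value.split())  # trim and collapse runs of whitespace
    if isinstance(value, dict):
        cleaned = {key: clean(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not is_empty(item)}
    if isinstance(value, list):
        seen, result = set(), []
        for item in (clean(element) for element in value):
            if is_empty(item):
                continue
            # Serialize to get a hashable fingerprint for any JSON value.
            fingerprint = json.dumps(item, sort_keys=True)
            if fingerprint not in seen:
                seen.add(fingerprint)
                result.append(item)
        return result
    return value
```

Children are cleaned before the emptiness check, so a dictionary that contains only empty values is itself removed from its parent.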

### Smart Data Transformation
- Reshape data for ML training (features/labels separation)
- Convert to chat/conversation formats for LLMs
- Create embeddings-ready format with metadata
- Aggregate and pivot data for analysis
- Flatten nested structures for CSV export
- Normalize data for AI prompts with size limits
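
For the "flatten nested structures for CSV export" bullet, a minimal sketch using only the standard library might look like the following. The names `flatten` and `to_csv`, the `.` separator, and the use of list indices as key parts are all assumptions for illustration rather than pyjsonkit's behavior.

```python
import csv
import io


def flatten(value, prefix="", sep="."):
    """Flatten nested dicts and lists into a single-level dict with joined keys."""
    if isinstance(value, dict):
        children = value.items()
    elif isinstance(value, list):
        children = enumerate(value)
    else:
        return {prefix: value}  # scalar leaf: emit it under the accumulated key
    items = {}
    for key, child in children:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        items.update(flatten(child, path, sep))
    return items


def to_csv(records):
    """Flatten each record and render all of them as CSV text."""
    rows = [flatten(record) for record in records]
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()
```

Note that empty dictionaries and lists produce no columns at all in this sketch, which may or may not be what a given export needs.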

### Schema Generation
- Auto-generate JSON schemas from data samples
- Create AI prompt schemas with validation rules
- Support for strict and flexible schema modes
- Include examples and constraints in generated schemas
- Validate data against generated schemas
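
Schema generation from a sample can be sketched as a recursive mapping from Python types to JSON Schema `type` keywords. The function below is a simplified illustration, not the library's generator; `infer_schema` is a hypothetical name, and treating every observed key as required and every array as homogeneous are simplifying assumptions, roughly what a "strict" mode might do.

```python
def infer_schema(value):
    """Infer a small JSON Schema fragment describing a single sample value."""
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            # Assumes homogeneous arrays: only the first element is inspected.
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
            "required": sorted(value),  # every observed key is treated as required
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")
```

A more flexible generator would merge the schemas inferred from several samples, marking a key as required only when it appears in all of them.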

## 🎯 Use Cases

- **AI/ML Data Pipelines**: Process and clean data from AI models
- **LLM Integration**: Extract structured data from language model outputs
- **Data Validation**: Ensure data quality in automated workflows
- **API Response Processing**: Handle inconsistent JSON from various sources
- **Data Transformation**: Prepare data for different ML frameworks

## Development

### Setup

```bash
git clone https://github.com/Pikachoo1111/jsonkit.git
cd jsonkit
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black src/
isort src/
```

## 📖 Documentation

For detailed documentation and examples, see the [examples](examples/) directory.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License. See the LICENSE file for details.

            

## Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pyjsonkit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Armaan Shahpuri <armaan30312@gmail.com>",
    "keywords": "json, ai, ml, data-processing, validation, parsing, extraction, cleaning, transformation, schema-generation",
    "author": null,
    "author_email": "Armaan Shahpuri <armaan30312@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/0f/5d/0f5ffd0e37deca37e6a43fd373ee7c3e75d9050ced1ce33fa48cf504b06c/pyjsonkit-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A comprehensive Python toolkit for JSON processing with advanced AI-focused features for modern data workflows",
    "version": "0.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/Pikachoo1111/jsonkit/issues",
        "Homepage": "https://github.com/Pikachoo1111/jsonkit",
        "Repository": "https://github.com/Pikachoo1111/jsonkit.git"
    },
    "split_keywords": [
        "json",
        " ai",
        " ml",
        " data-processing",
        " validation",
        " parsing",
        " extraction",
        " cleaning",
        " transformation",
        " schema-generation"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0efe1c5036ac3b1bed46324cf9c92fff1c691622c5e759ec31ea37dd2956f1b8",
                "md5": "b78370c867b86460686895384374edfe",
                "sha256": "12226ef2ab78747150a945c3afd61dd0e71a62f46c3738105ca4bf5a04972cb0"
            },
            "downloads": -1,
            "filename": "pyjsonkit-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b78370c867b86460686895384374edfe",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 24526,
            "upload_time": "2025-08-15T17:12:20",
            "upload_time_iso_8601": "2025-08-15T17:12:20.281391Z",
            "url": "https://files.pythonhosted.org/packages/0e/fe/1c5036ac3b1bed46324cf9c92fff1c691622c5e759ec31ea37dd2956f1b8/pyjsonkit-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0f5d0f5ffd0e37deca37e6a43fd373ee7c3e75d9050ced1ce33fa48cf504b06c",
                "md5": "a3c22ed6ba73ca747a3dd66afa91a3bf",
                "sha256": "c90be03d52e38584c0e8af0ae8cc447954b95f292a7098ef2f09267340969c34"
            },
            "downloads": -1,
            "filename": "pyjsonkit-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a3c22ed6ba73ca747a3dd66afa91a3bf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 40369,
            "upload_time": "2025-08-15T17:12:21",
            "upload_time_iso_8601": "2025-08-15T17:12:21.474901Z",
            "url": "https://files.pythonhosted.org/packages/0f/5d/0f5ffd0e37deca37e6a43fd373ee7c3e75d9050ced1ce33fa48cf504b06c/pyjsonkit-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-15 17:12:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Pikachoo1111",
    "github_project": "jsonkit",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pyjsonkit"
}
        