note-to-json


| Field | Value |
| --- | --- |
| Name | note-to-json |
| Version | 0.2.3 |
| home_page | None |
| Summary | Convert Markdown or text files to structured JSON, offline. |
| upload_time | 2025-08-11 23:20:24 |
| maintainer | None |
| docs_url | None |
| author | Note to JSON Team |
| requires_python | >=3.10 |
| license | MIT |
| keywords | markdown, json, cli, privacy, notes |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Note to JSON

[![PyPI version](https://img.shields.io/pypi/v/note-to-json.svg)](https://pypi.org/project/note-to-json/)
[![Python versions](https://img.shields.io/pypi/pyversions/note-to-json.svg)](https://pypi.org/project/note-to-json/)
[![License](https://img.shields.io/pypi/l/note-to-json.svg)](LICENSE)
[![CI](https://github.com/Mugiwara555343/note2json/actions/workflows/python-ci.yml/badge.svg)](https://github.com/Mugiwara555343/note2json/actions/workflows/python-ci.yml)
[![Publish](https://github.com/Mugiwara555343/note2json/actions/workflows/publish.yml/badge.svg)](https://github.com/Mugiwara555343/note2json/actions/workflows/publish.yml)

Convert Markdown or text files to structured JSON, offline.

## Features

- **Privacy-first**: All processing happens locally; no data is sent to external services
- **Flexible input**: Supports Markdown, plain text, and JSON files
- **Automatic encoding detection**: Handles UTF-8, UTF-16, and other encodings
- **Batch processing**: Process multiple files with glob patterns
- **Resilient parsing**: Graceful handling of malformed inputs and encoding issues
- **Progress reporting**: Detailed feedback for batch operations
- **Error recovery**: Continue processing even when some files fail

## Installation

```bash
pip install note-to-json
```

## Quick Start

```bash
# Convert a single file
note2json input.md

# Convert multiple files
note2json *.md

# Output to STDOUT
note2json input.md --stdout

# Pretty-print JSON
note2json input.md --stdout --pretty
```

## CLI Usage

### Basic Commands

```bash
note2json [OPTIONS] INPUT_FILE(S)
```

### Options

- `-o, --output PATH`: Specify output file path
- `--stdout`: Print JSON to STDOUT instead of writing to file
- `--pretty`: Pretty-print JSON with 2-space indentation
- `--stdin`: Read input from STDIN instead of files
- `--input-format {auto,md,txt,json}`: Specify input format (default: auto)
- `--no-emoji`: Disable emoji in status output
- `--continue-on-error`: Continue processing remaining files even if some fail
- `--verbose`: Show detailed progress information
- `--retry-failed`: Automatically retry failed files with different strategies

### Input Formats

- **auto** (default): Automatically detect format based on content
- **md/txt**: Parse as Markdown/plain text
- **json**: Parse as JSON (with schema validation)

### Examples

```bash
# Parse to default output file
note2json input.md                    # → input.parsed.json

# Parse to custom output file
note2json input.md -o output.json     # → output.json

# Parse to STDOUT
note2json input.md --stdout           # → prints to terminal

# Pretty-print to STDOUT
note2json input.md --stdout --pretty  # → formatted JSON

# Process multiple files
note2json *.md                        # → individual .parsed.json files

# Continue on errors
note2json *.md --continue-on-error    # → process all files, report failures

# Retry failed files automatically
note2json *.md --retry-failed         # → retry failed files with different strategies

# Show progress
note2json *.md --verbose              # → detailed progress information

# Read from STDIN (Windows)
type data.json | note2json --stdin --input-format json --stdout

# Read from STDIN (macOS/Linux)
cat data.json | note2json --stdin --input-format json --stdout
```

## Resilience Features

### Error Handling

The CLI provides robust error handling with clear, actionable error messages:

- **Encoding issues**: Automatic fallback to multiple encoding detection methods
- **Malformed inputs**: Graceful degradation with automatic validation fixes
- **Batch processing**: Continue processing even when individual files fail
- **Detailed reporting**: Comprehensive error summaries with categorization
- **Actionable advice**: Specific suggestions for fixing common issues
- **Retry strategies**: Automatic retry with different parsing approaches

### Error Types

- **Missing files**: Exit code 2
- **Parsing errors**: Exit code 3
- **Encoding errors**: Detailed information about attempted encodings
- **Validation errors**: Automatic fixing of common schema issues
- **Format mismatches**: Clear guidance on input format selection
- **Retry failures**: Information when all retry strategies fail
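
When the CLI is driven from a script, the exit codes listed above can be turned into readable outcomes. The following is a minimal sketch, not part of the package: it assumes `note2json` is on the `PATH` and that exit code 0 means success (the usual convention, not stated in this README). The helper names `describe_exit` and `convert` are hypothetical.

```python
import subprocess

# Codes 2 and 3 come from the list above; treating 0 as success is an assumption.
EXIT_MEANINGS = {
    0: "success",
    2: "missing file",
    3: "parsing error",
}


def describe_exit(code: int) -> str:
    """Map an exit code to a short label (hypothetical helper)."""
    return EXIT_MEANINGS.get(code, f"unrecognized exit code {code}")


def convert(path: str) -> str:
    """Run `note2json PATH --stdout` and return a label for the outcome."""
    result = subprocess.run(
        ["note2json", path, "--stdout"],
        capture_output=True,
        text=True,
    )
    return describe_exit(result.returncode)
```

Calling `convert("input.md")` requires the CLI to be installed; `describe_exit` can be used on its own with any return code.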

### Enhanced Error Messages

Error messages now include specific, actionable advice:

```text
# Example of enhanced error message with advice
Error: Schema validation failed at 'title': 'None' is not of type 'string'
💡 Advice: Add the missing required field 'title'
```

### Retry Logic

Use `--retry-failed` to automatically attempt processing failed files with different strategies:

```bash
note2json *.md --retry-failed
```

The retry system will:
1. **Switch formats**: Try different input formats (txt, json, auto)
2. **Process raw text**: Fall back to basic text extraction
3. **Relax the schema**: Create minimal valid structures when possible
4. **Report the outcome**: Show which retry strategy succeeded
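
The retry steps above follow a common pattern: try each strategy in order and report the first one that works. The sketch below illustrates that pattern only. It is not the package's implementation, and the strategy functions and the `parse_with_retries` name are hypothetical.

```python
import json


def parse_as_json(text: str) -> dict:
    """Strategy 1: treat the input as a JSON object."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def parse_as_raw_text(text: str) -> dict:
    """Strategy 2: basic text extraction, using the first line as the title."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("input is empty")
    return {"title": lines[0].lstrip("# "), "raw_text": text}


# Order matters: stricter strategies first, the most forgiving one last.
STRATEGIES = [("json", parse_as_json), ("raw_text", parse_as_raw_text)]


def parse_with_retries(text: str, strategies=STRATEGIES):
    """Return (strategy_name, result) for the first strategy that succeeds."""
    errors = []
    for name, strategy in strategies:
        try:
            return name, strategy(text)
        except ValueError as exc:
            errors.append(f"{name}: {exc}")
    raise ValueError("all retry strategies failed: " + "; ".join(errors))
```

Returning the strategy name alongside the result is what makes the final "which strategy succeeded" report possible.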

### Continue on Error

Use `--continue-on-error` to process all files even when some fail:

```bash
note2json *.md --continue-on-error
```

This will:
- Process all files that can be parsed
- Report failures with detailed error messages
- Provide a summary of successful vs. failed files
- Exit with an appropriate error code
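
The behaviour described above can be pictured as a loop that records failures instead of stopping at the first one. This is an illustrative sketch under stated assumptions: `process_batch` and `BatchResult` are hypothetical names, the `convert` callable stands in for whatever converts one file, and the nonzero exit code 1 is a placeholder because this README does not say which code a partially failed batch returns.

```python
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    succeeded: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # path -> error message

    @property
    def exit_code(self) -> int:
        # Placeholder: any nonzero value signals that at least one file failed.
        return 1 if self.failed else 0

    def summary(self) -> str:
        return f"{len(self.succeeded)} succeeded, {len(self.failed)} failed"


def process_batch(paths, convert, continue_on_error: bool = True) -> BatchResult:
    """Convert each path, collecting errors rather than raising them."""
    result = BatchResult()
    for path in paths:
        try:
            convert(path)
            result.succeeded.append(path)
        except Exception as exc:  # broad on purpose: one bad file must not stop the batch
            result.failed[path] = str(exc)
            if not continue_on_error:
                break
    return result
```

With `continue_on_error=False` the loop stops at the first failure, which mirrors running the CLI without the flag.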

### Enhanced Progress Reporting

Use `--verbose` for detailed progress information with time estimation:

```bash
note2json *.md --verbose
```

Verbose output shows:
- Current file being processed
- Progress counter (e.g., [3/10])
- Visual progress bar with percentage
- Estimated time remaining (ETA)
- Summary of results
- Error breakdown by type
- Troubleshooting tips for common issues
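
The counter, bar, and ETA listed above can all be derived from three numbers: files done, total files, and elapsed time. The sketch below shows one plausible way to compute them. The exact output format of `--verbose` is not documented here, so the layout and the `format_progress` name are assumptions.

```python
def format_progress(done: int, total: int, elapsed: float, width: int = 20) -> str:
    """Build a one-line progress report such as '[3/10] |###---| 30% ETA 14.0s'."""
    fraction = done / total if total else 1.0
    # Integer arithmetic avoids float rounding when sizing the bar.
    filled = (width * done) // total if total else width
    bar = "#" * filled + "-" * (width - filled)
    # ETA: average seconds per finished file times the number of files remaining.
    eta = (elapsed / done) * (total - done) if done else 0.0
    return f"[{done}/{total}] |{bar}| {fraction:.0%} ETA {eta:.1f}s"
```

The ETA is a simple running average, so it is most reliable when files are of similar size.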

## Output Schema

The tool outputs structured JSON with the following schema:

```json
{
  "title": "string",
  "timestamp": "ISO 8601 date-time",
  "raw_text": "string",
  "plain_text": "string",
  "tags": ["string"],
  "headers": ["string"],
  "date": "string (optional)",
  "tone": "string (optional)",
  "summary": "string (optional)",
  "reflections": ["string (optional)"]
}
```
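
For consumers of the output, a lightweight check against the schema above can catch malformed files early. This is a standard-library sketch, not the package's validator. It assumes that every field not marked optional is required, and the `check_note` name is hypothetical.

```python
from datetime import datetime

# Assumption: fields without "(optional)" in the schema above are required.
REQUIRED = {
    "title": str,
    "timestamp": str,
    "raw_text": str,
    "plain_text": str,
    "tags": list,
    "headers": list,
}
OPTIONAL = {"date": str, "tone": str, "summary": str, "reflections": list}


def check_note(note: dict) -> list[str]:
    """Return a list of problems; an empty list means the note looks valid."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in note:
            problems.append(f"missing required field '{key}'")
        elif not isinstance(note[key], expected):
            problems.append(f"'{key}' should be of type {expected.__name__}")
    for key, expected in OPTIONAL.items():
        if key in note and not isinstance(note[key], expected):
            problems.append(f"'{key}' should be of type {expected.__name__}")
    for key in ("tags", "headers", "reflections"):
        value = note.get(key)
        if isinstance(value, list) and not all(isinstance(item, str) for item in value):
            problems.append(f"'{key}' should contain only strings")
    timestamp = note.get("timestamp")
    if isinstance(timestamp, str):
        try:
            # fromisoformat accepts a subset of ISO 8601 on older Python versions.
            datetime.fromisoformat(timestamp)
        except ValueError:
            problems.append("'timestamp' is not an ISO 8601 date-time")
    return problems
```

Returning all problems at once, rather than raising on the first, makes it easier to fix a file in a single pass.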

## Input Format Support

### Markdown/Text

- **Headers**: Extracts `#`-style headings (for example, `# Title`) as headers
- **Metadata**: Parses `**Date:**`, `**Tags:**`, `**Tone:**` fields
- **Summary**: Extracts content between `**Summary:**` and `---`
- **Reflections**: Extracts bullet points after `**Core Reflections:**`
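
To make the conventions above concrete, here is a small regex-based sketch of how a note written in this layout could be parsed. It is an illustration only and not the package's parser. The comma-separated tag format, the sample note, and the names `find_field` and `parse_note` are assumptions.

```python
import re

SAMPLE = """# Weekly Review

**Date:** 2025-08-11
**Tags:** work, planning
**Tone:** optimistic

**Summary:**
Shipped the parser.
---
**Core Reflections:**
- Keep batches small
- Write tests first
"""


def find_field(text: str, name: str):
    """Return the value after a `**Name:**` label on the same line, or None."""
    pattern = rf"^\*\*{re.escape(name)}:\*\*[ \t]*(.+)$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def parse_note(text: str) -> dict:
    # Headings: one to six '#' characters followed by the heading text.
    headers = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", text, flags=re.MULTILINE)

    # Assumption: tags are comma-separated on the **Tags:** line.
    tags_line = find_field(text, "Tags")
    tags = [t.strip() for t in tags_line.split(",") if t.strip()] if tags_line else []

    # Summary: everything between the label and the next '---' line.
    found = re.search(r"\*\*Summary:\*\*(.*?)^---[ \t]*$", text, flags=re.DOTALL | re.MULTILINE)
    summary = found.group(1).strip() if found else None

    # Reflections: consecutive bullet lines after the label.
    reflections = []
    marker = "**Core Reflections:**"
    if marker in text:
        for line in text.split(marker, 1)[1].splitlines():
            stripped = line.strip()
            if stripped.startswith(("- ", "* ")):
                reflections.append(stripped[2:].strip())
            elif stripped and reflections:
                break  # first non-bullet line after the list ends it

    return {
        "headers": headers,
        "date": find_field(text, "Date"),
        "tags": tags,
        "tone": find_field(text, "Tone"),
        "summary": summary,
        "reflections": reflections,
    }
```

Running `parse_note(SAMPLE)` yields a dictionary whose keys line up with the optional and list fields of the output schema.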

### JSON

- **Schema validation**: Ensures the output matches the required schema
- **Auto-normalization**: Converts arbitrary JSON to schema format
- **Format detection**: Automatically identifies JSON vs. text content
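
Content-based format detection can be as simple as attempting a JSON parse and falling back to text. The sketch below shows that idea and is not the package's detection logic. The `detect_format` name is hypothetical, and the returned labels reuse the `--input-format` values for readability.

```python
import json


def detect_format(text: str) -> str:
    """Return 'json' if the text parses as a JSON object or array, else 'md'."""
    stripped = text.lstrip()
    # Cheap pre-check: JSON documents of interest start with '{' or '['.
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass  # looked like JSON but is not; treat it as text
    return "md"
```

Text that merely starts with a brace, such as a Markdown note quoting a snippet, still falls through to the text branch.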

## Encoding Support

- **UTF-8**: Standard encoding with BOM support
- **UTF-16**: Little-endian and big-endian variants
- **Fallback detection**: Uses chardet for automatic encoding detection
- **Error handling**: Graceful degradation with detailed error reporting
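
The layered approach above (byte order mark first, then fallbacks) can be sketched with the standard library alone. This is an illustration, not the package's code: it swaps the chardet step for a fixed list of candidate encodings, and the `decode_bytes` name and the default fallback list are assumptions.

```python
import codecs

# Byte order marks and the codec that consumes each of them.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_bytes(data: bytes, fallbacks=("utf-8", "cp1252")) -> tuple[str, str]:
    """Return (text, encoding_used), trying BOM detection before the fallbacks."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # These codecs read the BOM and strip it from the decoded text.
            return data.decode(encoding), encoding
    attempted = []
    for encoding in fallbacks:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            attempted.append(encoding)
    # Listing the attempted encodings keeps the error message actionable.
    raise ValueError(f"could not decode input; tried: {', '.join(attempted)}")
```

A statistical detector such as chardet would slot in as one more step before the final error.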

## Development

### Installation

```bash
git clone https://github.com/Mugiwara555343/note2json.git
cd note2json
pip install -e .
```

### Testing

```bash
# Run all tests
pytest

# Run integration tests only
pytest -m integration

# Run with coverage
pytest --cov=note_to_json
```

### Code Quality

```bash
# Format code
black note_to_json/ tests/

# Sort imports
isort note_to_json/ tests/

# Run pre-commit hooks
pre-commit run --all-files
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Ensure all tests pass
6. Submit a pull request

## License

MIT License - see the [LICENSE](LICENSE) file for details.

## Changelog

### v0.2.2

- **Resilience improvements**: Better error handling and recovery
- **Continue on error**: Process remaining files even when some fail
- **Progress reporting**: Detailed feedback for batch operations
- **Enhanced encoding detection**: Fallback mechanisms and better error messages
- **Validation fixes**: Automatic correction of common schema issues
- **Error categorization**: Grouped error reporting for better analysis

### v0.2.1

- Improved encoding detection
- Better error messages
- Enhanced JSON passthrough

### v0.2.0

- Added JSON input support
- Improved encoding handling
- Better error reporting

### v0.1.0

- Initial release
- Basic Markdown parsing
- CLI interface

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "note-to-json",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "markdown, json, cli, privacy, notes",
    "author": "Note to JSON Team",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/fe/d3/a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb/note_to_json-0.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Convert Markdown or text files to structured JSON, offline.",
    "version": "0.2.3",
    "project_urls": {
        "Homepage": "https://github.com/Mugiwara555343/note2json",
        "Issues": "https://github.com/Mugiwara555343/note2json/issues",
        "Repository": "https://github.com/Mugiwara555343/note2json"
    },
    "split_keywords": [
        "markdown",
        " json",
        " cli",
        " privacy",
        " notes"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "412347e564d5109c74ef44f85c8a8a346b87ac4950d6d6562dbcee39de88b877",
                "md5": "a016abc34174081c6b40c40e931dc0b9",
                "sha256": "f1f8f09d3bb901e1908368e2753fb8fc75703c7edc1e870f95ad20209f15ad50"
            },
            "downloads": -1,
            "filename": "note_to_json-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a016abc34174081c6b40c40e931dc0b9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 16774,
            "upload_time": "2025-08-11T23:20:22",
            "upload_time_iso_8601": "2025-08-11T23:20:22.619750Z",
            "url": "https://files.pythonhosted.org/packages/41/23/47e564d5109c74ef44f85c8a8a346b87ac4950d6d6562dbcee39de88b877/note_to_json-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fed3a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb",
                "md5": "eb3875f1c334fc0d7e69e944ecc1a0d1",
                "sha256": "7b5450b2a5097d451624cd5b533e26be6ed4c29510b613cc200037fd297ed24b"
            },
            "downloads": -1,
            "filename": "note_to_json-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "eb3875f1c334fc0d7e69e944ecc1a0d1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 26934,
            "upload_time": "2025-08-11T23:20:24",
            "upload_time_iso_8601": "2025-08-11T23:20:24.043756Z",
            "url": "https://files.pythonhosted.org/packages/fe/d3/a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb/note_to_json-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-11 23:20:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Mugiwara555343",
    "github_project": "note2json",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "note-to-json"
}
        