# Note to JSON
Convert Markdown or text files to structured JSON, offline.
## Features
- **Privacy-first**: All processing happens locally; no data is sent to external services
- **Flexible input**: Supports Markdown, plain text, and JSON files
- **Automatic encoding detection**: Handles UTF-8, UTF-16, and other encodings
- **Batch processing**: Process multiple files with glob patterns
- **Resilient parsing**: Graceful handling of malformed inputs and encoding issues
- **Progress reporting**: Detailed feedback for batch operations
- **Error recovery**: Continue processing even when some files fail
## Installation
```bash
pip install note-to-json
```
## Quick Start
```bash
# Convert a single file
note2json input.md
# Convert multiple files
note2json *.md
# Output to STDOUT
note2json input.md --stdout
# Pretty-print JSON
note2json input.md --stdout --pretty
```
## CLI Usage
### Basic Commands
```bash
note2json [OPTIONS] INPUT_FILE(S)
```
### Options
- `-o, --output PATH`: Specify output file path
- `--stdout`: Print JSON to STDOUT instead of writing to a file
- `--pretty`: Pretty-print JSON with 2-space indentation
- `--stdin`: Read input from STDIN instead of files
- `--input-format {auto,md,txt,json}`: Specify input format (default: auto)
- `--no-emoji`: Disable emoji in status output
- `--continue-on-error`: Continue processing remaining files even if some fail
- `--verbose`: Show detailed progress information
- `--retry-failed`: Automatically retry failed files with different strategies
### Input Formats
- **auto** (default): Automatically detect format based on content
- **md/txt**: Parse as Markdown/plain text
- **json**: Parse as JSON (with schema validation)
### Examples
```bash
# Parse to default output file
note2json input.md # → input.parsed.json
# Parse to custom output file
note2json input.md -o output.json # → output.json
# Parse to STDOUT
note2json input.md --stdout # → prints to terminal
# Pretty-print to STDOUT
note2json input.md --stdout --pretty # → formatted JSON
# Process multiple files
note2json *.md # → individual .parsed.json files
# Continue on errors
note2json *.md --continue-on-error # → process all files, report failures
# Retry failed files automatically
note2json *.md --retry-failed # → retry failed files with different strategies
# Show progress
note2json *.md --verbose # → detailed progress information
# Read from STDIN (Windows)
type data.json | note2json --stdin --input-format json --stdout
# Read from STDIN (macOS/Linux)
cat data.json | note2json --stdin --input-format json --stdout
```
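The STDIN examples above can also be driven from a script. The following is a minimal sketch, not part of the package: `build_command` and `convert_text` are hypothetical helper names, only flags listed under Options are used, and it assumes that with `--stdout` nothing but the JSON document is written to standard output.

```python
import json
import subprocess


def build_command(input_format: str = "auto", pretty: bool = False) -> list[str]:
    """Build an argv list that uses only the documented CLI flags."""
    cmd = ["note2json", "--stdin", "--input-format", input_format, "--stdout"]
    if pretty:
        cmd.append("--pretty")
    return cmd


def convert_text(text: str, input_format: str = "auto") -> dict:
    """Pipe text through note2json and decode the JSON it prints.

    Assumption: standard output contains only the JSON document.
    """
    result = subprocess.run(
        build_command(input_format),
        input=text,
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

With `note2json` installed and on `PATH`, `convert_text("# Title\n\nBody", input_format="md")` should return the parsed note as a dictionary.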
## Resilience Features
### Error Handling
The CLI provides robust error handling with clear, actionable error messages:
- **Encoding issues**: Automatic fallback to multiple encoding detection methods
- **Malformed inputs**: Graceful degradation with automatic validation fixes
- **Batch processing**: Continue processing even when individual files fail
- **Detailed reporting**: Comprehensive error summaries with categorization
- **Actionable advice**: Specific suggestions for fixing common issues
- **Retry strategies**: Automatic retry with different parsing approaches
### Error Types
- **Missing files**: Exit code 2
- **Parsing errors**: Exit code 3
- **Encoding errors**: Detailed information about attempted encodings
- **Validation errors**: Automatic fixing of common schema issues
- **Format mismatches**: Clear guidance on input format selection
- **Retry failures**: Information when all retry strategies fail
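A calling script can branch on the exit codes listed above. This is a small sketch under assumptions: only codes 2 and 3 are documented here, 0 is taken to mean success by the usual convention, and `describe_exit` and `run_note2json` are hypothetical helper names.

```python
import subprocess

# Codes 2 and 3 come from the list above; 0 is the usual success convention.
EXIT_MEANINGS = {
    0: "success",
    2: "missing file",
    3: "parsing error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into a short label."""
    return EXIT_MEANINGS.get(code, f"unrecognized exit code {code}")


def run_note2json(path: str) -> str:
    """Run the CLI on one file and describe how it exited."""
    completed = subprocess.run(["note2json", path], capture_output=True, text=True)
    return describe_exit(completed.returncode)
```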
### Enhanced Error Messages
Error messages now include specific, actionable advice:
```bash
# Example of enhanced error message with advice
Error: Schema validation failed at 'title': 'None' is not of type 'string'
💡 Advice: Add the missing required field 'title'
```
### Retry Logic
Use `--retry-failed` to automatically attempt processing failed files with different strategies:
```bash
note2json *.md --retry-failed
```
The retry system uses the following strategies and reports the outcome:
1. **Format switching**: Try different input formats (txt, json, auto)
2. **Raw text processing**: Fall back to basic text extraction
3. **Schema relaxation**: Create minimal valid structures when possible
4. **Detailed reporting**: Show which retry strategy succeeded
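The general idea of trying one strategy after another can be sketched as follows. This illustrates the pattern only and is not the package's internal code; the strategy functions and `parse_with_retries` are hypothetical.

```python
import json
from typing import Callable

Strategy = tuple[str, Callable[[str], dict]]


def parse_as_json(text: str) -> dict:
    """Strict strategy: the text must be a JSON object."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value is not an object")
    return data


def parse_as_raw_text(text: str) -> dict:
    """Last resort: keep the text and use the first non-empty line as the title."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {"title": lines[0] if lines else "Untitled", "raw_text": text}


def parse_with_retries(text: str, strategies: list[Strategy]) -> tuple[str, dict]:
    """Return the name of the first strategy that succeeds, with its result."""
    errors = []
    for name, parse in strategies:
        try:
            return name, parse(text)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            errors.append(f"{name}: {exc}")
    raise ValueError("all strategies failed: " + "; ".join(errors))


strategies = [("json", parse_as_json), ("raw-text", parse_as_raw_text)]
name, result = parse_with_retries("# Not JSON\nsome body", strategies)
print(name, result["title"])
```

Returning the strategy name alongside the result is what makes it possible to report which retry succeeded.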
### Continue on Error
Use `--continue-on-error` to process all files even when some fail:
```bash
note2json *.md --continue-on-error
```
This will:
- Process all files that can be parsed
- Report failures with detailed error messages
- Provide a summary of successful vs. failed files
- Exit with an appropriate error code
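The behaviour described above follows a common batch pattern: catch each failure, record it, and move on. A minimal sketch, assuming a caller-supplied `convert` function; `process_batch` is a hypothetical name and does not reflect the package's internals.

```python
from pathlib import Path
from typing import Callable


def process_batch(
    paths: list[Path],
    convert: Callable[[Path], None],
    continue_on_error: bool = False,
) -> tuple[list[Path], dict[Path, str]]:
    """Convert each path; collect successes and per-file error messages."""
    succeeded: list[Path] = []
    failed: dict[Path, str] = {}
    for path in paths:
        try:
            convert(path)
            succeeded.append(path)
        except Exception as exc:
            failed[path] = f"{type(exc).__name__}: {exc}"
            if not continue_on_error:
                break  # default behaviour in this sketch: stop at the first failure
    return succeeded, failed


def fake_convert(path: Path) -> None:
    """Stand-in converter that rejects one file, for demonstration."""
    if path.name == "bad.md":
        raise ValueError("cannot parse")


files = [Path("a.md"), Path("bad.md"), Path("c.md")]
ok, errors = process_batch(files, fake_convert, continue_on_error=True)
print(f"{len(ok)} succeeded, {len(errors)} failed")
```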
### Enhanced Progress Reporting
Use `--verbose` for detailed progress information with time estimation:
```bash
note2json *.md --verbose
```
The verbose output shows:
- Current file being processed
- Progress counter (e.g., [3/10])
- Visual progress bar with percentage
- Estimated time remaining (ETA)
- Summary of results
- Error breakdown by type
- Troubleshooting tips for common issues
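For readers curious how a counter, bar, and ETA fit together, here is an illustrative sketch. The exact output format of `--verbose` is not specified here, so the layout below and the `format_progress` helper are assumptions.

```python
def format_progress(done: int, total: int, elapsed: float, width: int = 20) -> str:
    """Render a counter, a bar, a percentage, and a naive ETA."""
    fraction = done / total if total else 1.0
    # Integer arithmetic avoids floating-point rounding in the bar length.
    filled = width * done // total if total else width
    bar = "#" * filled + "-" * (width - filled)
    # Naive ETA: assume remaining files take the average time seen so far.
    eta = (elapsed / done) * (total - done) if done else 0.0
    return f"[{done}/{total}] [{bar}] {fraction:.0%} ETA {eta:.1f}s"


print(format_progress(3, 10, elapsed=6.0))
```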
## Output Schema
The tool outputs structured JSON with the following schema:
```json
{
"title": "string",
"timestamp": "ISO 8601 date-time",
"raw_text": "string",
"plain_text": "string",
"tags": ["string"],
"headers": ["string"],
"date": "string (optional)",
"tone": "string (optional)",
"summary": "string (optional)",
"reflections": ["string (optional)"]
}
```
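Consumers of the output may want a quick sanity check before using a parsed note. The sketch below assumes that every field not marked "(optional)" above is required; `check_note` is a hypothetical helper, not part of the package.

```python
from datetime import datetime

# Assumption: fields not marked "(optional)" in the schema are required.
REQUIRED = {
    "title": str, "timestamp": str, "raw_text": str,
    "plain_text": str, "tags": list, "headers": list,
}
OPTIONAL = {"date": str, "tone": str, "summary": str, "reflections": list}


def check_note(note: dict) -> list[str]:
    """Return a list of problems; an empty list means the note looks valid."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in note:
            problems.append(f"missing required field '{key}'")
        elif not isinstance(note[key], expected):
            problems.append(f"'{key}' should be {expected.__name__}")
    for key, expected in OPTIONAL.items():
        if key in note and not isinstance(note[key], expected):
            problems.append(f"'{key}' should be {expected.__name__}")
    if isinstance(note.get("timestamp"), str):
        try:
            # Note: a trailing "Z" is only accepted from Python 3.11 onward.
            datetime.fromisoformat(note["timestamp"])
        except ValueError:
            problems.append("'timestamp' is not an ISO 8601 date-time")
    return problems


example = {
    "title": "Morning Pages",
    "timestamp": "2025-08-11T23:20:22+00:00",
    "raw_text": "# Morning Pages",
    "plain_text": "Morning Pages",
    "tags": ["journal"],
    "headers": ["Morning Pages"],
}
print(check_note(example))
```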
## Input Format Support
### Markdown/Text
- **Headers**: Extracts `# Title` lines as headers
- **Metadata**: Parses `**Date:**`, `**Tags:**`, `**Tone:**` fields
- **Summary**: Extracts content between `**Summary:**` and `---`
- **Reflections**: Extracts bullet points after `**Core Reflections:**`
### JSON
- **Schema validation**: Ensures output matches the required schema
- **Auto-normalization**: Converts arbitrary JSON to schema format
- **Format detection**: Automatically identifies JSON vs. text content
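One simple way to tell JSON from text content is to attempt a parse. The sketch below shows that heuristic; it is an assumption about how detection could work, and `detect_format` is a hypothetical name rather than the package's API.

```python
import json


def detect_format(text: str) -> str:
    """Return "json" if the text parses as a JSON object or array, else "txt"."""
    stripped = text.lstrip("\ufeff \t\r\n")  # ignore a BOM and leading whitespace
    if not stripped.startswith(("{", "[")):
        return "txt"
    try:
        json.loads(stripped)
    except json.JSONDecodeError:
        # Looks like JSON at first glance (e.g. a Markdown link) but is not.
        return "txt"
    return "json"


print(detect_format('{"title": "a"}'), detect_format("[docs](https://example.com)"))
```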
## Encoding Support
- **UTF-8**: Standard encoding with BOM support
- **UTF-16**: Little-endian and big-endian variants
- **Fallback detection**: Uses chardet for automatic encoding detection
- **Error handling**: Graceful degradation with detailed error reporting
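A typical fallback chain for decoding looks like the sketch below, which uses only the standard library. It is an illustration under assumptions, not the package's code: `decode_bytes` is a hypothetical helper, BOM-less UTF-16 is not handled, and the final Latin-1 step stands in for where a detector such as chardet could be consulted.

```python
import codecs

# BOM -> codec name; "utf-8-sig" and "utf-16" strip the BOM while decoding.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_bytes(data: bytes) -> tuple[str, str]:
    """Return (text, encoding): BOM sniffing, then strict UTF-8, then a fallback."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return data.decode(encoding), encoding
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this step never raises.
        return data.decode("latin-1"), "latin-1"


print(decode_bytes("héllo".encode("utf-16")))
```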
## Development
### Installation
```bash
git clone https://github.com/Mugiwara555343/note2json.git
cd note2json
pip install -e .
```
### Testing
```bash
# Run all tests
pytest
# Run integration tests only
pytest -m integration
# Run with coverage
pytest --cov=note_to_json
```
### Code Quality
```bash
# Format code
black note_to_json/ tests/
# Sort imports
isort note_to_json/ tests/
# Run pre-commit hooks
pre-commit run --all-files
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Ensure all tests pass
6. Submit a pull request
## License
MIT License - see the [LICENSE](LICENSE) file for details.
## Changelog
### v0.2.2
- **Resilience improvements**: Better error handling and recovery
- **Continue on error**: Process remaining files even when some fail
- **Progress reporting**: Detailed feedback for batch operations
- **Enhanced encoding detection**: Fallback mechanisms and better error messages
- **Validation fixes**: Automatic correction of common schema issues
- **Error categorization**: Grouped error reporting for better analysis
### v0.2.1
- Improved encoding detection
- Better error messages
- Enhanced JSON passthrough
### v0.2.0
- Added JSON input support
- Improved encoding handling
- Better error reporting
### v0.1.0
- Initial release
- Basic Markdown parsing
- CLI interface
Raw data
{
"_id": null,
"home_page": null,
"name": "note-to-json",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "markdown, json, cli, privacy, notes",
"author": "Note to JSON Team",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/fe/d3/a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb/note_to_json-0.2.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Convert Markdown or text files to structured JSON, offline.",
"version": "0.2.3",
"project_urls": {
"Homepage": "https://github.com/Mugiwara555343/note2json",
"Issues": "https://github.com/Mugiwara555343/note2json/issues",
"Repository": "https://github.com/Mugiwara555343/note2json"
},
"split_keywords": [
"markdown",
" json",
" cli",
" privacy",
" notes"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "412347e564d5109c74ef44f85c8a8a346b87ac4950d6d6562dbcee39de88b877",
"md5": "a016abc34174081c6b40c40e931dc0b9",
"sha256": "f1f8f09d3bb901e1908368e2753fb8fc75703c7edc1e870f95ad20209f15ad50"
},
"downloads": -1,
"filename": "note_to_json-0.2.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a016abc34174081c6b40c40e931dc0b9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 16774,
"upload_time": "2025-08-11T23:20:22",
"upload_time_iso_8601": "2025-08-11T23:20:22.619750Z",
"url": "https://files.pythonhosted.org/packages/41/23/47e564d5109c74ef44f85c8a8a346b87ac4950d6d6562dbcee39de88b877/note_to_json-0.2.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "fed3a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb",
"md5": "eb3875f1c334fc0d7e69e944ecc1a0d1",
"sha256": "7b5450b2a5097d451624cd5b533e26be6ed4c29510b613cc200037fd297ed24b"
},
"downloads": -1,
"filename": "note_to_json-0.2.3.tar.gz",
"has_sig": false,
"md5_digest": "eb3875f1c334fc0d7e69e944ecc1a0d1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 26934,
"upload_time": "2025-08-11T23:20:24",
"upload_time_iso_8601": "2025-08-11T23:20:24.043756Z",
"url": "https://files.pythonhosted.org/packages/fe/d3/a0afff38f823394264e1832e74e74e2b7eb936b174fc81ffc8652f1223cb/note_to_json-0.2.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-11 23:20:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Mugiwara555343",
"github_project": "note2json",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "note-to-json"
}