sem-meta


- **Name**: sem-meta
- **Version**: 0.1.0
- **Home page**: https://github.com/asaleh33/sem-meta
- **Summary**: Unified interface for SEM image processing: metadata extraction, OCR-based pixel size estimation, and unit conversion
- **Upload time**: 2025-09-05 15:12:45
- **Author**: Ahmed Khalil, Tommaso Rodani
- **Requires Python**: >=3.7
- **License**: MIT
- **Keywords**: SEM, microscopy, image processing, metadata extraction, OCR, pixel size, scientific imaging, electron microscopy

# sem-meta

A unified Python package for SEM (Scanning Electron Microscopy) image processing, providing metadata extraction, OCR-based pixel size estimation, and unit conversion utilities.

## Features

- **SEMMetaData**: Extract and format metadata from SEM images
- **SEMOCR**: OCR-based pixel size estimation from scale bars
- **ConvertPS**: Unit conversion and error analysis for pixel size data
- **FullSEMKeys**: Standardized metadata key list for SEM images

## Installation

Install from PyPI:

```bash
pip install sem-meta
```

## Quick Start

```python
from sem_meta import SEMMeta, OCRPS, ConvertScale, FullSEMKeys

# Extract metadata from SEM images
metadata = SEMMeta.extract_metadata("path/to/sem/image.tif")

# Perform OCR on scale bars
pixel_size = OCRPS.extract_pixel_size("path/to/image/with/scalebar.tif")

# Convert units
converted_size = ConvertScale.convert_units(pixel_size, "μm", "nm")

# Access standardized SEM metadata keys
sem_keys = FullSEMKeys
```

## Main Components

### SEMMetaData
Extracts and processes metadata from SEM image files, particularly TIFF files with EXIF data.

### SEMOCR
Uses OCR (Optical Character Recognition) to extract pixel size information from scale bars in SEM images. Includes noise filtering and error correction for common OCR mistakes.
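
The kind of error correction described above can be sketched in plain Python. This is an illustration only, not the package's implementation: `parse_scale_label`, `pixel_size_from_bar`, and the correction tables are hypothetical names and assumptions, and the OCR text is taken as an input string rather than produced by Tesseract.

```python
import re

MICRO = "\u00b5m"  # micrometre written with the micro sign

# Assumed look-alike fixes, applied only to the numeric part of the label.
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", ",": "."})

# Assumed unit spellings; "um", the micro sign, and Greek mu all map to MICRO.
UNIT_ALIASES = {"nm": "nm", "um": MICRO, "\u00b5m": MICRO, "\u03bcm": MICRO, "mm": "mm"}

LABEL_RE = re.compile(r"^\s*([0-9OolI.,]+)\s*([A-Za-z\u00b5\u03bc]+)\s*$")


def parse_scale_label(text):
    """Return (value, unit) from OCR text such as '1O um', or None if unparseable."""
    match = LABEL_RE.match(text)
    if match is None:
        return None
    number, unit = match.groups()
    unit = UNIT_ALIASES.get(unit.lower())
    if unit is None:
        return None
    try:
        value = float(number.translate(DIGIT_FIXES))
    except ValueError:
        return None
    return value, unit


def pixel_size_from_bar(bar_length, bar_width_px):
    """Physical length per pixel: scale-bar length divided by its width in pixels."""
    if bar_width_px <= 0:
        raise ValueError("bar_width_px must be positive")
    return bar_length / bar_width_px


label = parse_scale_label("1O um")  # the letter O misread in place of a zero
if label is not None:
    value, unit = label
    print(pixel_size_from_bar(value, 200), unit, "per pixel")
```

Restricting the digit fixes to the numeric part keeps a legitimate letter in the unit from being rewritten as a digit.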

### ConvertPS
Handles unit conversion and normalization for pixel size measurements, supporting various scientific units commonly used in microscopy.
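
As a rough sketch of what such a conversion involves, the snippet below converts between length units through powers of ten and computes a relative error between two pixel size values. The function names `convert_length` and `relative_error` and the unit table are assumptions for illustration; they are not the `ConvertPS` API.

```python
# Power-of-ten exponent of each unit relative to the metre (assumed unit set).
UNIT_EXPONENT = {
    "m": 0,
    "mm": -3,
    "um": -6,
    "\u00b5m": -6,  # micro sign
    "\u03bcm": -6,  # Greek mu
    "nm": -9,
    "\u00c5": -10,  # angstrom
    "pm": -12,
}


def convert_length(value, from_unit, to_unit):
    """Convert a length between two units listed in UNIT_EXPONENT."""
    try:
        shift = UNIT_EXPONENT[from_unit] - UNIT_EXPONENT[to_unit]
    except KeyError as exc:
        raise ValueError(f"unsupported unit: {exc.args[0]!r}") from None
    return value * 10.0 ** shift


def relative_error(estimate, reference):
    """Relative deviation of an estimated pixel size from a reference value."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(estimate - reference) / abs(reference)


print(convert_length(0.5, "\u03bcm", "nm"))  # 500.0
```

Both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are accepted because the two characters look identical but compare as different strings.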

### FullSEMKeys
Provides a standardized set of metadata keys for consistent SEM image annotation and data extraction.
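
One way to picture the role of a standardized key list is a mapping from vendor-specific field names onto a fixed set of keys. The keys and aliases below are hypothetical examples chosen for illustration; they are not the contents of `FullSEMKeys`.

```python
# Hypothetical standardized keys; the real FullSEMKeys list may differ.
STANDARD_KEYS = ("pixel_size", "accelerating_voltage", "magnification", "working_distance")

# Hypothetical vendor-specific spellings mapped onto the standardized keys.
ALIASES = {
    "pixelwidth": "pixel_size",
    "pixel size": "pixel_size",
    "hv": "accelerating_voltage",
    "eht": "accelerating_voltage",
    "mag": "magnification",
    "wd": "working_distance",
}


def standardize(raw):
    """Map raw metadata keys onto STANDARD_KEYS; keys not found become None."""
    result = dict.fromkeys(STANDARD_KEYS)
    for key, value in raw.items():
        name = key.strip().lower()
        name = ALIASES.get(name, name.replace(" ", "_"))
        if name in result:
            result[name] = value
    return result


print(standardize({"HV": 5000, "Mag": "10000x", "Operator": "n/a"}))
```

Every output dictionary then has the same keys regardless of which instrument produced the image, which makes records from different microscopes directly comparable.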

## Dependencies

- numpy: Numerical computations
- PIL (Pillow): Image processing
- pymysql: SQL safety utilities
- termcolor: Terminal output styling
- matplotlib: Visualization
- opencv-python: Advanced image preprocessing
- pytesseract: OCR functionality

## Requirements

- Python 3.7+
- Tesseract OCR engine (for OCR functionality)

### Installing Tesseract

**Ubuntu/Debian:**
```bash
sudo apt install tesseract-ocr
```

**macOS:**
```bash
brew install tesseract
```

**Windows:**
Download and install from: https://github.com/UB-Mannheim/tesseract/wiki
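
After installing, it can help to confirm that the `tesseract` executable is reachable before running any OCR. The helper below is a small sketch using only the standard library; `tesseract_version` is a hypothetical name and not part of sem-meta.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if it is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    proc = subprocess.run([exe, "--version"], capture_output=True, text=True, check=False)
    # Some Tesseract releases print the version to stderr instead of stdout.
    output = (proc.stdout or proc.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
print(version if version else "Tesseract not found on PATH")
```

If the engine is installed outside `PATH` (common on Windows), pytesseract can be pointed at it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.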

## Usage Examples

### Extracting SEM Metadata

```python
from sem_meta import SEMMeta

# Initialize and extract metadata
sem_processor = SEMMeta
metadata = sem_processor.extract_metadata("sample.tif")
print(metadata)
```

### OCR-based Scale Bar Reading

```python
from sem_meta import OCRPS

# Extract pixel size from scale bar
ocr_processor = OCRPS
pixel_size = ocr_processor.extract_pixel_size("sem_image.tif")
print(f"Pixel size: {pixel_size}")
```

### Unit Conversion

```python
from sem_meta import ConvertScale

# Convert between units
converter = ConvertScale
result = converter.convert_units("0.5 μm", "nm")
print(f"Converted: {result}")
```

## File Structure

```
sem-meta/
├── src/
│   └── sem_meta/
│       ├── __init__.py
│       ├── metadata_Module.py    # SEMMetaData class
│       ├── ocr_Module.py         # SEMOCR class
│       ├── convert_Module.py     # ConvertPS class
│       ├── SEMKEYS.py           # FullSEMKeys definitions
│       └── OCR_NOISE_DB.py      # OCR noise patterns database
├── README.md
├── LICENSE
└── pyproject.toml
```

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Authors

- Ahmed Khalil - Initial work

## Acknowledgments

- Built for the SEM imaging community
- Supports various SEM manufacturers' metadata formats
- Includes extensive OCR noise pattern recognition

## Support

If you encounter any problems or have questions, please open an issue on the GitHub repository.

## Raw data

{
    "_id": null,
    "home_page": "https://github.com/asaleh33/sem-meta",
    "name": "sem-meta",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "Ahmed Khalil <salehastro@gmail.com>, Tommaso Rodani <tommaso.rodani.phd@gmail.com>",
    "keywords": "SEM, microscopy, image processing, metadata extraction, OCR, pixel size, scientific imaging, electron microscopy",
    "author": "Ahmed Khalil, Tommaso Rodani",
    "author_email": "Ahmed Khalil <salehastro@gmail.com>, Tommaso Rodani <tommaso.rodani.phd@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/ca/a2/5b8cee70affe9ab67e39834aca51a72a6e70932a592971cc9a17dc2c1046/sem_meta-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Unified interface for SEM image processing: metadata extraction, OCR-based pixel size estimation, and unit conversion",
    "version": "0.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/asaleh33/sem-meta/issues",
        "Documentation": "https://github.com/asaleh33/sem-meta#readme",
        "Homepage": "https://github.com/asaleh33/sem-meta",
        "Repository": "https://github.com/asaleh33/sem-meta"
    },
    "split_keywords": [
        "sem",
        " microscopy",
        " image processing",
        " metadata extraction",
        " ocr",
        " pixel size",
        " scientific imaging",
        " electron microscopy"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c89e9f05007f5ed504129bbf9833228d3649c4ef2fa925301d371f7bd6f7f5a3",
                "md5": "50017c1ba9f6f448ad6c2310cd693dec",
                "sha256": "f0af9512333ac5c6c720331c6484deceec2f68679621755ff8d78fb50ad3d918"
            },
            "downloads": -1,
            "filename": "sem_meta-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "50017c1ba9f6f448ad6c2310cd693dec",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 32000,
            "upload_time": "2025-09-05T15:12:44",
            "upload_time_iso_8601": "2025-09-05T15:12:44.180491Z",
            "url": "https://files.pythonhosted.org/packages/c8/9e/9f05007f5ed504129bbf9833228d3649c4ef2fa925301d371f7bd6f7f5a3/sem_meta-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "caa25b8cee70affe9ab67e39834aca51a72a6e70932a592971cc9a17dc2c1046",
                "md5": "a8de57064753bebc00bcc5e805a6d3ef",
                "sha256": "c49f11805f0814c67ecda223151c63cf8418dd473b084706377fa16f3062ea2e"
            },
            "downloads": -1,
            "filename": "sem_meta-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a8de57064753bebc00bcc5e805a6d3ef",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 37740,
            "upload_time": "2025-09-05T15:12:45",
            "upload_time_iso_8601": "2025-09-05T15:12:45.741303Z",
            "url": "https://files.pythonhosted.org/packages/ca/a2/5b8cee70affe9ab67e39834aca51a72a6e70932a592971cc9a17dc2c1046/sem_meta-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-05 15:12:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "asaleh33",
    "github_project": "sem-meta",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sem-meta"
}
        