passport-mrz-extractor


:Name: passport-mrz-extractor
:Version: 1.0.13
:Home page: https://github.com/Azim-Kenzh/passport_mrz_extractor
:Summary: A Python library for reading MRZ data from passport images using Tesseract OCR
:Upload time: 2024-12-03 12:44:50
:Author: Azimkozho Kenzhebek uulu
:Requires Python: >=3.10
:License: MIT
:Keywords: mrz, passport, ocr, tesseract, image-processing
:Requirements: Pillow, pytesseract, mrz

passport_mrz_extractor
======================

``passport_mrz_extractor`` is a Python library for extracting and validating Machine Readable Zone (MRZ) data from passport images.
It reads the MRZ text with Tesseract OCR and validates it with the ``mrz`` library.
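
To make the idea of "MRZ data" concrete, the sketch below splits the two 44-character lines of a passport (TD3) MRZ into fields by fixed position, following the layout in ICAO Doc 9303. It is an illustration only and not this library's internal code: ``parse_td3`` and the dictionary keys are hypothetical names, and the sample lines are the ICAO specimen passport.

```python
def parse_td3(line1: str, line2: str) -> dict[str, str]:
    """Split the two lines of a TD3 (passport) MRZ into named fields."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("a TD3 MRZ has two lines of exactly 44 characters")

    # Line 1: document type (2), issuing country (3), then SURNAME<<GIVEN<NAMES.
    surname, _, given = line1[5:].partition("<<")
    return {
        "document_type": line1[0:2].rstrip("<"),
        "country": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        # Line 2: fixed-width fields; the check digits that follow
        # some fields are skipped in this sketch.
        "document_number": line2[0:9].rstrip("<"),
        "nationality": line2[10:13],
        "birth_date": line2[13:19],   # YYMMDD
        "sex": line2[20],
        "expiry_date": line2[21:27],  # YYMMDD
    }


# ICAO Doc 9303 specimen MRZ; ljust pads line 1 with "<" filler to 44 characters.
LINE1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"

fields = parse_td3(LINE1, LINE2)
print(fields["surname"], fields["given_names"], fields["document_number"])
```

Real OCR output is noisier than this specimen, which is why a validation step is applied after reading.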

Features
--------

- Extract MRZ data from passport images.
- Validate MRZ data fields, including document type, name, nationality, date of birth, and expiry date.
- Automatic image processing for better OCR accuracy.
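
Part of validating MRZ fields is verifying the check digits defined in ICAO Doc 9303. The following is a minimal sketch of that computation, written independently of how this library or ``mrz`` implement it; ``mrz_check_digit`` and ``field_is_valid`` are illustrative names, and the sample values come from the ICAO specimen passport.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO Doc 9303 check digit for one MRZ field.

    Digits count as themselves, A-Z as 10-35, and the filler "<" as 0.
    Values are multiplied by the repeating weights 7, 3, 1 and summed;
    the check digit is the sum modulo 10.
    """
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10


def field_is_valid(field: str, check_char: str) -> bool:
    """Return True if check_char is the correct check digit for field."""
    return check_char.isascii() and check_char.isdigit() and int(check_char) == mrz_check_digit(field)


# Fields and check digits taken from the ICAO specimen passport.
print(field_is_valid("L898902C3", "6"))  # document number
print(field_is_valid("740812", "2"))     # date of birth
print(field_is_valid("120415", "9"))     # expiry date
```

A single misread character (for example ``O`` read as ``0``) usually changes the computed digit, which is how OCR errors in these fields are caught.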

Installation
------------

Install ``passport_mrz_extractor`` with ``pip``:

.. code-block:: bash

    pip install passport_mrz_extractor

Requirements
------------

- **Python** >= 3.10
- **Tesseract OCR** installed on your system

To install Tesseract:

- **Ubuntu**: ``sudo apt install tesseract-ocr``
- **macOS (using Homebrew)**: ``brew install tesseract``
- **Windows**: Download the installer from https://github.com/UB-Mannheim/tesseract/wiki

Dependencies
------------

This library requires the following Python packages:

- ``pytesseract`` - For performing OCR on images.
- ``opencv-python`` - For image processing.
- ``mrz`` - For MRZ data validation.
- ``Pillow`` - For handling image files in Python.
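
The package names above differ in two cases from the module names used in ``import`` statements (``opencv-python`` is imported as ``cv2`` and ``Pillow`` as ``PIL``). A small sketch, using only the standard library, that reports which of them are not importable in the current environment; ``missing_packages`` is an illustrative helper name.

```python
from importlib.util import find_spec

# Distribution name on PyPI -> top-level module name used in "import".
IMPORT_NAMES = {
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "mrz": "mrz",
    "Pillow": "PIL",
}


def missing_packages(import_names: dict[str, str]) -> list[str]:
    """Return the distribution names whose module cannot be found."""
    return [dist for dist, module in import_names.items() if find_spec(module) is None]


missing = missing_packages(IMPORT_NAMES)
print("Missing packages:", ", ".join(missing) or "none")
```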

Usage
-----

The examples below show how to use ``passport_mrz_extractor`` to extract MRZ data from a passport image.

Basic Example
~~~~~~~~~~~~~

This example demonstrates extracting all available MRZ fields from an image and handling potential errors.

.. code-block:: python

    from passport_mrz_extractor import read_mrz

    # Path to the passport image
    image_path = 'path/to/passport_image.jpg'

    try:
        mrz_data = read_mrz(image_path)
        print("Extracted MRZ Data:")
        for key, value in mrz_data.items():
            print(f"{key}: {value}")
    except ValueError as e:
        print(f"Error reading MRZ: {e}")

### Example of Using Specific MRZ Fields

This example reads specific fields, such as the country, document number, and birth date, and prints each one with a label.

.. code-block:: python

    from passport_mrz_extractor import read_mrz

    # Path to the passport image
    image_path = 'path/to/passport_image.jpg'

    try:
        # Extract MRZ data
        mrz_data = read_mrz(image_path)

        # Display specific fields
        print("Country of Issue:", mrz_data.get("country"))
        print("Document Number:", mrz_data.get("document_number"))
        print("Name:", mrz_data.get("name"))
        print("Surname:", mrz_data.get("surname"))
        print("Date of Birth:", mrz_data.get("birth_date"))
        print("Expiry Date:", mrz_data.get("expiry_date"))
        print("Nationality:", mrz_data.get("nationality"))
        print("Sex:", mrz_data.get("sex"))

    except ValueError as e:
        print(f"Error reading MRZ: {e}")

Contributing
------------

If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are welcome.

Issues
------

If you encounter any issues, please report them on the GitHub repository:

https://github.com/Azim-Kenzh/passport_mrz_extractor/issues

License
-------

``passport_mrz_extractor`` is licensed under the MIT License.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/Azim-Kenzh/passport_mrz_extractor",
    "name": "passport-mrz-extractor",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "MRZ passport OCR Tesseract image-processing",
    "author": "Azimkozho Kenzhebek uulu",
    "author_email": "azimkozho.inventor@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/28/79/54ea90e1b001576c9300e1cf24885215f26646a18f017508b93936a42089/passport_mrz_extractor-1.0.13.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python library for reading MRZ data from passport images using Tesseract OCR",
    "version": "1.0.13",
    "project_urls": {
        "Homepage": "https://github.com/Azim-Kenzh/passport_mrz_extractor"
    },
    "split_keywords": [
        "mrz",
        "passport",
        "ocr",
        "tesseract",
        "image-processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "287954ea90e1b001576c9300e1cf24885215f26646a18f017508b93936a42089",
                "md5": "36add40eb88164792ebfa8d0f19d06f4",
                "sha256": "10ab904e47b6b17d5462984d6168d0ab664cbda8d06c95310c1de929c0ee8d93"
            },
            "downloads": -1,
            "filename": "passport_mrz_extractor-1.0.13.tar.gz",
            "has_sig": false,
            "md5_digest": "36add40eb88164792ebfa8d0f19d06f4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 3949,
            "upload_time": "2024-12-03T12:44:50",
            "upload_time_iso_8601": "2024-12-03T12:44:50.271731Z",
            "url": "https://files.pythonhosted.org/packages/28/79/54ea90e1b001576c9300e1cf24885215f26646a18f017508b93936a42089/passport_mrz_extractor-1.0.13.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-03 12:44:50",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Azim-Kenzh",
    "github_project": "passport_mrz_extractor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "Pillow",
            "specs": [
                [
                    "==",
                    "11.0.0"
                ]
            ]
        },
        {
            "name": "pytesseract",
            "specs": [
                [
                    "==",
                    "0.3.13"
                ]
            ]
        },
        {
            "name": "mrz",
            "specs": [
                [
                    "==",
                    "0.6.2"
                ]
            ]
        }
    ],
    "lcname": "passport-mrz-extractor"
}
        