piechartocr


| Field | Value |
|---|---|
| Name | piechartocr |
| Version | 0.6.6 |
| home_page | https://git.ehtec.co/research/pie-chart-ocr |
| Summary | Pie Chart Optical Character Recognition |
| upload_time | 2023-06-18 21:19:36 |
| maintainer | |
| docs_url | None |
| author | Elias Hohl |
| requires_python | |
| license | MIT |
| keywords | pie chart parsing ocr |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# pie-chart-ocr
A tool to extract tabular data from pie charts, developed as a component of the CryptoSearchTools toolkit.

Note: The original repository was moved to https://git.ehtec.co/research/pie-chart-ocr.
https://github.com/ehtec/pie-chart-ocr is a mirror.

# Installation

### Install via PyPI

You can install any tagged version of `piechartocr` from PyPI:

```commandline
python3 -m pip install --upgrade piechartocr
```

Note: The PyPI installation does not include the tests and examples. To run them,
download the required files from GitLab.
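
To confirm which version pip actually installed, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of `piechartocr`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is missing.
print(installed_version("piechartocr"))
```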

### Build from source

Install Boost, Tesseract, the compiler toolchain, and Git:

```commandline
sudo apt install libboost-system-dev tesseract-ocr build-essential git
```

Clone this repository including submodules:

```commandline
git clone --recursive https://github.com/ehtec/pie-chart-ocr.git
cd pie-chart-ocr
```

Install Python requirements:

```commandline
python3 -m pip install -r requirements.txt
```

Compile libraries:

```commandline
python3 setup.py build_ext
```

Create temporary directories:

```commandline
mkdir temp
mkdir temp1
mkdir temp2
```

Unpack test charts:

```commandline
unzip data/charts_steph.zip -d data
unzip data/charts_steph_upsampled.zip -d data
unzip data/generated_pie_charts_legend.zip -d data
unzip data/generated_pie_charts_without_legend.zip -d data
```
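
The two preparation steps above (creating the temporary directories and unpacking the test charts) can also be scripted, which makes them safe to repeat. The sketch below uses only the standard library; the function name `prepare_checkout` is hypothetical, and the directory and archive names are copied from the shell commands above. It skips archives that are not present rather than failing.

```python
import zipfile
from pathlib import Path

# Directory and archive names are taken from the shell commands above.
TEMP_DIRS = ("temp", "temp1", "temp2")
ARCHIVES = (
    "charts_steph.zip",
    "charts_steph_upsampled.zip",
    "generated_pie_charts_legend.zip",
    "generated_pie_charts_without_legend.zip",
)


def prepare_checkout(repo_root):
    """Create the temp directories and unpack the test archives that exist.

    Returns the names of the archives that were extracted.
    """
    root = Path(repo_root)
    for name in TEMP_DIRS:
        # exist_ok makes the step repeatable, unlike a bare `mkdir`.
        (root / name).mkdir(parents=True, exist_ok=True)

    data_dir = root / "data"
    extracted = []
    for name in ARCHIVES:
        archive = data_dir / name
        if archive.is_file():
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(data_dir)
            extracted.append(name)
    return extracted
```

Calling `prepare_checkout(".")` from the root of the cloned repository would be the equivalent of running the `mkdir` and `unzip` commands by hand.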

# Usage

Run unit tests:

```commandline
python3 -m nose2 --start-dir tests/ --with-coverage
```

Run legacy tests / examples:

```commandline
python3 run_examples.py
```

Generate test data (mock pie charts):

```commandline
python3 run_generate_test_data.py
```

To extract data from any pie chart:

```python
from piechartocr import pie_chart_ocr

# Path to pie chart
path = "/path/to/my/chart.png"

# Extract data
data = pie_chart_ocr.main(path, interactive=False)

# Print the extracted list of tuples of the form [(percentage / 100, label)]
print(data["res"])
```
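
Since `data["res"]` holds fractions rather than percentages, a small post-processing step is often useful before displaying the result. The sketch below assumes each entry is a `(fraction, label)` pair with a numeric fraction, as the comment above describes; `format_shares` is a hypothetical helper, and the sample list is made-up data rather than real output of the tool.

```python
def format_shares(res):
    """Render (fraction, label) pairs as 'label: NN.N%' lines, largest first."""
    rows = sorted(res, key=lambda pair: pair[0], reverse=True)
    return [f"{label}: {fraction * 100:.1f}%" for fraction, label in rows]


# Hypothetical sample standing in for data["res"]; not real tool output.
sample = [(0.25, "Bitcoin"), (0.6, "Ethereum"), (0.15, "Other")]
for line in format_shares(sample):
    print(line)
```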

# Metrics

These metrics are generated automatically by the CI pipeline.

Metrics for mock pie charts with legend:

![chart](https://git.ehtec.co/research/pie-chart-ocr/-/jobs/artifacts/main/raw/artifacts/ocr_test_metrics_mock_legend.png?job=generatemetrics)

Metrics for mock pie charts without legend:

![chart](https://git.ehtec.co/research/pie-chart-ocr/-/jobs/artifacts/main/raw/artifacts/ocr_test_metrics_mock_without_legend.png?job=generatemetrics)

Metrics for real-world pie charts (many of them of very poor quality, some unreadable even for humans):

![chart](https://git.ehtec.co/research/pie-chart-ocr/-/jobs/artifacts/main/raw/artifacts/ocr_test_metrics.png?job=generatemetrics)



            

Raw data

{
    "_id": null,
    "home_page": "https://git.ehtec.co/research/pie-chart-ocr",
    "name": "piechartocr",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "pie chart parsing ocr",
    "author": "Elias Hohl",
    "author_email": "elias.hohl@ehtec.co",
    "download_url": "https://files.pythonhosted.org/packages/40/b1/e68d1b3f89b82f2e1bee5af43c2a8f762ace4b9ef64143f92e0437ded264/piechartocr-0.6.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Pie Chart Optical Character Recognition",
    "version": "0.6.6",
    "project_urls": {
        "Download": "https://git.ehtec.co/research/pie-chart-ocr/-/archive/v0.6.6-beta/pie-chart-ocr-v0.6.6-beta.zip",
        "Homepage": "https://git.ehtec.co/research/pie-chart-ocr"
    },
    "split_keywords": [
        "pie",
        "chart",
        "parsing",
        "ocr"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "40b1e68d1b3f89b82f2e1bee5af43c2a8f762ace4b9ef64143f92e0437ded264",
                "md5": "9b1e24e0c370f1d94fa085229c4999e9",
                "sha256": "dceda30533b8db3c614f44fc9202d571dac305d9c62d683784c1add7f8d4b498"
            },
            "downloads": -1,
            "filename": "piechartocr-0.6.6.tar.gz",
            "has_sig": false,
            "md5_digest": "9b1e24e0c370f1d94fa085229c4999e9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 95132295,
            "upload_time": "2023-06-18T21:19:36",
            "upload_time_iso_8601": "2023-06-18T21:19:36.772102Z",
            "url": "https://files.pythonhosted.org/packages/40/b1/e68d1b3f89b82f2e1bee5af43c2a8f762ace4b9ef64143f92e0437ded264/piechartocr-0.6.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-18 21:19:36",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "piechartocr"
}
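
The `sha256` digest recorded in the raw data can be used to check a downloaded copy of `piechartocr-0.6.6.tar.gz`. The following is a minimal sketch using `hashlib`; the helper names `sha256_of` and `matches_recorded_digest` are hypothetical, and the expected value is copied from the `digests` entry above.

```python
import hashlib

# Copied from the "sha256" field of the raw data above.
EXPECTED_SHA256 = "dceda30533b8db3c614f44fc9202d571dac305d9c62d683784c1add7f8d4b498"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat for the roughly 95 MB sdist.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` has the recorded SHA-256 digest."""
    return sha256_of(path) == expected
```

For example, `matches_recorded_digest("piechartocr-0.6.6.tar.gz")` should return `True` for an intact download and `False` for a corrupted or altered one.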
        