echr-extractor

| Field | Value |
| --- | --- |
| Name | echr-extractor |
| Version | 1.0.46 |
| home_page | None |
| Summary | Python library for extracting case law data from the European Court of Human Rights (ECHR) HUDOC database |
| upload_time | 2025-10-09 09:00:43 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | Apache-2.0 |
| keywords | echr, extractor, european, convention, human, rights, court, case-law, legal, hudoc, data-extraction |
| requirements | requests, pandas, beautifulsoup4, dateparser, tqdm |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# ECHR Extractor

Python library for extracting case law data from the European Court of Human Rights (ECHR) HUDOC database.

## Features

- Extract metadata for ECHR cases from the HUDOC database
- Download full text content for cases
- Support for custom date ranges and case ID ranges
- Multiple language support
- Generate nodes and edges for network analysis
- Flexible output formats (CSV, JSON, in-memory DataFrames)

## Installation

```bash
pip install echr-extractor
```

## Quick Start

```python
from echr_extractor import get_echr, get_echr_extra, get_nodes_edges

# Get basic metadata for cases
df = get_echr(start_id=0, count=100, language=['ENG'])

# Get metadata + full text
df, full_texts = get_echr_extra(start_id=0, count=100, language=['ENG'])

# Generate network data
nodes, edges = get_nodes_edges(df=df)
```

## Functions

### `get_echr`

Gets all available metadata for ECHR cases from the HUDOC database.

**Parameters:**
- `start_id` (int, optional): The ID of the first case to download (default: 0)
- `end_id` (int, optional): The ID of the last case to download (default: maximum available)
- `count` (int, optional): Number of cases per language to download (default: None)
- `start_date` (str, optional): Start publication date (yyyy-mm-dd) (default: None)
- `end_date` (str, optional): End publication date (yyyy-mm-dd) (default: current date)
- `verbose` (bool, optional): Show progress information (default: False)
- `fields` (list, optional): Limit metadata fields to download (default: all fields)
- `save_file` (str, optional): Save as CSV file ('y') or return DataFrame ('n') (default: 'y')
- `language` (list, optional): Languages to download (default: ['ENG'])
- `link` (str, optional): Direct HUDOC search URL (default: None)
- `query_payload` (str, optional): Direct API query payload (default: None)
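
As a small illustration of the parameter list above, the sketch below assembles keyword arguments for `get_echr` and rejects names that are not documented there. `build_get_echr_kwargs` and `GET_ECHR_PARAMS` are hypothetical helpers, not part of the library, and the chosen defaults (`save_file='n'`, English only) are assumptions made for the example.

```python
# Parameter names copied from the get_echr parameter list above.
GET_ECHR_PARAMS = {
    "start_id", "end_id", "count", "start_date", "end_date", "verbose",
    "fields", "save_file", "language", "link", "query_payload",
}


def build_get_echr_kwargs(**overrides):
    """Return keyword arguments for get_echr, rejecting undocumented names.

    Hypothetical helper for illustration; not provided by echr-extractor.
    """
    unknown = set(overrides) - GET_ECHR_PARAMS
    if unknown:
        raise TypeError(f"undocumented get_echr parameter(s): {sorted(unknown)}")
    # Assumed starting point: no CSV file on disk, English documents only.
    kwargs = {"save_file": "n", "language": ["ENG"]}
    kwargs.update(overrides)
    return kwargs


kwargs = build_get_echr_kwargs(count=50, verbose=True)
print(kwargs)

# With the package installed and network access available:
# from echr_extractor import get_echr
# df = get_echr(**kwargs)
```

Catching a misspelled parameter name before the request is sent avoids waiting on a download that was configured incorrectly.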

### `get_echr_extra`

Gets metadata and downloads full text for each case.

**Parameters:** Same as `get_echr` plus:
- `threads` (int, optional): Number of threads for parallel download (default: 10)
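
To give a feel for what a thread count controls, here is a generic sketch of a parallel download using the standard library. It is not the library's implementation: `fetch_text` is a stand-in that performs no network request, and `download_all` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_text(item_id):
    """Stand-in for a per-case download; a real version would issue an HTTP request."""
    return f"full text of {item_id}"


def download_all(item_ids, threads=10):
    """Fetch every item with a pool of worker threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map yields results in the same order as item_ids.
        return list(pool.map(fetch_text, item_ids))


texts = download_all(["001-57574", "placeholder-id"], threads=2)
print(texts)
```

Because downloads spend most of their time waiting on the network, more threads usually shorten the total run, up to whatever the server tolerates.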

### `get_nodes_edges`

Generates nodes and edges for network analysis from case metadata.

**Parameters:**
- `metadata_path` (str, optional): Path to metadata CSV file (default: None)
- `df` (DataFrame, optional): Metadata DataFrame (default: None)
- `save_file` (str, optional): Save as files ('y') or return objects ('n') (default: 'y')
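
The idea of nodes and edges can be sketched without the library. In the example below each case is a node and each citation between two known cases is an edge. The `cites` key and the `build_graph` helper are hypothetical, and the structures that `get_nodes_edges` actually returns may differ.

```python
def build_graph(records):
    """Return (nodes, edges) from records holding an ID and a list of cited IDs.

    The "cites" key is a hypothetical field used for illustration only.
    """
    nodes = sorted({record["itemid"] for record in records})
    known = set(nodes)
    edges = [
        (record["itemid"], target)
        for record in records
        for target in record.get("cites", [])
        if target in known  # keep only edges whose target is also a node
    ]
    return nodes, edges


records = [
    {"itemid": "A", "cites": ["B", "C"]},
    {"itemid": "B", "cites": []},
]
nodes, edges = build_graph(records)
print(nodes)  # ['A', 'B']
print(edges)  # [('A', 'B')]
```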

## Advanced Usage

### Using Custom Search URLs

Pass a HUDOC search URL directly through the `link` parameter:

```python
url = "https://hudoc.echr.coe.int/eng#{%22itemid%22:[%22001-57574%22]}"
df = get_echr(link=url)
```
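
The URL above carries a percent-encoded JSON object after the `#`. The sketch below builds and decodes such a link with the standard library. `build_hudoc_link` and `parse_hudoc_link` are hypothetical helpers, and only the single-`itemid` form is taken from the example above; whether other filters or several IDs are accepted is an assumption to check against HUDOC itself.

```python
import json
from urllib.parse import quote, unquote

HUDOC_BASE = "https://hudoc.echr.coe.int/eng#"


def build_hudoc_link(item_ids):
    """Encode a list of item IDs as a HUDOC-style URL fragment."""
    fragment = json.dumps({"itemid": list(item_ids)}, separators=(",", ":"))
    # Leave JSON punctuation readable; double quotes become %22.
    return HUDOC_BASE + quote(fragment, safe="{}[]:,")


def parse_hudoc_link(url):
    """Decode the JSON filter stored after the '#' of a HUDOC-style URL."""
    _, _, fragment = url.partition("#")
    return json.loads(unquote(fragment))


url = build_hudoc_link(["001-57574"])
print(url)
print(parse_hudoc_link(url))  # {'itemid': ['001-57574']}
```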

### Using Query Payloads

For more robust searching, pass a simple `field:value` query through `query_payload`:

```python
payload = 'article:8'
df = get_echr(query_payload=payload)
```

### Date Range Filtering

```python
df = get_echr(
    start_date="2020-01-01",
    end_date="2023-12-31",
    language=['ENG', 'FRE']
)
```
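
Dates are passed as `yyyy-mm-dd` strings, so it can help to validate them before starting a long download. The sketch below is one way to do that. `date_range_kwargs` is a hypothetical helper, and how the library itself treats malformed or reversed dates is not documented here.

```python
from datetime import date


def date_range_kwargs(start, end):
    """Validate two yyyy-mm-dd strings and return them as get_echr keyword arguments."""
    start_d = date.fromisoformat(start)  # raises ValueError on malformed input
    end_d = date.fromisoformat(end)
    if start_d > end_d:
        raise ValueError(f"start_date {start} is after end_date {end}")
    return {"start_date": start_d.isoformat(), "end_date": end_d.isoformat()}


kwargs = date_range_kwargs("2020-01-01", "2023-12-31")
print(kwargs)  # {'start_date': '2020-01-01', 'end_date': '2023-12-31'}

# With the package installed and network access available:
# from echr_extractor import get_echr
# df = get_echr(language=['ENG', 'FRE'], **kwargs)
```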

### Specific Fields Only

```python
fields = ['itemid', 'doctypebranch', 'title', 'kpdate']
df = get_echr(count=100, fields=fields)
```

## Requirements

- Python 3.8+
- requests
- pandas
- beautifulsoup4
- dateparser
- tqdm

## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

## Contributors

- Benjamin Rodrigues de Miranda
- Chloe Crombach
- Piotr Lewandowski
- Pranav Bapat
- Shashank MC
- Gijs van Dijck

## Citation

If you use this library in your research, please cite:

```bibtex
@software{echr_extractor,
  title={ECHR Extractor: Python Library for European Court of Human Rights Data},
  author={LawTech Lab, Maastricht University},
  url={https://github.com/maastrichtlawtech/echr-extractor},
  year={2024}
}
```

## Support

For bug reports and feature requests, please open an issue on GitHub.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "echr-extractor",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "echr, extractor, european, convention, human, rights, court, case-law, legal, hudoc, data-extraction",
    "author": null,
    "author_email": "LawTech Lab <lawtech@maastrichtuniversity.nl>",
    "download_url": "https://files.pythonhosted.org/packages/1e/23/7a1e17836dc6b6a2a0b9c32978f28b4d2006aac9250951d70d1617305d8b/echr_extractor-1.0.46.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Python library for extracting case law data from the European Court of Human Rights (ECHR) HUDOC database",
    "version": "1.0.46",
    "project_urls": {
        "Bug Reports": "https://github.com/maastrichtlawtech/echr-extractor/issues",
        "Documentation": "https://github.com/maastrichtlawtech/echr-extractor",
        "Homepage": "https://github.com/maastrichtlawtech/echr-extractor",
        "Repository": "https://github.com/maastrichtlawtech/echr-extractor"
    },
    "split_keywords": [
        "echr",
        " extractor",
        " european",
        " convention",
        " human",
        " rights",
        " court",
        " case-law",
        " legal",
        " hudoc",
        " data-extraction"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "44c078961fc52bb26ac37297871d07e456e039777b797d4e3b30cef661c70419",
                "md5": "aaaed2a59692f747ec0cd0eb61471524",
                "sha256": "4e944e37575839937ad184634b5bdaafa3dbc5faa675fec9924b9d7c5726d38c"
            },
            "downloads": -1,
            "filename": "echr_extractor-1.0.46-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "aaaed2a59692f747ec0cd0eb61471524",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 22710,
            "upload_time": "2025-10-09T09:00:41",
            "upload_time_iso_8601": "2025-10-09T09:00:41.990285Z",
            "url": "https://files.pythonhosted.org/packages/44/c0/78961fc52bb26ac37297871d07e456e039777b797d4e3b30cef661c70419/echr_extractor-1.0.46-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1e237a1e17836dc6b6a2a0b9c32978f28b4d2006aac9250951d70d1617305d8b",
                "md5": "32e6f636e39228caae15f6302286cdc4",
                "sha256": "21a0123fdb59a52daa3dc2472e53aa3af2c2e86e338f66d55b0b1add0ed4d63f"
            },
            "downloads": -1,
            "filename": "echr_extractor-1.0.46.tar.gz",
            "has_sig": false,
            "md5_digest": "32e6f636e39228caae15f6302286cdc4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 34329,
            "upload_time": "2025-10-09T09:00:43",
            "upload_time_iso_8601": "2025-10-09T09:00:43.317957Z",
            "url": "https://files.pythonhosted.org/packages/1e/23/7a1e17836dc6b6a2a0b9c32978f28b4d2006aac9250951d70d1617305d8b/echr_extractor-1.0.46.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-09 09:00:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "maastrichtlawtech",
    "github_project": "echr-extractor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.26.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.9.3"
                ]
            ]
        },
        {
            "name": "dateparser",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.60.0"
                ]
            ]
        }
    ],
    "lcname": "echr-extractor"
}
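
The `requirements` entries in the raw data above pair a package name with `[operator, version]` specs. As a sketch, they can be turned into pip-style requirement strings as shown below. `requirement_strings` is a hypothetical helper, and the sample is a two-entry excerpt of the data above.

```python
import json


def requirement_strings(metadata):
    """Convert {"name", "specs"} entries into strings such as 'requests>=2.26.0'."""
    lines = []
    for req in metadata.get("requirements", []):
        # Each spec is an [operator, version] pair; join several with commas.
        spec = ",".join(op + version for op, version in req.get("specs", []))
        lines.append(req["name"] + spec)
    return lines


sample = json.loads(
    '{"requirements": ['
    '{"name": "requests", "specs": [[">=", "2.26.0"]]},'
    '{"name": "pandas", "specs": [[">=", "1.3.0"]]}'
    "]}"
)
print(requirement_strings(sample))  # ['requests>=2.26.0', 'pandas>=1.3.0']
```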
        