# ECHR Extractor
Python library for extracting case law data from the European Court of Human Rights (ECHR) HUDOC database.
## Features
- Extract metadata for ECHR cases from the HUDOC database
- Download full text content for cases
- Support for custom date ranges and case ID ranges
- Multiple language support
- Generate nodes and edges for network analysis
- Flexible output formats (CSV, JSON, in-memory DataFrames)
## Installation
```bash
pip install echr-extractor
```
## Quick Start
```python
from echr_extractor import get_echr, get_echr_extra, get_nodes_edges
# Get basic metadata for cases
df = get_echr(start_id=0, count=100, language=['ENG'])
# Get metadata + full text
df, full_texts = get_echr_extra(start_id=0, count=100, language=['ENG'])
# Generate network data
nodes, edges = get_nodes_edges(df=df)
```
## Functions
### `get_echr`
Gets all available metadata for ECHR cases from the HUDOC database.
**Parameters:**
- `start_id` (int, optional): The ID of the first case to download (default: 0)
- `end_id` (int, optional): The ID of the last case to download (default: maximum available)
- `count` (int, optional): Number of cases per language to download (default: None)
- `start_date` (str, optional): Start publication date (yyyy-mm-dd) (default: None)
- `end_date` (str, optional): End publication date (yyyy-mm-dd) (default: current date)
- `verbose` (bool, optional): Show progress information (default: False)
- `fields` (list, optional): Limit metadata fields to download (default: all fields)
- `save_file` (str, optional): Save as CSV file ('y') or return DataFrame ('n') (default: 'y')
- `language` (list, optional): Languages to download (default: ['ENG'])
- `link` (str, optional): Direct HUDOC search URL (default: None)
- `query_payload` (str, optional): Direct API query payload (default: None)
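
The `start_date` and `end_date` parameters expect `yyyy-mm-dd` strings. As a minimal sketch, a range could be checked before it is passed to `get_echr`. The helper `check_date_range` is hypothetical and not part of echr-extractor, and the library may parse or validate dates differently on its own.

```python
from datetime import datetime


def check_date_range(start_date, end_date):
    """Validate two yyyy-mm-dd strings and return them unchanged.

    Hypothetical helper for illustration; not part of echr-extractor.
    """
    fmt = "%Y-%m-%d"
    start = datetime.strptime(start_date, fmt)  # raises ValueError on a bad format
    end = datetime.strptime(end_date, fmt)
    if start > end:
        raise ValueError(f"start_date {start_date} is after end_date {end_date}")
    return start_date, end_date


start, end = check_date_range("2020-01-01", "2023-12-31")
# The checked strings could then be passed on, e.g.
# get_echr(start_date=start, end_date=end, save_file='n')
```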
### `get_echr_extra`
Gets metadata and downloads full text for each case.
**Parameters:** Same as `get_echr` plus:
- `threads` (int, optional): Number of threads for parallel download (default: 10)
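
The `threads` parameter controls how many full-text downloads run in parallel. The sketch below shows the general thread-pool pattern using only the standard library. It is not necessarily how echr-extractor implements it: `fetch_text` and `download_all` are hypothetical names, and `fetch_text` is a stand-in that performs no network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_text(item_id):
    """Stand-in for a per-case download; a real version would issue an HTTP request."""
    return f"full text of {item_id}"


def download_all(item_ids, threads=10):
    """Fetch every item with at most `threads` workers, keeping the input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch_text, item_ids))


# "placeholder-id" is a made-up identifier used only for illustration.
texts = download_all(["001-57574", "placeholder-id"], threads=2)
print(texts)
```

Downloads are mostly I/O-bound, so threads tend to help until the remote server or the network becomes the bottleneck.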
### `get_nodes_edges`
Generates nodes and edges for network analysis from case metadata.
**Parameters:**
- `metadata_path` (str, optional): Path to metadata CSV file (default: None)
- `df` (DataFrame, optional): Metadata DataFrame (default: None)
- `save_file` (str, optional): Save as files ('y') or return objects ('n') (default: 'y')
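
The exact schema of the returned nodes and edges is not described here. Assuming nodes stand for cases and edges for references between them, the toy sketch below shows one typical network-analysis step, counting how often each case is cited. The identifiers and the `(source, target)` pair layout are made up for illustration.

```python
from collections import Counter

# Toy data with made-up identifiers; the real column names and ID format
# produced by get_nodes_edges may differ.
nodes = ["case-A", "case-B", "case-C"]
edges = [("case-A", "case-B"), ("case-C", "case-B"), ("case-A", "case-C")]

# In-degree: how often each case appears as the target of an edge.
in_degree = Counter(target for _, target in edges)
# Include uncited cases so every node has an entry.
citations = {node: in_degree.get(node, 0) for node in nodes}

print(citations)  # {'case-A': 0, 'case-B': 2, 'case-C': 1}
```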
## Advanced Usage
### Using Custom Search URLs
Pass a HUDOC search URL directly via the `link` parameter:
```python
url = "https://hudoc.echr.coe.int/eng#{%22itemid%22:[%22001-57574%22]}"
df = get_echr(link=url)
```
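
The part of the example URL after `#` is percent-encoded JSON. The sketch below decodes it with the standard library to show which filters the URL carries, and encodes a filter dictionary back into the same form. `build_search_url` is a hypothetical helper, not part of echr-extractor, and only the `itemid` filter from the example above is used.

```python
import json
from urllib.parse import quote, unquote, urlsplit

url = "https://hudoc.echr.coe.int/eng#{%22itemid%22:[%22001-57574%22]}"

# Decode the fragment (everything after '#') into a Python dict.
filters = json.loads(unquote(urlsplit(url).fragment))
print(filters)  # {'itemid': ['001-57574']}


def build_search_url(filters, base="https://hudoc.echr.coe.int/eng"):
    """Hypothetical helper: encode a filter dict into a HUDOC-style search URL."""
    compact = json.dumps(filters, separators=(",", ":"))
    # Keep JSON punctuation readable; only the double quotes become %22.
    encoded = quote(compact, safe="{}[]:,")
    return base + "#" + encoded


print(build_search_url(filters))
```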
### Using Query Payloads
For more robust searching, pass a simple `field:value` query as `query_payload`:
```python
payload = 'article:8'
df = get_echr(query_payload=payload)
```
### Date Range Filtering
```python
df = get_echr(
    start_date="2020-01-01",
    end_date="2023-12-31",
    language=['ENG', 'FRE']
)
```
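
A long date range can also be split into smaller windows that are fetched one call at a time. The sketch below splits a range by calendar year. `yearly_windows` is a hypothetical helper, not part of echr-extractor, and whether chunking is needed at all depends on how the library handles large result sets.

```python
from datetime import date


def yearly_windows(start_date, end_date):
    """Split a yyyy-mm-dd range into per-calendar-year (start, end) string pairs.

    Hypothetical helper for illustration; not part of echr-extractor.
    """
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    windows = []
    for year in range(start.year, end.year + 1):
        window_start = max(start, date(year, 1, 1))
        window_end = min(end, date(year, 12, 31))
        windows.append((window_start.isoformat(), window_end.isoformat()))
    return windows


for window_start, window_end in yearly_windows("2020-01-01", "2023-12-31"):
    print(window_start, window_end)
    # Each window could be fetched separately, e.g.
    # get_echr(start_date=window_start, end_date=window_end, save_file='n')
```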
### Specific Fields Only
```python
fields = ['itemid', 'doctypebranch', 'title', 'kpdate']
df = get_echr(count=100, fields=fields)
```
## Requirements
- Python 3.8+
- requests >= 2.26.0
- pandas >= 1.3.0
- beautifulsoup4 >= 4.9.3
- dateparser >= 1.0.0
- tqdm >= 4.60.0
## License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Contributors
- Benjamin Rodrigues de Miranda
- Chloe Crombach
- Piotr Lewandowski
- Pranav Bapat
- Shashank MC
- Gijs van Dijck
## Citation
If you use this library in your research, please cite:
```bibtex
@software{echr_extractor,
title={ECHR Extractor: Python Library for European Court of Human Rights Data},
author={LawTech Lab, Maastricht University},
url={https://github.com/maastrichtlawtech/echr-extractor},
year={2024}
}
```
## Support
For bug reports and feature requests, please open an issue on GitHub: https://github.com/maastrichtlawtech/echr-extractor/issues