# LangExtract Anthropic Provider
A provider plugin for [LangExtract](https://github.com/google/langextract) that integrates Anthropic's Claude API for robust, structured information extraction.
## Features
- **Native Anthropic API**: Uses the official `anthropic` Python SDK for Claude models.
- **Safe parameter handling**: Only whitelisted API parameters are forwarded; unsupported parameters raise clear errors.
- **Concurrent batching**: Parallel inference for multi-prompt workloads (see the batching sketch under Quick Start).
- **Schema-aware**: Optional structured output mode (JSON) from LangExtract examples.
- **Modern packaging**: `pyproject.toml` with Hatch; works well with `uv`.
## Installation
### Using UV (Recommended)
```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install the package
uv add langextract-anthropic
```
### Using pip
```bash
pip install langextract-anthropic
```
### From Source
```bash
git clone https://github.com/Nobbettt/langextract-anthropic.git
cd langextract-anthropic
uv sync
```
## Quick Start
### 1. Set up Anthropic API credentials
```bash
export ANTHROPIC_API_KEY="your-api-key"
```
### 2. Use with LangExtract
```python
import langextract as lx
# Define extraction examples
examples = [
    lx.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.ExtractionData(
                extraction_class="Person",
                attributes={"name": "John Smith"},
            ),
            lx.ExtractionData(
                extraction_class="Organization",
                attributes={"name": "Microsoft"},
            ),
            lx.ExtractionData(
                extraction_class="Location",
                attributes={"name": "Seattle"},
            ),
        ],
    ),
]
# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)
print(result.extractions)
```
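### 3. Batch multiple documents (optional)

The same call can fan out over several inputs, with the provider running requests in parallel according to `max_workers`. This is a minimal sketch that assumes `lx.extract` accepts a list for `text_or_documents` and forwards `max_workers` to the provider; adjust it to your LangExtract version.

```python
# Reuses the `examples` list defined in step 2 above.
documents = [
    "Sarah Johnson is a data scientist at Google in Mountain View.",
    "Carlos Diaz leads the platform team at Spotify in Stockholm.",
]

results = lx.extract(
    text_or_documents=documents,
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    max_workers=4,  # number of parallel requests to the Anthropic API
)
```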
## Supported Models
This provider supports Anthropic Claude models, including:
- `claude-3-5-sonnet-latest` (recommended)
- `claude-3-5-sonnet-20241022`
- `claude-3-5-haiku-latest`
- `claude-3-opus-latest`
- `claude-3-sonnet-20240229`
- `claude-3-haiku-20240307`
### Model ID Format
Use the `anthropic-` prefix or specify the model name directly:
- `anthropic-claude-3-5-sonnet-latest` → Uses model: `claude-3-5-sonnet-latest`
- `anthropic-claude-3-opus-latest` → Uses model: `claude-3-opus-latest`
- `claude-3-5-sonnet-latest` → Uses model directly
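For illustration, the prefix resolution behaves roughly like the hypothetical helper below (not the provider's actual code):

```python
def resolve_model_id(model_id: str) -> str:
    """Strip an optional 'anthropic-' prefix to get the Claude model name."""
    prefix = "anthropic-"
    return model_id[len(prefix):] if model_id.startswith(prefix) else model_id


assert resolve_model_id("anthropic-claude-3-5-sonnet-latest") == "claude-3-5-sonnet-latest"
assert resolve_model_id("claude-3-5-sonnet-latest") == "claude-3-5-sonnet-latest"
```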
## Configuration Parameters
### Core Parameters
| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `model_id` | `str` | Model identifier | `"claude-3-5-sonnet-latest"` |
| `api_key` | `str` | Anthropic API key | `ANTHROPIC_API_KEY` env var |
| `temperature` | `float` | Controls randomness (0-1) | `None` |
| `max_workers` | `int` | Parallel request workers | `10` |
### Anthropic API Parameters
| Parameter | Type | Description | Range |
|-----------|------|-------------|--------|
| `max_tokens` | `int` | Maximum tokens to generate | 1-8192 |
| `temperature` | `float` | Sampling temperature | 0.0-1.0 |
| `top_p` | `float` | Nucleus sampling | 0.0-1.0 |
| `top_k` | `int` | Top-k sampling | 0-200 |
| `stop_sequences` | `list[str]` | Stop sequences | Max 4 items |
| `metadata` | `dict` | Request tracking metadata | - |
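These are the only keys forwarded to the Anthropic Messages API; anything else raises an error rather than being silently dropped. Conceptually, the filtering behaves like the sketch below (an illustrative approximation, not the provider's actual implementation):

```python
# Illustrative allow-list filtering; the real provider raises a
# LangExtract configuration error rather than ValueError.
ALLOWED_PARAMS = {"max_tokens", "temperature", "top_p", "top_k", "stop_sequences", "metadata"}


def filter_params(params: dict) -> dict:
    unsupported = set(params) - ALLOWED_PARAMS
    if unsupported:
        raise ValueError(f"Unsupported Anthropic API parameters: {sorted(unsupported)}")
    return dict(params)
```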
### Usage Examples
```python
# Basic extraction
result = lx.extract(
text_or_documents=text,
prompt_description=prompt,
examples=examples,
model_id="anthropic-claude-3-5-sonnet-latest",
)
# With custom parameters
result = lx.extract(
text_or_documents=text,
prompt_description=prompt,
examples=examples,
model_id="anthropic-claude-3-5-sonnet-latest",
temperature=0.3,
max_tokens=1000,
top_p=0.9,
stop_sequences=["END", "STOP"],
metadata={"user_id": "user123"},
)
```
## Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `ANTHROPIC_API_KEY` | Anthropic API key | Yes |
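If you prefer not to rely on the environment variable, the key can also be passed explicitly. The sketch below assumes `lx.extract` forwards an `api_key` keyword to the provider, as listed under Core Parameters:

```python
import langextract as lx

# `examples` as defined in the Quick Start section.
result = lx.extract(
    text_or_documents="Ada Lovelace collaborated with Charles Babbage in London.",
    prompt_description="Extract people and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    api_key="your-api-key",  # e.g. loaded from a secrets manager
)
```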
## Development
### Setup Development Environment
```bash
# Clone the repository
git clone https://github.com/Nobbettt/langextract-anthropic.git
cd langextract-anthropic
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync --dev
```
### Running Tests
```bash
# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"
# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration"
# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html
```
### Development Commands
```bash
# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests
# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic
# Build package
uv build
# Bump version
python scripts/bump_version.py patch # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor # 0.1.0 -> 0.2.0
python scripts/bump_version.py major # 0.1.0 -> 1.0.0
```
## Testing
This provider includes comprehensive testing:
- **Unit tests**: Mock-based testing of provider logic
- **Parameter tests**: Validation of API parameter filtering
- **Integration tests**: Real API testing (requires credentials)
```bash
# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"
# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v # requires API key
```
## Error Handling
The provider raises clear errors for common issues:
```python
try:
result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
# Configuration errors (missing API key, invalid params)
print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
# Runtime errors (API failures, network issues)
print(f"Runtime error: {e}")
print(f"Original error: {e.original}")
```
## Troubleshooting
### Common Issues
1. **Missing API Key**

   ```
   InferenceConfigError: Anthropic API key not provided
   ```

   Solution: Set the `ANTHROPIC_API_KEY` environment variable or pass the `api_key` parameter.

2. **Invalid Model Name**

   ```
   AnthropicAPIError: model not found
   ```

   Solution: Use a valid Claude model name (see Supported Models above).

3. **Rate Limiting**

   ```
   AnthropicAPIError: 429 Too Many Requests
   ```

   Solution: Reduce `max_workers` or add retry logic in your application (see the sketch after this list).

4. **Token Limit Exceeded**

   ```
   AnthropicAPIError: maximum context length exceeded
   ```

   Solution: Reduce the input text length, or lower `max_tokens` so the prompt and completion fit within the model's context window.
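For rate limiting, simple client-side backoff is often enough. The sketch below wraps `lx.extract` in a retry loop; it assumes rate-limit failures surface as `lx.exceptions.InferenceRuntimeError` (as in the Error Handling section) and is only a starting point; a library such as `tenacity` is a more robust choice for production.

```python
import time

import langextract as lx


def extract_with_retry(max_attempts: int = 5, base_delay: float = 2.0, **extract_kwargs):
    """Retry lx.extract with exponential backoff on runtime errors (e.g. 429s)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return lx.extract(**extract_kwargs)
        except lx.exceptions.InferenceRuntimeError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # 2s, 4s, 8s, ...
```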
## License
This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## Changelog
See [CHANGELOG.md](CHANGELOG.md) for a list of changes and version history.