langextract-anthropic

Name	langextract-anthropic
Version	0.2.1
home_page	None
Summary	LangExtract provider plugin for Anthropic Claude
upload_time	2025-08-21 08:41:33
maintainer	Norbert Laszlo
docs_url	None
author	Norbert Laszlo
requires_python	>=3.10
license	None
keywords	anthropic claude extraction langextract llm nlp plugin provider

# LangExtract Anthropic Provider

A provider plugin for [LangExtract](https://github.com/google/langextract) that integrates Anthropic's Claude API for robust, structured information extraction.

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## Features

- **Native Anthropic API**: Uses the official `anthropic` Python SDK for Claude models.
- **Safe parameter handling**: Whitelist filtering; unsupported params raise clear errors.
- **Concurrent batching**: Parallel inference for multi-prompt workloads.
- **Schema-aware**: Optional structured output mode (JSON) from LangExtract examples.
- **Modern packaging**: `pyproject.toml` with Hatch; works well with `uv`.
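
The concurrent batching feature can be illustrated with the standard library alone. This is a minimal sketch, not the plugin's actual implementation: `fake_infer` and `run_batch` are hypothetical names, and `fake_infer` stands in for a real Claude API call.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_infer(prompt: str) -> str:
    """Hypothetical stand-in for a single Claude API call."""
    return f"response to: {prompt}"


def run_batch(prompts: list[str], max_workers: int = 10) -> list[str]:
    """Run one inference per prompt in parallel, preserving input order."""
    if not prompts:
        return []
    # Never start more threads than there are prompts.
    workers = min(max_workers, len(prompts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order,
        # regardless of which call finishes first.
        return list(pool.map(fake_infer, prompts))


results = run_batch(["first prompt", "second prompt", "third prompt"], max_workers=2)
print(results)
```

Threads suit this workload because each call spends most of its time waiting on the network. Lowering `max_workers` trades throughput for a smaller chance of hitting rate limits.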

## Installation

### Using UV (Recommended)

```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-anthropic
```

### Using pip

```bash
pip install langextract-anthropic
```

### From Source

```bash
git clone <repository-url>
cd langextract-anthropic
uv sync
```

## Quick Start

### 1. Set up Anthropic API credentials

```bash
export ANTHROPIC_API_KEY="your-api-key"
```

### 2. Use with LangExtract

```python
import langextract as lx

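# Define extraction examples.
# LangExtract exposes its data classes under `lx.data`; each extraction
# needs the exact source span in `extraction_text`.
examples = [
    lx.data.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.data.Extraction(
                extraction_class="Person",
                extraction_text="John Smith",
                attributes={"name": "John Smith"},
            ),
            lx.data.Extraction(
                extraction_class="Organization",
                extraction_text="Microsoft",
                attributes={"name": "Microsoft"},
            ),
            lx.data.Extraction(
                extraction_class="Location",
                extraction_text="Seattle",
                attributes={"name": "Seattle"},
            ),
        ],
    ),
]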

# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)

print(result.extractions)
```

## Supported Models

This provider supports Anthropic Claude models, including:

- `claude-3-5-sonnet-latest` (recommended)
- `claude-3-5-sonnet-20241022`
- `claude-3-5-haiku-latest`
- `claude-3-opus-latest`
- `claude-3-sonnet-20240229`
- `claude-3-haiku-20240307`

### Model ID Format

Use the `anthropic-` prefix or specify the model name directly:

- `anthropic-claude-3-5-sonnet-latest` → Uses model: `claude-3-5-sonnet-latest`
- `anthropic-claude-3-opus-latest` → Uses model: `claude-3-opus-latest`
- `claude-3-5-sonnet-latest` → Uses model directly
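
The mapping above amounts to stripping an optional prefix. The following is a minimal sketch of that rule, assuming nothing else happens during resolution; `resolve_model_name` is a hypothetical helper and not part of the plugin's public API.

```python
ANTHROPIC_PREFIX = "anthropic-"


def resolve_model_name(model_id: str) -> str:
    """Strip the optional 'anthropic-' prefix from a LangExtract model ID."""
    # str.removeprefix returns the string unchanged when the prefix is absent.
    return model_id.removeprefix(ANTHROPIC_PREFIX)


for model_id in (
    "anthropic-claude-3-5-sonnet-latest",
    "anthropic-claude-3-opus-latest",
    "claude-3-5-sonnet-latest",
):
    print(f"{model_id} -> {resolve_model_name(model_id)}")
```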

## Configuration Parameters

### Core Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `model_id` | `str` | Model identifier | `"claude-3-5-sonnet-latest"` |
| `api_key` | `str` | Anthropic API key | `ANTHROPIC_API_KEY` env var |
| `temperature` | `float` | Controls randomness (0-1) | `None` |
| `max_workers` | `int` | Parallel request workers | `10` |

### Anthropic API Parameters

| Parameter | Type | Description | Range |
|-----------|------|-------------|--------|
| `max_tokens` | `int` | Maximum tokens to generate | 1-8192 |
| `temperature` | `float` | Sampling temperature | 0.0-1.0 |
| `top_p` | `float` | Nucleus sampling | 0.0-1.0 |
| `top_k` | `int` | Top-k sampling | 0-200 |
| `stop_sequences` | `list[str]` | Stop sequences | Max 4 items |
| `metadata` | `dict` | Request tracking metadata | - |
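
Whitelist filtering of these parameters might look like the sketch below. It is an illustration under stated assumptions, not the plugin's code: `ALLOWED_PARAMS` and `filter_params` are hypothetical names, the limits are copied from the table above, and the real provider raises its own exception types rather than `ValueError`.

```python
ALLOWED_PARAMS = {
    "max_tokens",
    "temperature",
    "top_p",
    "top_k",
    "stop_sequences",
    "metadata",
}


def filter_params(params: dict) -> dict:
    """Return a copy of params, rejecting unknown keys and out-of-range values."""
    unknown = sorted(set(params) - ALLOWED_PARAMS)
    if unknown:
        raise ValueError(f"Unsupported parameters: {', '.join(unknown)}")

    # temperature and top_p share the same 0.0-1.0 range.
    for name in ("temperature", "top_p"):
        value = params.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")

    stops = params.get("stop_sequences")
    if stops is not None and len(stops) > 4:
        raise ValueError("stop_sequences accepts at most 4 items")

    # Drop keys explicitly set to None so they are not sent to the API.
    return {key: value for key, value in params.items() if value is not None}


print(filter_params({"temperature": 0.3, "max_tokens": 1000, "top_p": None}))
```

Rejecting unknown keys up front surfaces typos such as `max_token` immediately instead of letting them fail inside the API call.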

### Usage Examples

```python
# Basic extraction
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
)

# With custom parameters
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.3,
    max_tokens=1000,
    top_p=0.9,
    stop_sequences=["END", "STOP"],
    metadata={"user_id": "user123"},
)
```

## Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `ANTHROPIC_API_KEY` | Anthropic API key | Yes |
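
The precedence between the `api_key` parameter and the environment variable can be sketched as follows. This assumes an explicit argument wins over `ANTHROPIC_API_KEY`; `resolve_api_key` is a hypothetical helper, and the plugin reports a missing key with `InferenceConfigError` rather than `ValueError`.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Prefer an explicit key, then fall back to the environment variable."""
    key = api_key or os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise ValueError("Anthropic API key not provided")
    return key


# An explicit argument is used as-is, whatever the environment contains.
print(resolve_api_key("explicit-key"))
```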

## Development

### Setup Development Environment

```bash
# Clone the repository
git clone <repository-url>
cd langextract-anthropic

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --dev
```

### Running Tests

```bash
# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"

# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration" 

# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html
```

### Development Commands

```bash
# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests

# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic

# Build package
uv build

# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0
```

## Testing

The test suite covers three areas:

- **Unit tests**: Mock-based testing of provider logic
- **Parameter tests**: Validation of API parameter filtering
- **Integration tests**: Real API testing (requires credentials)

```bash
# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"

# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v  # requires API key
```

## Error Handling

The provider raises descriptive errors for common issues:

```python
try:
    result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
    # Configuration errors (missing API key, invalid params)
    print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
    # Runtime errors (API failures, network issues)
    print(f"Runtime error: {e}")
    print(f"Original error: {e.original}")
```

## Troubleshooting

### Common Issues

1. **Missing API Key**
   ```
   InferenceConfigError: Anthropic API key not provided
   ```
   Solution: Set the `ANTHROPIC_API_KEY` environment variable or pass the `api_key` parameter.

2. **Invalid Model Name**
   ```
   AnthropicAPIError: model not found
   ```
   Solution: Use a valid Claude model name (see supported models above).

3. **Rate Limiting**
   ```
   AnthropicAPIError: 429 Too Many Requests
   ```
   Solution: Reduce `max_workers` or add retry logic in your application.

4. **Token Limit Exceeded**
   ```
   AnthropicAPIError: maximum context length exceeded
   ```
   Solution: Reduce the input text length. Increasing `max_tokens` helps only when the output is being truncated; it does not enlarge the model's context window.
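
The retry logic suggested for rate limiting can be sketched as exponential backoff with jitter. This is an illustration only: `RateLimitError` and `with_retries` are hypothetical names, and in real use you would catch whichever exception the provider raises for a 429 response.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a 429 error surfaced by the provider."""


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each attempt (base, 2*base, 4*base, ...) plus jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky() -> str:
    """Fail twice with a simulated 429, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


# A no-op sleep keeps the demonstration instant.
print(with_retries(flaky, sleep=lambda _seconds: None))
```

The jitter spreads out retries from parallel workers so they do not all hit the API again at the same moment.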

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for a list of changes and version history.

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "langextract-anthropic",
    "maintainer": "Norbert Laszlo",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "anthropic, claude, extraction, langextract, llm, nlp, plugin, provider",
    "author": "Norbert Laszlo",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/2f/17/7588ef960c595cf101a1ecf79bf87ccbc7697350326582d38e3ca85ce9c4/langextract_anthropic-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "LangExtract provider plugin for Anthropic Claude",
    "version": "0.2.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/Nobbettt/langextract-anthropic/issues",
        "Documentation": "https://github.com/Nobbettt/langextract-anthropic/blob/main/README.md",
        "Homepage": "https://github.com/Nobbettt/langextract-anthropic",
        "Repository": "https://github.com/Nobbettt/langextract-anthropic"
    },
    "split_keywords": [
        "anthropic",
        " claude",
        " extraction",
        " langextract",
        " llm",
        " nlp",
        " plugin",
        " provider"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "414a945fb9b089a8ed90a8f63a733901921ba829edb62bdef707a620c9be5b4c",
                "md5": "4d95e6719db972bc4783ebc2ac51d886",
                "sha256": "514f22d644029c757859c81fbaab5893b9b58c097257a8e93488df3e0f23f940"
            },
            "downloads": -1,
            "filename": "langextract_anthropic-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4d95e6719db972bc4783ebc2ac51d886",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 13707,
            "upload_time": "2025-08-21T08:41:31",
            "upload_time_iso_8601": "2025-08-21T08:41:31.928686Z",
            "url": "https://files.pythonhosted.org/packages/41/4a/945fb9b089a8ed90a8f63a733901921ba829edb62bdef707a620c9be5b4c/langextract_anthropic-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f177588ef960c595cf101a1ecf79bf87ccbc7697350326582d38e3ca85ce9c4",
                "md5": "523d3fa5490449963f60675ec3b465c3",
                "sha256": "73c76fe9920bcd855a0498a629b6b7758cc804966a316d017925b7a38527f33c"
            },
            "downloads": -1,
            "filename": "langextract_anthropic-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "523d3fa5490449963f60675ec3b465c3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 12952,
            "upload_time": "2025-08-21T08:41:33",
            "upload_time_iso_8601": "2025-08-21T08:41:33.039011Z",
            "url": "https://files.pythonhosted.org/packages/2f/17/7588ef960c595cf101a1ecf79bf87ccbc7697350326582d38e3ca85ce9c4/langextract_anthropic-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-21 08:41:33",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Nobbettt",
    "github_project": "langextract-anthropic",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "langextract-anthropic"
}
