- **Name:** artl-mcp
- **Version:** 0.31.0
- **Summary:** PydanticAI and MCP approaches for getting textual representations of scientific literature from PMIDs, DOIs, etc.
- **Upload time:** 2025-07-26 17:21:19
- **Requires Python:** >=3.11
- **License:** MIT
- **Keywords:** doi, mcp, pmid, pubmed, scientific-literature

# ARTL-MCP: All Roads to Literature

An MCP (Model Context Protocol) server and CLI toolkit for comprehensive scientific literature retrieval and analysis using PMIDs, DOIs, PMCIDs, and keyword searches.

## Quick Start

### MCP Server (Recommended)

Add this to your Claude Desktop MCP configuration:

```json
{
  "mcpServers": {
    "artl-mcp": {
      "command": "uvx",
      "args": ["artl-mcp"]
    }
  }
}
```

### Standalone CLI

```bash
# Install and use CLI commands
uvx artl-cli get-doi-metadata --doi "10.1038/nature12373"
uvx artl-cli search-papers-by-keyword --query "CRISPR gene editing" --max-results 5
```

## Core Features

### 🔍 **Literature Search & Discovery**
- Keyword-based paper search with advanced filtering
- Recent publication discovery
- PubMed search with multiple output formats

### 📄 **Metadata & Content Retrieval**
- DOI/PMID/PMCID metadata extraction
- Abstract retrieval from PubMed
- Full-text access via multiple sources (PMC, Unpaywall, BioC)
- PDF text extraction and processing

### 🔗 **Identifier Management**
- Universal identifier conversion (DOI ↔ PMID ↔ PMCID)
- Support for multiple input formats (URLs, CURIEs, raw IDs)
- Comprehensive identifier validation

### 📊 **Citation Networks**
- Reference analysis (papers cited BY a given paper)
- Citation analysis (papers that CITE a given paper)
- Multi-source citation data (CrossRef, OpenAlex, Semantic Scholar)
- Related paper discovery through citation networks

### 💾 **File Management**
- **Save path reporting** - tools tell you exactly where files were saved
- **Content size management** - large content (>100KB) automatically truncated for LLM responses
- **Memory-efficient streaming** for large files (PDFs, datasets)  
- **Backward compatible** - existing code continues to work
- **Cross-platform filename sanitization**
- **Multiple output formats** (JSON, TXT, CSV, PDF)
- **Configurable directories** and temp file management

## Available MCP Tools

When running as an MCP server, ARTL-MCP exposes 32 tools. The return format and the main tool categories are described below.

### 🔄 **Enhanced Return Values**

All file-saving tools now return **structured data** with save path information:

```json
{
  "data": { /* tool-specific content */ },
  "saved_to": "/path/to/saved/file.json"
}
```

**Text-based tools** include content size management:

```json
{
  "content": "Full or truncated content...",
  "saved_to": "/path/to/saved/file.txt",
  "truncated": false,
  "content_length": 45678
}
```
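
A client consuming these responses might fall back to the saved file whenever the inline content was truncated. The sketch below assumes only the field names shown above (`content`, `saved_to`, `truncated`). The helper name `full_text_from_response` is hypothetical and is not part of ARTL-MCP.

```python
from pathlib import Path
from typing import Any


def full_text_from_response(response: dict[str, Any]) -> str:
    """Return the complete text for a text-based tool response.

    Hypothetical helper: assumes the response layout shown above.
    """
    content = response.get("content") or ""
    saved_to = response.get("saved_to")
    # When the inline content was truncated, prefer the saved copy on disk.
    if response.get("truncated") and saved_to:
        path = Path(saved_to).expanduser()
        if path.is_file():
            return path.read_text(encoding="utf-8")
    # Not truncated, or no saved copy available: use the inline content.
    return content
```

If no file was saved, the helper returns whatever inline content is available, so callers should still check `truncated` when completeness matters.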

### Literature Search
- `search_papers_by_keyword` - Advanced keyword search with filtering
- `search_recent_papers` - Find recent publications  
- `search_pubmed_for_pmids` - PubMed search returning PMIDs

### Metadata & Abstracts
- `get_doi_metadata` - Comprehensive DOI metadata
- `get_abstract_from_pubmed_id` - PubMed abstracts
- `get_doi_fetcher_metadata` - Enhanced metadata (requires email)
- `get_unpaywall_info` - Open access availability

### Full Text Access
- `get_full_text_from_doi` - Multi-source full text (requires email)
- `extract_pdf_text` - PDF text extraction
- `get_pmcid_text` - PMC full text
- `get_full_text_from_bioc` - BioC format text

### Identifier Conversion
- `get_all_identifiers` - Get all IDs for any identifier
- `doi_to_pmid`, `pmid_to_doi` - Individual conversions
- `validate_identifier` - Format validation

### Citation Networks  
- `get_paper_references` - Papers cited by a given paper
- `get_paper_citations` - Papers citing a given paper
- `get_citation_network` - Comprehensive citation data
- `find_related_papers` - Citation-based recommendations

## CLI Commands

The `artl-cli` command provides access to all functionality:

```bash
# Metadata retrieval
artl-cli get-doi-metadata --doi "10.1038/nature12373"
artl-cli get-abstract-from-pubmed-id --pmid "23851394"

# Literature search
artl-cli search-papers-by-keyword --query "machine learning" --max-results 10
artl-cli search-recent-papers --query "COVID-19" --years-back 2

# Full text (requires email for some sources)
artl-cli get-full-text-from-doi --doi "10.1038/nature12373" --email "user@institution.edu"

# Identifier conversion
artl-cli doi-to-pmid --doi "10.1038/nature12373"
artl-cli get-all-identifiers --identifier "PMC3737249"

# Citation analysis  
artl-cli get-paper-citations --doi "10.1038/nature12373"
```
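
To call these commands from a script, one option is a thin `subprocess` wrapper. This is a minimal sketch: the function names `build_artl_command` and `run_artl` are hypothetical, the only flags it relies on are the ones shown above, and because the CLI's output format is not specified here, the wrapper returns raw standard output without parsing it.

```python
import shutil
import subprocess


def build_artl_command(subcommand: str, **options: object) -> list[str]:
    """Build an argv list for artl-cli.

    Keyword names become flags by replacing '_' with '-',
    so max_results=10 becomes ['--max-results', '10'].
    """
    argv = ["artl-cli", subcommand]
    for name, value in options.items():
        argv += [f"--{name.replace('_', '-')}", str(value)]
    return argv


def run_artl(subcommand: str, **options: object) -> str:
    """Run artl-cli and return its standard output as text."""
    if shutil.which("artl-cli") is None:
        raise FileNotFoundError("artl-cli is not on PATH")
    result = subprocess.run(
        build_artl_command(subcommand, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, `run_artl("doi-to-pmid", doi="10.1038/nature12373")` runs the same command as the identifier conversion example above.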

## Configuration

### Email Requirements
Several APIs require institutional email addresses:
```bash
export ARTL_EMAIL_ADDR="researcher@university.edu"
# or create a local/.env file containing: ARTL_EMAIL_ADDR=researcher@university.edu
```

**MCP Client Configuration:** MCP clients differ in how they pass configuration to a server. ARTL-MCP's configuration system supports several ways to supply the email address:

- **Claude Desktop**: Inherits system environment variables automatically
- **Goose Desktop**: Requires MCP extension configuration (see [USERS.md](USERS.md#mcp-client-configuration-issues))  
- **Other clients**: May support client-specific configuration injection

See [USERS.md](USERS.md#email-configuration-for-literature-access) for comprehensive configuration instructions.
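
For clients that do not inherit the shell environment, one possible approach is to pass the variable in the server entry itself. The sketch below generates such an entry. It assumes the client accepts an `env` mapping inside an `mcpServers` entry, which is not stated in this README, so check your client's documentation. The function name `mcp_server_entry` is hypothetical.

```python
import json


def mcp_server_entry(email: str) -> dict:
    """Build an mcpServers config that passes ARTL_EMAIL_ADDR explicitly.

    Assumption: the MCP client supports an "env" mapping per server entry.
    """
    return {
        "mcpServers": {
            "artl-mcp": {
                "command": "uvx",
                "args": ["artl-mcp"],
                "env": {"ARTL_EMAIL_ADDR": email},
            }
        }
    }


# Print the JSON to paste into the client's configuration file.
print(json.dumps(mcp_server_entry("researcher@university.edu"), indent=2))
```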

### File Output
Configure where files are saved:
```bash
export ARTL_OUTPUT_DIR="$HOME/Papers"       # Default: ~/Documents/artl-mcp (the shell does not expand ~ inside quotes)
export ARTL_TEMP_DIR="/tmp/my-artl-temp"    # Default: system temp + artl-mcp
export ARTL_KEEP_TEMP_FILES=true            # Default: false
```

## Supported Identifier Formats

**DOI**: `10.1038/nature12373`, `doi:10.1038/nature12373`, `https://doi.org/10.1038/nature12373`

**PMID**: `23851394`, `PMID:23851394`, `pmid:23851394`

**PMCID**: `PMC3737249`, `3737249`, `PMC:3737249`

All tools automatically detect and normalize identifier formats.
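
The kind of normalization described above can be sketched with a few regular expressions. This is an illustration covering only the formats listed in this section, not the package's own detection logic, and the function name `normalize_identifier` is hypothetical. Bare digits are ambiguous between PMID and PMCID, so the sketch lets the caller choose and defaults to PMID.

```python
import re

# Optional "https://doi.org/" or "doi:" prefix, followed by the DOI itself.
_DOI_RE = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)$", re.IGNORECASE
)
_PMID_RE = re.compile(r"^pmid:(\d+)$", re.IGNORECASE)
_PMCID_RE = re.compile(r"^pmc:?(\d+)$", re.IGNORECASE)


def normalize_identifier(raw: str, bare_digits_as: str = "pmid") -> tuple[str, str]:
    """Return (kind, canonical_value) for a DOI, PMID, or PMCID string."""
    text = raw.strip()
    if m := _DOI_RE.match(text):
        return "doi", m.group(1)
    if m := _PMID_RE.match(text):
        return "pmid", m.group(1)
    if m := _PMCID_RE.match(text):
        return "pmcid", f"PMC{m.group(1)}"
    if text.isascii() and text.isdigit():
        # Bare digits could be either a PMID or a PMCID; the caller decides.
        if bare_digits_as == "pmcid":
            return "pmcid", f"PMC{text}"
        return "pmid", text
    raise ValueError(f"Unrecognized identifier: {raw!r}")
```

For instance, `normalize_identifier("https://doi.org/10.1038/nature12373")` yields `("doi", "10.1038/nature12373")`, and `normalize_identifier("3737249", bare_digits_as="pmcid")` yields `("pmcid", "PMC3737249")`.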

## Development Setup

```bash
git clone https://github.com/contextualizer-ai/artl-mcp.git
cd artl-mcp
uv sync --group dev

# Run tests
make test                    # Fast development tests
make test-coverage          # Full test suite with coverage

# Code quality
make lint                   # Ruff linting
make format                 # Black formatting
make mypy                   # Type checking
```

## Documentation

- **[USERS.md](USERS.md)** - Comprehensive user guide with examples
- **[DEVELOPERS.md](DEVELOPERS.md)** - Development setup and architecture
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "artl-mcp",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "doi, mcp, pmid, pubmed, scientific-literature",
    "author": null,
    "author_email": "Mark Andrew Miller <MAM@lbl.gov>, Justin Reese <justaddcoffee@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/37/e9/f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98/artl_mcp-0.31.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "PydanticAI and MCP approaches for getting textual representations of scientific literature from PMIDs, DOIs, etc.",
    "version": "0.31.0",
    "project_urls": {
        "Documentation": "https://github.com/contextualizer-ai/artl-mcp#readme",
        "Homepage": "https://github.com/contextualizer-ai/artl-mcp",
        "Issues": "https://github.com/contextualizer-ai/artl-mcp/issues",
        "Repository": "https://github.com/contextualizer-ai/artl-mcp"
    },
    "split_keywords": [
        "doi",
        " mcp",
        " pmid",
        " pubmed",
        " scientific-literature"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "661c34cb4689f3dbd0faffdfeeff5d09fa97ceb1542937e4489d3295c8b47351",
                "md5": "9b61e0a7f318d4afaf109883bb2b5ad0",
                "sha256": "cb82e4caf298a8473d46a789c8eb9041ccc11dfc3a61b5ca9ed450f485d17575"
            },
            "downloads": -1,
            "filename": "artl_mcp-0.31.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9b61e0a7f318d4afaf109883bb2b5ad0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 51345,
            "upload_time": "2025-07-26T17:21:18",
            "upload_time_iso_8601": "2025-07-26T17:21:18.079101Z",
            "url": "https://files.pythonhosted.org/packages/66/1c/34cb4689f3dbd0faffdfeeff5d09fa97ceb1542937e4489d3295c8b47351/artl_mcp-0.31.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "37e9f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98",
                "md5": "07f329630a62df927d2dc31b262cda5f",
                "sha256": "66a0df8af0e4347afa7f1ac3ffd55aa32ca1514f4592bd74b040ef3d16700c0d"
            },
            "downloads": -1,
            "filename": "artl_mcp-0.31.0.tar.gz",
            "has_sig": false,
            "md5_digest": "07f329630a62df927d2dc31b262cda5f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 86993,
            "upload_time": "2025-07-26T17:21:19",
            "upload_time_iso_8601": "2025-07-26T17:21:19.606669Z",
            "url": "https://files.pythonhosted.org/packages/37/e9/f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98/artl_mcp-0.31.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-26 17:21:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "contextualizer-ai",
    "github_project": "artl-mcp#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "artl-mcp"
}
        