| Field | Value |
| --- | --- |
| Name | artl-mcp |
| Version | 0.31.0 |
| home_page | None |
| Summary | PydanticAI and MCP approaches for getting textual representations of scientific literature from PMIDs, DOIs, etc. |
| upload_time | 2025-07-26 17:21:19 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.11 |
| license | MIT |
| keywords | doi, mcp, pmid, pubmed, scientific-literature |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# ARTL-MCP: All Roads to Literature
An MCP (Model Context Protocol) server and CLI toolkit for comprehensive scientific literature retrieval and analysis using PMIDs, DOIs, PMCIDs, and keyword searches.
## Quick Start
### MCP Server (Recommended)
Add this to your Claude Desktop MCP configuration:
```json
{
"mcpServers": {
"artl-mcp": {
"command": "uvx",
"args": ["artl-mcp"]
}
}
}
```
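If you would rather script that change than edit the file by hand, the following is a minimal sketch. The function name `add_artl_server` is hypothetical, and the location of the client's configuration file is deliberately left as a parameter because it differs by platform and client.

```python
import json
from pathlib import Path


def add_artl_server(config_path: Path) -> dict:
    """Merge the artl-mcp entry into an MCP client config file and return the result."""
    # Start from the existing config if the file exists, otherwise from an empty object.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as in the JSON snippet above; other servers are left untouched.
    servers["artl-mcp"] = {"command": "uvx", "args": ["artl-mcp"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Merging into the existing `mcpServers` object, rather than overwriting the file, keeps any servers that are already configured.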
### Standalone CLI
```bash
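# Run CLI commands without a separate install step (uvx fetches the package on demand).
# artl-cli is an entry point of the artl-mcp package, so name the package with --from.
uvx --from artl-mcp artl-cli get-doi-metadata --doi "10.1038/nature12373"
uvx --from artl-mcp artl-cli search-papers-by-keyword --query "CRISPR gene editing" --max-results 5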
```
## Core Features
### 🔍 **Literature Search & Discovery**
- Keyword-based paper search with advanced filtering
- Recent publication discovery
- PubMed search with multiple output formats
### 📄 **Metadata & Content Retrieval**
- DOI/PMID/PMCID metadata extraction
- Abstract retrieval from PubMed
- Full-text access via multiple sources (PMC, Unpaywall, BioC)
- PDF text extraction and processing
### 🔗 **Identifier Management**
- Universal identifier conversion (DOI ↔ PMID ↔ PMCID)
- Support for multiple input formats (URLs, CURIEs, raw IDs)
- Comprehensive identifier validation
### 📊 **Citation Networks**
- Reference analysis (papers cited BY a given paper)
- Citation analysis (papers that CITE a given paper)
- Multi-source citation data (CrossRef, OpenAlex, Semantic Scholar)
- Related paper discovery through citation networks
### 💾 **File Management**
- **Save path reporting** - tools tell you exactly where files were saved
- **Content size management** - large content (>100KB) is automatically truncated in LLM responses
- **Memory-efficient streaming** for large files (PDFs, datasets)
- **Backward compatible** - existing code continues to work
- **Cross-platform filename sanitization**
- **Multiple output formats** (JSON, TXT, CSV, PDF)
- **Configurable directories** and temp file management
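Filename sanitization matters here because identifiers such as DOIs contain `/`, which cannot appear in a file name. The sketch below illustrates the general idea only; `sanitize_filename` and its rules are assumptions, not the package's actual implementation, and it does not handle Windows reserved names such as `CON`.

```python
import re

# Characters that are invalid in Windows file names; "/" is also invalid on POSIX.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name: str, max_length: int = 200) -> str:
    """Replace unsafe characters with "_" and drop trailing dots and spaces."""
    cleaned = _UNSAFE.sub("_", name).strip().rstrip(". ")
    # Fall back to a fixed name if nothing usable is left.
    return cleaned[:max_length] or "unnamed"
```

For example, `sanitize_filename("10.1038/nature12373.json")` gives `10.1038_nature12373.json`.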
## Available MCP Tools
When running as an MCP server, ARTL-MCP exposes 32 tools. The categories below list a selection of them, after a note on the shape of the return values:
### 🔄 **Enhanced Return Values**
All file-saving tools now return **structured data** with save path information:
```json
{
"data": { /* tool-specific content */ },
"saved_to": "/path/to/saved/file.json"
}
```
**Text-based tools** include content size management:
```json
{
"content": "Full or truncated content...",
"saved_to": "/path/to/saved/file.txt",
"truncated": false,
"content_length": 45678
}
```
### Literature Search
- `search_papers_by_keyword` - Advanced keyword search with filtering
- `search_recent_papers` - Find recent publications
- `search_pubmed_for_pmids` - PubMed search returning PMIDs
### Metadata & Abstracts
- `get_doi_metadata` - Comprehensive DOI metadata
- `get_abstract_from_pubmed_id` - PubMed abstracts
- `get_doi_fetcher_metadata` - Enhanced metadata (requires email)
- `get_unpaywall_info` - Open access availability
### Full Text Access
- `get_full_text_from_doi` - Multi-source full text (requires email)
- `extract_pdf_text` - PDF text extraction
- `get_pmcid_text` - PMC full text
- `get_full_text_from_bioc` - BioC format text
### Identifier Conversion
- `get_all_identifiers` - Get all IDs for any identifier
- `doi_to_pmid`, `pmid_to_doi` - Individual conversions
- `validate_identifier` - Format validation
### Citation Networks
- `get_paper_references` - Papers cited by a given paper
- `get_paper_citations` - Papers citing a given paper
- `get_citation_network` - Comprehensive citation data
- `find_related_papers` - Citation-based recommendations
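The difference between references and citations is only the direction of the edge. The toy sketch below shows that distinction with placeholder labels; it does not call any ARTL-MCP tool, and the edge list is invented purely for illustration.

```python
# Each edge is (citing_paper, cited_paper); the labels are placeholders, not real papers.
EDGES = [
    ("A", "B"),
    ("A", "C"),
    ("D", "A"),
]


def references(paper: str) -> list[str]:
    """Papers that `paper` cites (outgoing edges)."""
    return [cited for citing, cited in EDGES if citing == paper]


def citations(paper: str) -> list[str]:
    """Papers that cite `paper` (incoming edges)."""
    return [citing for citing, cited in EDGES if cited == paper]
```

Here `references("A")` corresponds to what `get_paper_references` describes, and `citations("A")` to what `get_paper_citations` describes.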
## CLI Commands
The `artl-cli` command provides access to all functionality:
```bash
# Metadata retrieval
artl-cli get-doi-metadata --doi "10.1038/nature12373"
artl-cli get-abstract-from-pubmed-id --pmid "23851394"
# Literature search
artl-cli search-papers-by-keyword --query "machine learning" --max-results 10
artl-cli search-recent-papers --query "COVID-19" --years-back 2
# Full text (requires email for some sources)
artl-cli get-full-text-from-doi --doi "10.1038/nature12373" --email "user@institution.edu"
# Identifier conversion
artl-cli doi-to-pmid --doi "10.1038/nature12373"
artl-cli get-all-identifiers --identifier "PMC3737249"
# Citation analysis
artl-cli get-paper-citations --doi "10.1038/nature12373"
```
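In the examples above, CLI command and option names are the kebab-case forms of the snake_case MCP tool names (`get_doi_metadata` becomes `get-doi-metadata`). Assuming that pattern holds for the other tools, which this README does not state outright, a script could build the argument list as sketched below. `cli_command` is a hypothetical helper.

```python
def cli_command(tool_name: str, **options: object) -> list[str]:
    """Build an artl-cli argument list from a snake_case tool name and keyword options."""
    argv = ["artl-cli", tool_name.replace("_", "-")]
    for key, value in options.items():
        # max_results=10 becomes ["--max-results", "10"]
        argv += [f"--{key.replace('_', '-')}", str(value)]
    return argv
```

The resulting list can be passed to `subprocess.run` once `artl-cli` is installed; passing a list rather than a shell string avoids quoting problems with multi-word queries.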
## Configuration
### Email Requirements
Several APIs require institutional email addresses:
```bash
export ARTL_EMAIL_ADDR="researcher@university.edu"
# or create local/.env file with: ARTL_EMAIL_ADDR=researcher@university.edu
```
**MCP Client Configuration:** MCP clients differ in how they pass configuration to a server, so the way you supply the email address depends on the client:
- **Claude Desktop**: Inherits system environment variables automatically
- **Goose Desktop**: Requires MCP extension configuration (see [USERS.md](USERS.md#mcp-client-configuration-issues))
- **Other clients**: May support client-specific configuration injection
See [USERS.md](USERS.md#email-configuration-for-literature-access) for comprehensive configuration instructions.
### File Output
Configure where files are saved:
```bash
export ARTL_OUTPUT_DIR="$HOME/Papers" # Default: ~/Documents/artl-mcp ("~" is not expanded inside double quotes, so use $HOME)
export ARTL_TEMP_DIR="/tmp/my-artl-temp" # Default: system temp + artl-mcp
export ARTL_KEEP_TEMP_FILES=true # Default: false
```
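The sketch below shows one way these three variables and their documented defaults could be read. `resolve_output_settings` is a hypothetical name, and the rule that only the string `true` enables `ARTL_KEEP_TEMP_FILES` is an assumption.

```python
import os
import tempfile
from pathlib import Path


def resolve_output_settings() -> dict:
    """Read the three ARTL_* file settings, falling back to the documented defaults."""
    output_dir = os.environ.get("ARTL_OUTPUT_DIR") or "~/Documents/artl-mcp"
    temp_dir = os.environ.get("ARTL_TEMP_DIR") or str(Path(tempfile.gettempdir()) / "artl-mcp")
    keep = os.environ.get("ARTL_KEEP_TEMP_FILES", "false")
    return {
        # expanduser() resolves a leading "~" to the home directory.
        "output_dir": Path(output_dir).expanduser(),
        "temp_dir": Path(temp_dir).expanduser(),
        # Assumption: only a case-insensitive "true" turns this on.
        "keep_temp_files": keep.strip().lower() == "true",
    }
```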
## Supported Identifier Formats
**DOI**: `10.1038/nature12373`, `doi:10.1038/nature12373`, `https://doi.org/10.1038/nature12373`
**PMID**: `23851394`, `PMID:23851394`, `pmid:23851394`
**PMCID**: `PMC3737249`, `3737249`, `PMC:3737249`
All tools automatically detect and normalize identifier formats.
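The following sketch shows what such detection and normalization might look like for the formats listed above. It is not the package's code: `normalize_identifier` is a hypothetical function, and because a bare number could be either a PMID or a PMCID, the sketch assumes bare digits are PMIDs.

```python
import re


def normalize_identifier(raw: str) -> tuple[str, str]:
    """Return (kind, canonical_id) for a DOI, PMID, or PMCID in any listed format."""
    text = raw.strip()
    # DOI: strip a resolver URL or "doi:" prefix, then expect "10.<registrant>/<suffix>".
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", text, flags=re.IGNORECASE)
    if re.fullmatch(r"10\.\d{4,9}/\S+", doi):
        return "doi", doi
    # PMCID: "PMC3737249" or "PMC:3737249".
    match = re.fullmatch(r"PMC:?(\d+)", text, flags=re.IGNORECASE)
    if match:
        return "pmcid", f"PMC{match.group(1)}"
    # PMID: "PMID:23851394", "pmid:23851394", or bare digits (assumed to be a PMID).
    match = re.fullmatch(r"(?:PMID:)?(\d+)", text, flags=re.IGNORECASE)
    if match:
        return "pmid", match.group(1)
    raise ValueError(f"Unrecognized identifier: {raw!r}")
```

When you know the identifier type, passing the prefixed form (`PMC3737249`, `PMID:23851394`) removes the ambiguity of bare numbers.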
## Development Setup
```bash
git clone https://github.com/contextualizer-ai/artl-mcp.git
cd artl-mcp
uv sync --group dev
# Run tests
make test # Fast development tests
make test-coverage # Full test suite with coverage
# Code quality
make lint # Ruff linting
make format # Black formatting
make mypy # Type checking
```
## Documentation
- **[USERS.md](USERS.md)** - Comprehensive user guide with examples
- **[DEVELOPERS.md](DEVELOPERS.md)** - Development setup and architecture
Raw data
{
"_id": null,
"home_page": null,
"name": "artl-mcp",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "doi, mcp, pmid, pubmed, scientific-literature",
"author": null,
"author_email": "Mark Andrew Miller <MAM@lbl.gov>, Justin Reese <justaddcoffee@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/37/e9/f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98/artl_mcp-0.31.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "PydanticAI and MCP approaches for getting textual representations of scientific literature from PMIDs, DOIs, etc.",
"version": "0.31.0",
"project_urls": {
"Documentation": "https://github.com/contextualizer-ai/artl-mcp#readme",
"Homepage": "https://github.com/contextualizer-ai/artl-mcp",
"Issues": "https://github.com/contextualizer-ai/artl-mcp/issues",
"Repository": "https://github.com/contextualizer-ai/artl-mcp"
},
"split_keywords": [
"doi",
" mcp",
" pmid",
" pubmed",
" scientific-literature"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "661c34cb4689f3dbd0faffdfeeff5d09fa97ceb1542937e4489d3295c8b47351",
"md5": "9b61e0a7f318d4afaf109883bb2b5ad0",
"sha256": "cb82e4caf298a8473d46a789c8eb9041ccc11dfc3a61b5ca9ed450f485d17575"
},
"downloads": -1,
"filename": "artl_mcp-0.31.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9b61e0a7f318d4afaf109883bb2b5ad0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 51345,
"upload_time": "2025-07-26T17:21:18",
"upload_time_iso_8601": "2025-07-26T17:21:18.079101Z",
"url": "https://files.pythonhosted.org/packages/66/1c/34cb4689f3dbd0faffdfeeff5d09fa97ceb1542937e4489d3295c8b47351/artl_mcp-0.31.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "37e9f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98",
"md5": "07f329630a62df927d2dc31b262cda5f",
"sha256": "66a0df8af0e4347afa7f1ac3ffd55aa32ca1514f4592bd74b040ef3d16700c0d"
},
"downloads": -1,
"filename": "artl_mcp-0.31.0.tar.gz",
"has_sig": false,
"md5_digest": "07f329630a62df927d2dc31b262cda5f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 86993,
"upload_time": "2025-07-26T17:21:19",
"upload_time_iso_8601": "2025-07-26T17:21:19.606669Z",
"url": "https://files.pythonhosted.org/packages/37/e9/f0f4b675748636097de265efe1bb452f56cf9f21bc621d1161a92e6e8a98/artl_mcp-0.31.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-26 17:21:19",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "contextualizer-ai",
"github_project": "artl-mcp#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "artl-mcp"
}