# MedLitAnno: Medical Literature Annotation System
MedLitAnno is a comprehensive medical literature analysis platform that combines automated annotation, PubMed search integration, and causal knowledge discovery. Extract structured information about bacteria-disease relationships from scientific texts, search and annotate PubMed literature automatically, and discover causal relationships through Mendelian Randomization (MR) analysis.
## 🌟 Features
### 🔍 PubMed Literature Search & Annotation
- **Direct PubMed Integration**: Search medical literature using keywords, diseases, bacteria, or recent publications
- **Automated Annotation Pipeline**: Seamlessly combine literature search with LLM-powered annotation
- **Multiple Search Strategies**: Basic search, disease-bacteria relationships, recent articles, keyword combinations
- **Excel Export**: Save search results with comprehensive metadata and citation information
- **Rate-Limited API Access**: Compliant with PubMed guidelines for responsible usage
### 📝 Advanced Medical Literature Annotation
- **Multi-model Support**: Use OpenAI, DeepSeek, DeepSeek Reasoner, or Qianwen models
- **Automatic Position Matching**: Intelligent text position calculation with 100% success rate
- **Smart Content Recognition**: LLM focuses on content identification while system handles positioning
- **Robust Processing**: Breakpoint resume and error retry mechanisms for network stability
- **Comprehensive Annotation**: Entity recognition, relation extraction, evidence detection
- **Batch Processing**: Process entire directories of Excel files with progress monitoring
- **Format Conversion**: Export to Label Studio compatible format
### MRAgent: Causal Knowledge Discovery
- **Automated Literature Analysis**: Scans scientific literature to discover potential exposure-outcome pairs
- **Causal Inference**: Performs Mendelian Randomization using GWAS data
- **Knowledge Discovery Mode**: Autonomously identifies potential causal factors for diseases
- **Causal Validation Mode**: Validates specific causal hypotheses
- **GWAS Integration**: Seamless integration with OpenGWAS database
## 🚀 Installation
### From PyPI (Recommended)
```bash
pip install medlitanno
```
### From Source
```bash
# Clone the repository
git clone https://github.com/chenxingqiang/medlitanno.git
cd medlitanno
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install the package
pip install -e .
```
## ⚙️ Configuration
### API Keys and Environment Setup
Set your API keys and configuration as environment variables:
```bash
# For LLM models
export DEEPSEEK_API_KEY="your-deepseek-api-key"
export QIANWEN_API_KEY="your-qianwen-api-key"
export OPENAI_API_KEY="your-openai-api-key" # Optional
# For PubMed search (required for literature search)
export PUBMED_EMAIL="your_email@example.com" # Required by PubMed API
export PUBMED_TOOL="medlitanno" # Tool identifier
# For MR analysis (optional)
export OPENGWAS_JWT="your-opengwas-jwt-token"
```
### Configuration File
You can also create a `.env` file in your project directory:
```bash
# Copy the example configuration
cp config/env.example .env
# Edit .env with your actual API keys and settings
```
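Before running a workflow, it can help to check which settings are still missing. The snippet below is a minimal sketch using only the standard library; the variable names come from the section above, while the grouping into features and the helper name `missing_settings` are assumptions made for illustration, not part of the package.

```python
import os

# Variable names are taken from the configuration section; which feature
# needs which variable is an assumption made for this sketch.
REQUIRED_BY_FEATURE = {
    "annotation": ["DEEPSEEK_API_KEY"],
    "pubmed": ["PUBMED_EMAIL"],
    "mr": ["OPENAI_API_KEY", "OPENGWAS_JWT"],
}


def missing_settings(feature, env=None):
    """Return the names of unset or empty variables for one feature."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_BY_FEATURE[feature] if not env.get(name)]
```

Calling `missing_settings("pubmed")` returns an empty list once `PUBMED_EMAIL` is exported, and `["PUBMED_EMAIL"]` otherwise.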
## 📊 Usage
### 🔍 PubMed Literature Search
#### Command Line Interface
```bash
# Search PubMed and automatically annotate results
medlitanno search "Helicobacter pylori gastric cancer" --max-results 50
# Search for disease-bacteria relationships
medlitanno search "diabetes microbiome" --disease "diabetes" --bacteria "gut bacteria"
# Search recent publications (last 30 days)
medlitanno search "COVID-19 microbiome" --recent-days 30
# Search and save results to Excel (without annotation)
medlitanno search "inflammatory bowel disease" --output-dir ./results --max-results 100
```
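To make the search options above more concrete, the sketch below shows one way a keyword, disease, and bacteria filter could be combined into a PubMed query and turned into NCBI E-utilities `esearch` parameters. The helper names `build_query` and `esearch_params` are hypothetical, and medlitanno may compose its queries differently; only the E-utilities parameter names (`db`, `term`, `retmax`, `retmode`, `reldate`, `datetype`, `tool`, `email`) are standard.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(keywords, disease=None, bacteria=None):
    """Combine free-text keywords with optional disease and bacteria terms."""
    parts = [f"({keywords})"]
    for term in (disease, bacteria):
        if term:
            # Restrict the extra term to the title and abstract fields.
            parts.append(f'"{term}"[Title/Abstract]')
    return " AND ".join(parts)


def esearch_params(query, max_results=20, recent_days=None, email=None, tool="medlitanno"):
    """Build the query-string parameters for an esearch request."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
        "tool": tool,
    }
    if email:
        params["email"] = email
    if recent_days is not None:
        # reldate limits results to the last N days of the chosen date type.
        params["reldate"] = recent_days
        params["datetype"] = "pdat"
    return params


query = build_query("diabetes microbiome", disease="diabetes", bacteria="gut bacteria")
url = ESEARCH_URL + "?" + urlencode(esearch_params(query, max_results=50, recent_days=30))
```

The resulting `url` can be fetched with any HTTP client; no request is sent by the snippet itself.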
#### Python API
```python
import os

from medlitanno.pubmed import PubMedSearcher, search_and_annotate

# Initialize PubMed searcher
searcher = PubMedSearcher(
    email=os.environ.get("PUBMED_EMAIL"),
    tool="medlitanno"
)

# Search for articles
results = searcher.search("Helicobacter pylori gastric cancer", max_results=50)
print(f"Found {len(results.articles)} articles")

# Search and automatically annotate
search_and_annotate(
    query="microbiome inflammatory disease",
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    model="deepseek-chat",
    max_results=20,
    output_dir="./results"
)
```
### 📝 Medical Literature Annotation
#### Command Line Interface
```bash
# Annotate medical literature
medlitanno annotate --data-dir datatrain --model deepseek-chat
# Use DeepSeek Reasoner for enhanced inference
medlitanno annotate --data-dir datatrain --model deepseek-reasoner --model-type deepseek
```
#### Python API
```python
import os

from medlitanno.annotation import MedicalAnnotationLLM

# Initialize the annotator with automatic position matching
annotator = MedicalAnnotationLLM(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    model="deepseek-chat",
    model_type="deepseek"
)

# Annotate text with automatic position calculation
text = "Helicobacter pylori infection is associated with gastric cancer."
result = annotator.annotate_text(text)

# Print results with position information
print(f"Entities: {result.entities}")
for entity in result.entities:
    print(f"  - {entity.text} ({entity.label}): pos {entity.start_pos}-{entity.end_pos}, confidence: {entity.confidence:.2f}")

print(f"Relations: {result.relations}")
print(f"Evidences: {result.evidences}")
for evidence in result.evidences:
    print(f"  - {evidence.text}: pos {evidence.start_pos}-{evidence.end_pos}, confidence: {evidence.confidence:.2f}")
```
### MRAgent: Causal Knowledge Discovery
#### Command Line Interface
```bash
# Knowledge Discovery mode
medlitanno mr --outcome "back pain" --model gpt-4o
# Causal Validation mode
medlitanno mr --exposure "osteoarthritis" --outcome "back pain" --mode causal
```
#### Python API
```python
import os

from medlitanno.mragent import MRAgent, MRAgentOE

# Knowledge Discovery mode
agent = MRAgent(
    outcome="back pain",
    AI_key=os.environ.get("OPENAI_API_KEY"),
    LLM_model="gpt-4o",
    gwas_token=os.environ.get("OPENGWAS_JWT")
)
agent.run()

# Causal Validation mode
agent_oe = MRAgentOE(
    exposure="osteoarthritis",
    outcome="back pain",
    AI_key=os.environ.get("OPENAI_API_KEY"),
    LLM_model="gpt-4o",
    gwas_token=os.environ.get("OPENGWAS_JWT")
)
agent_oe.run()
```
## 📄 Output Format
### PubMed Search Results
PubMed search provides:
1. **Article Metadata**: Title, abstract, authors, publication date, journal
2. **Citation Information**: PMID, DOI, publication details
3. **Search Statistics**: Total results, query details, search timestamp
4. **Excel Export**: Structured data export for further analysis
### Annotation System
The annotation system extracts structured information with automatic position matching:
1. **Entities**: Bacteria and Disease mentions with precise text positions
2. **Relations**: Connections between entities with relation types
3. **Evidences**: Text spans supporting the relations with confidence scores
4. **Position Statistics**: Success rates and confidence metrics for quality assessment
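Since annotations can be exported to a Label Studio compatible format, the sketch below illustrates how entities with character positions might map onto a Label Studio pre-annotation task. The `Entity` dataclass only mirrors the attributes shown in the Python API example (`text`, `label`, `start_pos`, `end_pos`, `confidence`); the function name and the `from_name`/`to_name` values are assumptions that must match your own labeling configuration, and the package's exporter may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class Entity:
    # Illustrative stand-in for the entity objects returned by the annotator.
    text: str
    label: str
    start_pos: int
    end_pos: int
    confidence: float


def to_label_studio_task(text, entities, from_name="label", to_name="text"):
    """Build one Label Studio task with span predictions (end is exclusive)."""
    results = []
    for i, ent in enumerate(entities):
        results.append({
            "id": f"ent{i}",
            "from_name": from_name,  # name of the <Labels> control tag (assumed)
            "to_name": to_name,      # name of the <Text> object tag (assumed)
            "type": "labels",
            "value": {
                "start": ent.start_pos,
                "end": ent.end_pos,
                "text": ent.text,
                "labels": [ent.label],
            },
        })
    return {"data": {"text": text}, "predictions": [{"result": results}]}


text = "Helicobacter pylori infection is associated with gastric cancer."
start = text.find("gastric cancer")
entities = [
    Entity("Helicobacter pylori", "Bacteria", 0, len("Helicobacter pylori"), 0.95),
    Entity("gastric cancer", "Disease", start, start + len("gastric cancer"), 0.92),
]
task = to_label_studio_task(text, entities)
print(json.dumps(task, indent=2))
```

The confidence values here are made up for the example and are not written to the task.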
#### Relation Types
- `contributes_to`: Bacteria contributes to disease development
- `ameliorates`: Bacteria improves or alleviates disease
- `correlated_with`: Bacteria and disease show correlation
- `biomarker_for`: Bacteria serves as a biomarker for disease
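When post-processing model output, the four relation types can be treated as a closed vocabulary. The following is a small sketch of that idea; the `RelationType` enum and `parse_relation_type` helper are hypothetical names for illustration and are not taken from the package.

```python
from enum import Enum


class RelationType(str, Enum):
    CONTRIBUTES_TO = "contributes_to"
    AMELIORATES = "ameliorates"
    CORRELATED_WITH = "correlated_with"
    BIOMARKER_FOR = "biomarker_for"


def parse_relation_type(raw):
    """Normalize a model-produced label and reject anything outside the schema."""
    cleaned = raw.strip().lower().replace(" ", "_").replace("-", "_")
    return RelationType(cleaned)  # raises ValueError for unknown labels
```

Rejecting unknown labels early keeps free-form model output such as `"causes"` from silently entering downstream tables.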
#### Position Matching Features
- **100% Success Rate**: Intelligent matching strategies ensure reliable position detection
- **Multiple Strategies**: Exact, case-insensitive, normalized, fuzzy, and partial matching
- **Confidence Scoring**: Average confidence >0.8 for quality assessment
- **Automatic Fallback**: Progressive matching strategies for robust results
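The progressive fallback described above can be sketched roughly as follows. This is an illustration of the general idea (exact, then case-insensitive, then whitespace-normalized, then fuzzy), not the package's implementation; the function name `locate`, the confidence values, and the threshold are assumptions, and partial matching is omitted for brevity.

```python
import re
from difflib import SequenceMatcher


def locate(text, span, fuzzy_threshold=0.8):
    """Return (start, end, confidence, strategy) or None; end is exclusive."""
    if not span.strip():
        return None

    # 1. Exact match.
    start = text.find(span)
    if start != -1:
        return start, start + len(span), 1.0, "exact"

    # 2. Case-insensitive match; regex keeps offsets in the original text.
    match = re.search(re.escape(span), text, re.IGNORECASE)
    if match:
        return match.start(), match.end(), 0.95, "case_insensitive"

    # 3. Normalized match: tolerate differing runs of whitespace.
    pattern = r"\s+".join(re.escape(token) for token in span.split())
    match = re.search(pattern, text, re.IGNORECASE)
    if match:
        return match.start(), match.end(), 0.9, "normalized"

    # 4. Fuzzy match: slide a window of the span's length over the text.
    width = len(span)
    best_ratio, best_start = 0.0, -1
    for i in range(max(1, len(text) - width + 1)):
        window = text[i:i + width].lower()
        ratio = SequenceMatcher(None, window, span.lower()).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, i
    if best_ratio >= fuzzy_threshold:
        return best_start, min(len(text), best_start + width), best_ratio, "fuzzy"

    return None
```

Because each strategy reports how it matched, low-confidence spans (for example, fuzzy matches) can be flagged for manual review.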
### MRAgent System
MRAgent provides:
1. **Literature Analysis**: Summary of relevant scientific papers
2. **Potential Exposures**: List of potential causal factors
3. **MR Results**: Statistical evidence for causal relationships
4. **Visualizations**: Forest plots and other visual representations
5. **Recommendations**: Insights for further research
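For readers unfamiliar with how MR produces its statistical evidence, the sketch below computes two standard summary-data estimators: the per-variant Wald ratio and the fixed-effect inverse-variance weighted (IVW) estimate. It is a textbook illustration, not MRAgent's code, and the effect sizes are made-up numbers chosen so that the true ratio is 0.5.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate and its first-order standard error."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(variants):
    """Fixed-effect IVW estimate over (beta_exposure, beta_outcome, se_outcome) tuples."""
    numerator = sum(bx * by / se_y ** 2 for bx, by, se_y in variants)
    denominator = sum(bx ** 2 / se_y ** 2 for bx, by, se_y in variants)
    estimate = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return estimate, se


# Illustrative (not real) SNP effects: each outcome effect is half the exposure effect.
variants = [(0.10, 0.05, 0.01), (0.20, 0.10, 0.02), (0.15, 0.075, 0.015)]
estimate, se = ivw_estimate(variants)
print(f"IVW estimate: {estimate:.3f} (SE {se:.3f})")
```

The IVW estimate is the weighted average of the Wald ratios, with weights proportional to the squared exposure effect divided by the outcome variance.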
## 🚀 Performance
### Literature Search
- **PubMed Integration**: Real-time search with rate limiting (3 requests/second)
- **Search Speed**: ~2-5 seconds per query (depends on result count)
- **Result Processing**: Handles thousands of articles efficiently
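A limit of 3 requests per second can be enforced by spacing calls at least one third of a second apart. The class below is a minimal sketch of that approach under the stated assumption of a single-threaded caller; `RateLimiter` is a hypothetical name and the package's own limiter may work differently.

```python
import time


class RateLimiter:
    """Allow at most `rate` calls per second by spacing calls evenly."""

    def __init__(self, rate=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `limiter.wait()` immediately before each HTTP request keeps the request rate at or below the configured value.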
### Annotation System
- **Processing Speed**: ~30-60 seconds per document (depends on model and text length)
- **Position Matching**: 100% success rate with <1 second processing per document
- **Batch Processing**: Optimized for large-scale literature analysis
- **Accuracy**: Comparable to manual annotation in controlled tests
### MR Analysis
- **Literature Processing**: Handles hundreds of articles and GWAS datasets efficiently
- **Causal Discovery**: Automated analysis of complex exposure-outcome relationships
## 💪 Stability & Reliability
### Network Resilience
- **Automatic Retry**: Smart retry mechanisms for network instability
- **Rate Limiting**: Compliant with API guidelines and rate limits
- **Connection Recovery**: Robust handling of network interruptions
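Retry behavior of this kind is commonly implemented with exponential backoff and jitter. The helper below is a generic sketch of that pattern; the function name, the default delays, and the choice of retryable exceptions are assumptions rather than the package's actual settings.

```python
import random
import time


def retry(func, attempts=4, base_delay=1.0,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(), retrying on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
            sleep(delay)
```

Errors outside `retry_on`, such as an invalid API key, are raised immediately instead of being retried.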
### Processing Reliability
- **Breakpoint Resume**: Automatically continues from the last processed file
- **Error Recovery**: Graceful handling of parsing and processing errors
- **Progress Monitoring**: Real-time tracking with detailed statistics
- **Data Validation**: Comprehensive validation of results and outputs
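Breakpoint resume is typically built on a small checkpoint file that records which inputs are finished. The sketch below shows the idea with the standard library only; the checkpoint layout (a JSON list of file names) and the function names are assumptions, not the format medlitanno uses.

```python
import json
from pathlib import Path


def load_done(checkpoint):
    """Return the set of file names already recorded as processed."""
    path = Path(checkpoint)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def process_directory(data_dir, checkpoint, handle, pattern="*.xlsx"):
    """Process files not yet in the checkpoint, saving progress after each one."""
    done = load_done(checkpoint)
    processed = []
    for file in sorted(Path(data_dir).glob(pattern)):
        if file.name in done:
            continue  # finished in an earlier run
        handle(file)  # may raise; earlier files stay recorded
        done.add(file.name)
        Path(checkpoint).write_text(json.dumps(sorted(done)), encoding="utf-8")
        processed.append(file.name)
    return processed
```

Writing the checkpoint after every file means an interrupted run loses at most the file that was in progress.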
## 📋 Project Structure
```
medlitanno/
├── src/                          # Source code
│   └── medlitanno/               # Main package
│       ├── annotation/           # Annotation system with position matching
│       ├── pubmed/               # PubMed search integration
│       ├── common/               # Shared utilities and base classes
│       └── mragent/              # MR analysis (optional, requires biopython)
├── docs/                         # Documentation
│   ├── PUBMED_SEARCH_GUIDE.md    # PubMed search usage guide
│   ├── README_annotation.md      # Annotation system documentation
│   └── ...
├── examples/                     # Example scripts and demos
│   ├── pubmed_search_demo.py     # PubMed search examples
│   ├── position_matching_demo.py # Position matching examples
│   └── ...
├── tests/                        # Unit tests
├── scripts/                      # Utility scripts
├── config/                       # Configuration files
│   ├── env.example               # Environment configuration template
│   └── requirements.txt          # Dependencies
├── CHANGELOG.md                  # Version history
└── ...
```
## 📚 Documentation
- **[PubMed Search Guide](docs/PUBMED_SEARCH_GUIDE.md)**: Complete guide for literature search functionality
- **[Annotation Documentation](docs/README_annotation.md)**: Detailed annotation system documentation
- **[Setup Guide](docs/SETUP.md)**: Installation and configuration instructions
- **[Examples](examples/)**: Working examples and demo scripts
## 🔄 Version History
See [CHANGELOG.md](CHANGELOG.md) for detailed version history and feature updates.
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
### Development Setup
```bash
# Clone the repository
git clone https://github.com/chenxingqiang/medlitanno.git
cd medlitanno
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest tests/
```
## 📜 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 📧 Contact
For questions or feedback, please contact [joy66777@gmail.com](mailto:joy66777@gmail.com).
## 🙏 Acknowledgments
- **[MRAgent](https://github.com/xuwei1997/MRAgent)**: Innovative automated agent for causal knowledge discovery via Mendelian Randomization
- **[PyMed](https://github.com/gijswobben/pymed)**: Python library for PubMed API access
- **[OpenGWAS](https://gwas.mrcieu.ac.uk/)**: GWAS summary data for causal inference
---
**Latest Version**: v1.1.1, with PubMed search integration and automatic position matching.