# WikiRaces
WikiRaces is an AI-powered tool for navigating Wikipedia using semantic similarity. Instead of randomly clicking links, it finds intelligent paths between Wikipedia articles by understanding their content semantically.
## Features
- **Semantic Navigation**: Uses sentence transformers to understand article content and find meaningful connections
- **Smart Path Finding**: Avoids dead ends and cycles while navigating toward the target
- **Real-time Progress**: Shows progress with confidence metrics and current article information
- **Robust Error Handling**: Gracefully handles missing pages, disambiguation pages, and network issues
- **Local AI Models**: No external API dependencies - everything runs locally
## Installation
```bash
pip install wikiraces
```
## Quick Start
```python
from wikiraces import WikiBot
# Create a bot to navigate from Python to Artificial Intelligence
bot = WikiBot("Python (programming language)", "Artificial intelligence")
# Run the navigation
success = bot.run()
if success:
    print(f"Found path in {len(bot.path) - 1} steps!")
    print(" -> ".join(bot.path))
else:
    print("Could not find a path")
```
## Advanced Usage
### Customize Search Parameters
```python
# Limit the number of candidate links to consider at each step
bot = WikiBot("Source Article", "Target Article", limit=20)
# Check if articles exist before starting
if bot.exists("Some Article"):
    print("Article exists!")
# Get links from any Wikipedia page
links = bot.links("Python (programming language)")
print(f"Found {len(links)} outgoing links")
```
### Semantic Similarity
```python
from wikiraces.embed import most_similar_with_scores
# Find most semantically similar articles
candidates = ["Machine Learning", "Data Science", "Web Development"]
similar = most_similar_with_scores("Artificial Intelligence", candidates)
for article, score in similar:
    print(f"{article}: {score:.3f}")
```
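For illustration, here is a minimal sketch of how this kind of scoring could be built on top of sentence-transformers. The model name and function body are assumptions for the example, not the package's actual implementation:

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical model choice; WikiRaces may use a different local model
_model = SentenceTransformer("all-MiniLM-L6-v2")

def most_similar_with_scores_sketch(target: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Rank candidate titles by cosine similarity of their embeddings to the target."""
    target_emb = _model.encode(target, convert_to_tensor=True)
    cand_embs = _model.encode(candidates, convert_to_tensor=True)
    scores = util.cos_sim(target_emb, cand_embs)[0]
    # Pair each candidate with its score, best match first
    return sorted(zip(candidates, scores.tolist()), key=lambda p: p[1], reverse=True)
```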
## How It Works
1. **Start** at the source Wikipedia article
2. **Extract** all outgoing links from the current article
3. **Filter** out dead ends and previously visited pages
4. **Rank** candidate links by semantic similarity to the target
5. **Rerank** using article summaries for better context understanding
6. **Move** to the most promising next article
7. **Repeat** until reaching the target or getting stuck
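The sketch below illustrates this greedy loop using only the public helpers documented in this README (`WikiBot.links`, `WikiBot.exists`, `most_similar_with_scores`). It is an illustration of the idea, not the package's actual `run()` implementation, and it assumes `most_similar_with_scores` returns `(title, score)` pairs sorted best-first:

```python
from wikiraces import WikiBot
from wikiraces.embed import most_similar_with_scores

def greedy_walk(source: str, destination: str, max_steps: int = 50) -> list[str]:
    """Follow the highest-scoring unvisited link until the target is reached or we get stuck."""
    bot = WikiBot(source, destination)
    path, visited = [source], {source}
    current = source
    for _ in range(max_steps):
        if current == destination:
            return path
        # Steps 2-3: extract outgoing links, drop visited and missing pages
        candidates = [p for p in bot.links(current) if p not in visited and bot.exists(p)]
        if not candidates:
            break  # dead end: nothing new to follow
        # Steps 4-6: rank candidates by similarity to the target and move to the best one
        current = most_similar_with_scores(destination, candidates)[0][0]
        visited.add(current)
        path.append(current)
    return path
```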
## API Reference
### WikiBot Class
```python
class WikiBot:
    def __init__(self, source: str, destination: str, limit: int = 15)
    def run(self) -> bool
    def exists(self, page: str) -> bool
    def links(self, page: str) -> list[str]
```
**Parameters:**
- `source`: Starting Wikipedia article title
- `destination`: Target Wikipedia article title
- `limit`: Maximum number of candidate links to consider (default: 15)
**Returns:**
- `run()`: True if path found, False otherwise
- `exists()`: True if Wikipedia page exists
- `links()`: List of outgoing links from the page
## Development
```bash
# Clone the repository
git clone https://github.com/markshteyn/wikiraces.git
cd wikiraces
# Install with Poetry
poetry install
# Run tests
poetry run pytest
# Run with verbose output
poetry run pytest -v -s
```
## Requirements
- Python 3.9+
- sentence-transformers
- wikipedia
- numpy
- tqdm
## License
MIT License - see LICENSE file for details.
## Contributing
Contributions welcome! Please feel free to submit a Pull Request.
## Acknowledgments
- Built with [sentence-transformers](https://www.sbert.net/) for semantic understanding
- Uses the [wikipedia](https://pypi.org/project/wikipedia/) library for API access
- Progress bars powered by [tqdm](https://tqdm.github.io/)