# Hunter
<p align="center">
<img src="assets/hunter-logo.png" alt="Hunter Logo" width="800"/>
</p>
![Build Status](https://img.shields.io/github/actions/workflow/status/joenandez/codename_hunter/hunter-cicd.yml?branch=main&style=for-the-badge)
![License](https://img.shields.io/badge/license-MIT-green?style=for-the-badge)
![Python Version](https://img.shields.io/badge/python-3.8%2B-blue?style=for-the-badge)
![Code Style](https://img.shields.io/badge/code%20style-flake8-black?style=for-the-badge)
![Last Commit](https://img.shields.io/github/last-commit/joenandez/codename_hunter/main?style=for-the-badge)
**Hunter** (package name: `codename_hunter`) converts web page content into clean, well-formatted Markdown. It was built primarily for passing web page content to AI code-editing tools, but it is useful for any web-to-Markdown conversion.
## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [Development](#development)
- [Testing](#testing)
- [Contributing](#contributing)
- [License](#license)
## Features
- 🔍 **Smart Content Extraction**: Extract structured content (headings, paragraphs, lists, code blocks, links, images) from a web page.
- 🤖 **AI-Powered Enhancement**: Optional integration with Together.ai to refine the Markdown formatting.
- 📋 **Clipboard Integration**: Copy the processed Markdown content to your clipboard.
- 💾 **File Saving**: Save extracted content to disk with automatic URL-based filenames and timestamps. This is helpful when working with AI code editors that support file tagging for context.
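To make the "URL-based filenames and timestamps" idea concrete, here is a minimal sketch of how such a name could be derived. The function name `url_to_filename`, the underscore-based slug, and the `YYYYMMDD_HHMMSS` timestamp format are all assumptions for illustration; Hunter's actual naming scheme may differ.

```python
import re
from datetime import datetime
from urllib.parse import urlparse


def url_to_filename(url: str, now: datetime) -> str:
    """Build a filesystem-safe Markdown filename from a URL and a timestamp.

    Hypothetical helper: not part of Hunter's documented API.
    """
    parsed = urlparse(url)
    # Join host and path, then collapse every run of characters that is not
    # a letter, digit, or hyphen into a single underscore.
    raw = f"{parsed.netloc}{parsed.path}"
    slug = re.sub(r"[^A-Za-z0-9-]+", "_", raw).strip("_") or "page"
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return f"{slug}_{stamp}.md"


print(url_to_filename("https://example.com/article", datetime(2025, 1, 7, 2, 58, 34)))
# example_com_article_20250107_025834.md
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the function, keeps the sketch deterministic and easy to test.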
## Installation
### Prerequisites
- Python 3.8+
- pip (Python package installer)
### Install from PyPI
```bash
pip install codename_hunter # Installs as 'hunter' command-line tool
```
### Install from Source
```bash
git clone https://github.com/joenandez/codename_hunter.git
cd codename_hunter
pip install -e .
```
### Package Name Note
The package is published on PyPI as `codename_hunter`, but the command it installs is `hunter`:
```bash
# Install the package
pip install codename_hunter
# Use the tool
hunter https://example.com/article
```
## Usage
Hunter provides a simple command-line interface to extract and enhance Markdown content from web pages.
### Basic Usage
```bash
# Extract and enhance content from a URL (copies to clipboard)
hunter https://example.com/article
# Save output to disk (defaults to "hunter_docs" folder)
hunter https://example.com/article -d
# Save to a custom folder
hunter https://example.com/article -d custom_folder
# Save to disk and force directory creation
hunter https://example.com/article -d custom_folder --force-dir
# Extract without AI enhancement
hunter https://example.com/article --no-enhance
# Extract without copying to clipboard
hunter https://example.com/article --no-copy
```
### Command Options
- `-d/--save-to-disk [folder]`: Save output to disk (defaults to "hunter_docs")
- `--force-dir`: Create output directory without prompting
- `--no-enhance`: Disable AI-powered content enhancement
- `--no-copy`: Disable automatic copying to clipboard
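For readers curious how an option like `-d` can take an optional folder argument, the following is a rough sketch using Python's standard `argparse` module. It mirrors the flags listed above but is not Hunter's actual parser; the `build_parser` name and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented flags (not Hunter's own code)."""
    parser = argparse.ArgumentParser(prog="hunter")
    parser.add_argument("url", help="Web page to convert to Markdown")
    parser.add_argument(
        "-d", "--save-to-disk",
        nargs="?",            # the folder argument is optional
        const="hunter_docs",  # value used when -d is given without a folder
        default=None,         # value used when -d is absent (do not save)
        metavar="folder",
        help="Save output to disk",
    )
    parser.add_argument("--force-dir", action="store_true",
                        help="Create output directory without prompting")
    parser.add_argument("--no-enhance", action="store_true",
                        help="Disable AI-powered content enhancement")
    parser.add_argument("--no-copy", action="store_true",
                        help="Disable automatic copying to clipboard")
    return parser


args = build_parser().parse_args(["https://example.com/article", "-d"])
print(args.save_to_disk)  # hunter_docs
```

In this sketch, placing `-d` directly before the URL would make `argparse` treat the URL as the folder name, which is one reason the examples above put the URL first.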
## Configuration
Hunter uses environment variables and an optional `.env` file for configuration.
### Together AI Configuration
To enable AI-powered enhancements, you need a Together.ai API key.
#### Method 1: Environment Variable (Recommended)
```bash
export TOGETHER_API_KEY='your_api_key_here' # On Windows: set TOGETHER_API_KEY=your_api_key_here
```
To unset the API key:
```bash
# Unix/macOS
unset TOGETHER_API_KEY
# Windows (cmd.exe): run the next line with nothing after the "="
set TOGETHER_API_KEY=
```
#### Method 2: .env File
Create a `.env` file in your working directory:
```env
TOGETHER_API_KEY=your_api_key_here
```
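As a rough illustration of how a `.env` file and a real environment variable can interact, the sketch below parses `KEY=value` lines and lets the environment win over the file. Both the minimal parser and the precedence rule are assumptions made for this example; Hunter's own loading logic is not shown here and may behave differently.

```python
from typing import Dict, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(file_values: Mapping[str, str],
                    environ: Mapping[str, str]) -> Optional[str]:
    """Prefer the environment variable, then fall back to the .env value."""
    return environ.get("TOGETHER_API_KEY") or file_values.get("TOGETHER_API_KEY")


file_values = parse_env_text("# local settings\nTOGETHER_API_KEY=your_api_key_here\n")
print(resolve_api_key(file_values, environ={}))  # your_api_key_here
```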
### Additional Settings
```env
# Model Selection
TOGETHER_MODEL=mistralai/Mistral-7B-Instruct-v0.2
# Token Limits
TOGETHER_MAX_TOKENS=4000
# Temperature Setting
TOGETHER_TEMPERATURE=0.1
# Output Format
OUTPUT_FORMAT=markdown
# Console Style (dark/light)
CONSOLE_STYLE=dark
```
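The settings above are plain strings in the environment, so a consumer has to convert the numeric ones. The sketch below shows one way to read them in Python using only the standard library. The `load_settings` name is hypothetical, and treating the values listed above as fallback defaults is an assumption rather than documented behavior.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read Hunter-style settings from a mapping (defaults to os.environ).

    Hypothetical helper; the fallback values are assumed, not documented.
    """
    env = os.environ if env is None else env
    return {
        "api_key": env.get("TOGETHER_API_KEY"),  # None when no key is set
        "model": env.get("TOGETHER_MODEL", "mistralai/Mistral-7B-Instruct-v0.2"),
        "max_tokens": int(env.get("TOGETHER_MAX_TOKENS", "4000")),
        "temperature": float(env.get("TOGETHER_TEMPERATURE", "0.1")),
        "output_format": env.get("OUTPUT_FORMAT", "markdown"),
        "console_style": env.get("CONSOLE_STYLE", "dark"),
    }


settings = load_settings({"TOGETHER_MAX_TOKENS": "2000"})
print(settings["max_tokens"], settings["temperature"])  # 2000 0.1
```

Accepting the mapping as a parameter makes it possible to exercise the function without modifying the real process environment.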
## Development
### Setup Development Environment
1. Clone the repository
```bash
git clone https://github.com/joenandez/codename_hunter.git
cd codename_hunter
```
2. Create a virtual environment
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install development dependencies
```bash
pip install -e ".[dev]"
```
### Project Structure
```
codename_hunter/
├── hunter/
│   ├── __init__.py
│   ├── __main__.py
│   ├── main.py
│   ├── constants.py
│   ├── formatters.py
│   ├── parsers.py
│   └── utils/
│       ├── ai.py
│       ├── errors.py
│       ├── fetcher.py
│       └── progress.py
├── tests/
│   ├── test_parsers.py
│   ├── test_formatters.py
│   └── test_utils.py
├── project_docs/ # Project documentation
├── hunter_docs/ # Generated documentation
├── assets/ # Project assets
├── .github/ # GitHub configuration
├── README.md
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
└── pyproject.toml
```
## Testing
Run the test suite:
```bash
pytest
```
## Contributing
This project is currently in a read-only state and is not accepting pull requests. However, we welcome:
- Bug reports and feature requests through GitHub Issues
- Questions and discussions in the Issues section
- Using and forking the project for your own needs
See [CONTRIBUTING.md](CONTRIBUTING.md) for more details about this policy and how to effectively report issues.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "codename-hunter",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "markdown, web-scraping, content-extraction, ai-enhancement",
"author": null,
"author_email": "Joe <joevfernandez@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/3c/f2/82e4bd6b3df9ac79831edc9f4ab8375f856e5f25d3c925adac574aebced3/codename_hunter-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A powerful tool for extracting and enhancing markdown content from web pages",
"version": "0.1.3",
"project_urls": {
"Bug Tracker": "https://github.com/joenandez/codename_hunter/issues",
"Documentation": "https://github.com/joenandez/codename_hunter#readme",
"Homepage": "https://github.com/joenandez/codename_hunter"
},
"split_keywords": [
"markdown",
" web-scraping",
" content-extraction",
" ai-enhancement"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4dc0526efd574d81bfbc3d9e1267f140b63967c23f4d3cf06b194597af6fa6f6",
"md5": "d352e47826f8fb07d7fb34fe6e553b92",
"sha256": "49233e81ef8f0a07535ff3389e8084f4ff97391a0572e1b0a4a9e2dfed10964d"
},
"downloads": -1,
"filename": "codename_hunter-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d352e47826f8fb07d7fb34fe6e553b92",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 27164,
"upload_time": "2025-01-07T02:58:34",
"upload_time_iso_8601": "2025-01-07T02:58:34.169465Z",
"url": "https://files.pythonhosted.org/packages/4d/c0/526efd574d81bfbc3d9e1267f140b63967c23f4d3cf06b194597af6fa6f6/codename_hunter-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3cf282e4bd6b3df9ac79831edc9f4ab8375f856e5f25d3c925adac574aebced3",
"md5": "27daa9ad63711cfac898d5e9b8f80912",
"sha256": "ed463b199c4e7f0cd3eca5bf6b2b97bcd39de12a7563c2470f5da9a165bca728"
},
"downloads": -1,
"filename": "codename_hunter-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "27daa9ad63711cfac898d5e9b8f80912",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 29800,
"upload_time": "2025-01-07T02:58:39",
"upload_time_iso_8601": "2025-01-07T02:58:39.494406Z",
"url": "https://files.pythonhosted.org/packages/3c/f2/82e4bd6b3df9ac79831edc9f4ab8375f856e5f25d3c925adac574aebced3/codename_hunter-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-07 02:58:39",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "joenandez",
"github_project": "codename_hunter",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "codename-hunter"
}