codename-hunter

- Name: codename-hunter
- Version: 0.1.3
- Summary: A powerful tool for extracting and enhancing markdown content from web pages
- Upload time: 2025-01-07 02:58:39
- Requires Python: >=3.8
- License: MIT
- Keywords: markdown, web-scraping, content-extraction, ai-enhancement

# Hunter

<p align="center">
  <img src="assets/hunter-logo.png" alt="Hunter Logo" width="800"/>
</p>

![Build Status](https://img.shields.io/github/actions/workflow/status/joenandez/codename_hunter/hunter-cicd.yml?branch=main&style=for-the-badge)
![License](https://img.shields.io/badge/license-MIT-green?style=for-the-badge)
![Python Version](https://img.shields.io/badge/python-3.8%2B-blue?style=for-the-badge)
![Code Style](https://img.shields.io/badge/code%20style-flake8-black?style=for-the-badge)
![Last Commit](https://img.shields.io/github/last-commit/joenandez/codename_hunter/main?style=for-the-badge)                                         

**Hunter** (package name: `codename_hunter`) converts web page content into clean, well-formatted Markdown. It was built primarily for passing web page content to AI code-editing tools, but it is useful for any web-to-Markdown conversion.

## Table of Contents

- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [Development](#development)
- [Testing](#testing)
- [Contributing](#contributing)
- [License](#license)

## Features

- 🔍 **Smart Content Extraction**: Extract structured content (headings, paragraphs, lists, code blocks, links, images) from a web page.
- 🤖 **AI-Powered Enhancement**: Optional integration with Together.ai to refine the Markdown formatting.
- 📋 **Clipboard Integration**: Copy the processed Markdown to your clipboard automatically.
- 💾 **File Saving**: Save extracted content to disk with automatic URL-based filenames and timestamps. This is helpful when working with AI code editors that support file tagging for context.
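
The README does not spell out how the URL-based, timestamped filenames are formed. As a rough illustration only, the sketch below shows one way such a name could be derived; the function name `url_to_filename`, the slug rules, and the timestamp format are all assumptions, not Hunter's actual scheme.

```python
import re
from datetime import datetime
from urllib.parse import urlparse


def url_to_filename(url: str, now: datetime) -> str:
    """Build a Markdown filename from a URL and a timestamp.

    Hypothetical scheme for illustration; Hunter's real naming may differ.
    """
    parsed = urlparse(url)
    # Join host and path, then collapse every run of non-alphanumeric
    # characters into a single hyphen.
    raw = f"{parsed.netloc}{parsed.path}"
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # Fall back to a generic name if the URL yields an empty slug.
    return f"{slug or 'page'}-{stamp}.md"


print(url_to_filename("https://example.com/article", datetime(2025, 1, 7, 2, 58, 39)))
# example-com-article-20250107-025839.md
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the function, keeps the result deterministic and easy to test.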

## Installation

### Prerequisites

- Python 3.8+
- pip (Python package installer)

### Install from PyPI

```bash
pip install codename_hunter  # Installs as 'hunter' command-line tool
```

### Install from Source

```bash
git clone https://github.com/joenandez/codename_hunter.git
cd codename_hunter
pip install -e .
```

### Package Name Note

While the package is named `codename_hunter` on PyPI, you'll use it simply as `hunter` in your terminal:

```bash
# Install the package
pip install codename_hunter

# Use the tool
hunter https://example.com/article
```

## Usage

Hunter provides a simple command-line interface to extract and enhance Markdown content from web pages.

### Basic Usage

```bash
# Extract and enhance content from a URL (copies to clipboard)
hunter https://example.com/article

# Save output to disk (defaults to "hunter_docs" folder)
hunter https://example.com/article -d

# Save to a custom folder
hunter https://example.com/article -d custom_folder

# Save to disk and force directory creation
hunter https://example.com/article -d custom_folder --force-dir

# Extract without AI enhancement
hunter https://example.com/article --no-enhance

# Extract without copying to clipboard
hunter https://example.com/article --no-copy
```

### Command Options

- `-d/--save-to-disk [folder]`: Save output to disk (defaults to "hunter_docs")
- `--force-dir`: Create output directory without prompting
- `--no-enhance`: Disable AI-powered content enhancement
- `--no-copy`: Disable automatic copying to clipboard
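
If you want to call Hunter from a Python script, the options above can be assembled into an argument list. This is a minimal sketch: the helper `build_hunter_command` and its parameter names are hypothetical, and only the flags documented above are used.

```python
from typing import List, Optional


def build_hunter_command(
    url: str,
    save_to_disk: bool = False,
    save_dir: Optional[str] = None,
    force_dir: bool = False,
    enhance: bool = True,
    copy: bool = True,
) -> List[str]:
    """Assemble a `hunter` command line from the documented options."""
    args = ["hunter", url]
    # `-d` alone uses the default folder; `-d <folder>` picks a custom one.
    if save_to_disk or save_dir is not None:
        args.append("-d")
        if save_dir is not None:
            args.append(save_dir)
    if force_dir:
        args.append("--force-dir")
    if not enhance:
        args.append("--no-enhance")
    if not copy:
        args.append("--no-copy")
    return args


print(build_hunter_command("https://example.com/article", save_dir="custom_folder", copy=False))
# ['hunter', 'https://example.com/article', '-d', 'custom_folder', '--no-copy']
```

With Hunter installed, the returned list can be passed to `subprocess.run(..., check=True)`. Building a list instead of a shell string avoids quoting problems with URLs that contain `&` or `?`.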

## Configuration

Hunter uses environment variables and an optional `.env` file for configuration.

### Together AI Configuration

To enable AI-powered enhancements, you need a Together.ai API key.

#### Method 1: Environment Variable (Recommended)

```bash
export TOGETHER_API_KEY='your_api_key_here'  # On Windows: set TOGETHER_API_KEY=your_api_key_here
```

To unset the API key:

```bash
unset TOGETHER_API_KEY  # On Unix/macOS
set TOGETHER_API_KEY=   # On Windows (Command Prompt)
```

#### Method 2: .env File

Create a `.env` file in your working directory:

```env
TOGETHER_API_KEY=your_api_key_here
```
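
For readers unfamiliar with the `.env` format, the sketch below shows roughly how such a file maps to key/value pairs. It is a simplified stand-in: `parse_env_text` is a hypothetical helper, the README does not say which loader Hunter uses, and real loaders handle more cases (such as `export` prefixes and inline comments).

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


print(parse_env_text("TOGETHER_API_KEY=your_api_key_here\n"))
# {'TOGETHER_API_KEY': 'your_api_key_here'}
```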

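### Additional Settings

The following optional variables can be set the same way, either in the environment or in the `.env` file: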

```env
# Model Selection
TOGETHER_MODEL=mistralai/Mistral-7B-Instruct-v0.2

# Token Limits
TOGETHER_MAX_TOKENS=4000

# Temperature Setting
TOGETHER_TEMPERATURE=0.1

# Output Format
OUTPUT_FORMAT=markdown

# Console Style (dark/light)
CONSOLE_STYLE=dark
```
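
The values of these settings are strings in the environment, so a consumer has to convert them to the right types. The sketch below illustrates that step. The `Settings` class and `load_settings` function are hypothetical, and treating the example values above as fallbacks is an assumption; the README does not state Hunter's actual defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    model: str
    max_tokens: int
    temperature: float
    output_format: str
    console_style: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, converting numbers from strings.

    Fallback values mirror the example above and are assumed, not confirmed.
    """
    return Settings(
        model=env.get("TOGETHER_MODEL", "mistralai/Mistral-7B-Instruct-v0.2"),
        max_tokens=int(env.get("TOGETHER_MAX_TOKENS", "4000")),
        temperature=float(env.get("TOGETHER_TEMPERATURE", "0.1")),
        output_format=env.get("OUTPUT_FORMAT", "markdown"),
        console_style=env.get("CONSOLE_STYLE", "dark"),
    )


settings = load_settings({"TOGETHER_MAX_TOKENS": "2000", "CONSOLE_STYLE": "light"})
print(settings.max_tokens, settings.temperature, settings.console_style)
# 2000 0.1 light
```

Accepting any mapping rather than reading `os.environ` directly makes the function usable with values parsed from a `.env` file and simple to test.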

## Development

### Setup Development Environment

1. Clone the repository
```bash
git clone https://github.com/joenandez/codename_hunter.git
cd codename_hunter
```

2. Create a virtual environment
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install development dependencies
```bash
pip install -e ".[dev]"
```

### Project Structure

```
codename_hunter/
├── hunter/
│   ├── __init__.py
│   ├── __main__.py
│   ├── main.py
│   ├── constants.py
│   ├── formatters.py
│   ├── parsers.py
│   └── utils/
│       ├── ai.py
│       ├── errors.py
│       ├── fetcher.py
│       └── progress.py
├── tests/
│   ├── test_parsers.py
│   ├── test_formatters.py
│   └── test_utils.py
├── project_docs/      # Project documentation
├── hunter_docs/       # Generated documentation
├── assets/            # Project assets
├── .github/           # GitHub configuration
├── README.md
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
└── pyproject.toml
```

## Testing

Run the test suite:

```bash
pytest
```

## Contributing

This project is currently in a read-only state and is not accepting pull requests. However, we welcome:

- Bug reports and feature requests through GitHub Issues
- Questions and discussions in the Issues section
- Using and forking the project for your own needs

See [CONTRIBUTING.md](CONTRIBUTING.md) for more details about this policy and how to effectively report issues.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

            
