quickfetch

Name: quickfetch
Version: 0.1.0
Summary: Fast, simple, and beautifully minimal web scraping
Home page: https://github.com/quickfetch/quickfetch
Author: quickfetch
Requires-Python: >=3.8
License: MIT
Keywords: web scraping, http, html, json, requests, beautifulsoup, lxml
Upload time: 2025-08-08 21:27:20
# 🚀 quickfetch - Quick Fetch

**Fast, simple, and beautifully minimal web scraping.**

[![PyPI version](https://badge.fury.io/py/quickfetch.svg)](https://badge.fury.io/py/quickfetch)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## ✨ The Problem

Python developers who need to grab HTML or JSON from the web must currently:

1. Import `requests` to fetch content
2. Import `BeautifulSoup` or `lxml` to parse HTML  
3. Write **6–10 lines** of boilerplate code just to get basic data

This is slow, repetitive, and messy for quick scripts.
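For comparison, here is roughly what that boilerplate looks like using only the standard library (the parsing step is shown against a literal HTML string so the sketch runs offline; a real script would first fetch the page with `requests.get(url).text` or `urllib.request.urlopen`):

```python
from html.parser import HTMLParser

# The kind of boilerplate quickfetch replaces: a whole parser class
# just to read the text of the first <h1> on a page.
class FirstH1(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.text is None:
            self.in_h1 = True

    def handle_data(self, data):
        if self.in_h1:
            self.text = data
            self.in_h1 = False

parser = FirstH1()
parser.feed("<html><body><h1>Example Domain</h1></body></html>")
print(parser.text)  # -> Example Domain
```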

## 🎯 The Solution

**quickfetch** makes web scraping a **1–2 line task** with auto-detection of HTML/JSON content.

## 🚀 Quick Start

### Installation

```bash
pip install quickfetch
```

### Basic Usage

```python
from quickfetch import get

# Fetch a webpage
page = get("https://example.com")
print(page.select("h1")[0].text)  # Get first h1 text
print(page.links)              # Get all links
print(page.images)             # Get all images

# Fetch JSON API
data = get("https://api.github.com/users/octocat")
print(data.json["name"])       # Access JSON data directly
```

## 🎨 Features

### Core Features

- **`get(url)`** → Returns a response object with:
  - `.html` → Raw HTML content (if HTML response)
  - `.json` → Parsed JSON data (if JSON response)
  - `.select(css_selector)` → CSS selector matching
  - `.xpath(xpath_expr)` → XPath expression matching
  - `.links` → All hyperlinks on the page
  - `.images` → All image URLs on the page

### Auto-Detection

quickfetch automatically detects if the response is HTML or JSON:

```python
# HTML page
page = get("https://example.com")
print(page.html)    # Raw HTML
print(page.json)    # None

# JSON API
data = get("https://api.github.com/users/octocat")
print(data.html)    # None
print(data.json)    # Parsed JSON dict
```
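One plausible way to implement this kind of detection (an illustrative sketch, not quickfetch's actual code) is to branch on the response's `Content-Type` header and fall back to attempting a JSON parse:

```python
import json

def classify(content_type, body):
    """Return (html, json_data), mirroring the behavior shown above:
    exactly one of the two values is non-None. Illustrative sketch only."""
    if "application/json" in content_type:
        return None, json.loads(body)
    if "text/html" in content_type:
        return body, None
    # Unknown content type: try JSON first, else treat the body as HTML.
    try:
        return None, json.loads(body)
    except ValueError:
        return body, None

html, data = classify("application/json", '{"name": "octocat"}')
print(data["name"])  # -> octocat
```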

### CSS Selectors

```python
page = get("https://example.com")

# Get first element
title = page.select("h1")[0].text

# Get all elements
links = page.select("a[href]")
for link in links:
    print(link.text, link["href"])

# Nested selection
articles = page.select("article")
for article in articles:
    title = article.select("h2")[0].text
    content = article.select("p")[0].text
```

### XPath Support

```python
page = get("https://example.com")

# XPath expressions
elements = page.xpath("//h1[@class='title']")
for elem in elements:
    print(elem.text)
```

### Link & Image Extraction

```python
page = get("https://example.com")

# Get all links
all_links = page.links
print(f"Found {len(all_links)} links")

# Get all images
all_images = page.images
print(f"Found {len(all_images)} images")
```

### Element Properties

```python
page = get("https://example.com")
element = page.select("a")[0]

print(element.text)     # Text content
print(element.html)     # HTML content
print(element.attrs)    # All attributes
print(element["href"])  # Specific attribute
```

## 📚 Examples

### Scraping News Headlines

```python
from quickfetch import get

# Get headlines from a news site
page = get("https://news.ycombinator.com")
headlines = page.select(".titleline > a")

for headline in headlines[:5]:
    print(f"• {headline.text}")
    print(f"  {headline['href']}\n")
```

### API Data Extraction

```python
from quickfetch import get

# Get GitHub user data
user = get("https://api.github.com/users/octocat")
data = user.json

print(f"Name: {data['name']}")
print(f"Location: {data['location']}")
print(f"Followers: {data['followers']}")
```

### Image Gallery Scraper

```python
from quickfetch import get

# Get all images from a gallery
page = get("https://example.com/gallery")
images = page.images

print("Found images:")
for img_url in images:
    print(f"  {img_url}")
```

### Form Data Extraction

```python
from quickfetch import get

# Get form fields
page = get("https://example.com/contact")
forms = page.select("form")

for form in forms:
    inputs = form.select("input")
    for input_elem in inputs:
        name = input_elem.get("name", "")
        type_attr = input_elem.get("type", "text")
        print(f"Input: {name} ({type_attr})")
```

## 🔧 Advanced Usage

### Custom Headers

```python
from quickfetch import get

# Add custom headers
page = get("https://api.example.com", 
           headers={"User-Agent": "MyBot/1.0"})
```

### Error Handling

```python
from quickfetch import get

try:
    page = get("https://example.com")
    if page.html:
        print("Successfully fetched HTML")
    elif page.json:
        print("Successfully fetched JSON")
    else:
        print("No content found")
except Exception as e:
    print(f"Error: {e}")
```

### Batch Processing

```python
from quickfetch import get

urls = [
    "https://example1.com",
    "https://example2.com", 
    "https://example3.com"
]

for url in urls:
    page = get(url)
    title = page.select("title")[0].text if page.select("title") else "No title"
    print(f"{url}: {title}")
```
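For a long URL list, the sequential loop above can be parallelized with the standard library's thread pool. Here a stub stands in for `quickfetch.get` so the sketch runs offline; a real version would call `get(url)` inside the worker:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for a quickfetch.get call plus title extraction;
# a real worker would do network I/O here.
def fetch_title(url):
    return url, f"Title of {url}"

urls = [
    "https://example1.com",
    "https://example2.com",
    "https://example3.com",
]

# Threads overlap the network waits, so total time approaches the
# slowest single request instead of the sum of all of them.
with ThreadPoolExecutor(max_workers=3) as pool:
    for url, title in pool.map(fetch_title, urls):
        print(f"{url}: {title}")
```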

## 📦 Installation

### From PyPI

```bash
pip install quickfetch
```

### From Source

```bash
git clone https://github.com/quickfetch/quickfetch.git
cd quickfetch
pip install -e .
```

## 🛠 Dependencies

- **requests** ≥ 2.25.0 - HTTP library
- **selectolax** ≥ 0.3.0 - Fast HTML parser

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- Built with [requests](https://requests.readthedocs.io/) for HTTP
- Powered by [selectolax](https://selectolax.readthedocs.io/) for HTML parsing
- Inspired by the need for simpler web scraping workflows

---

**Made with ❤️ for Python developers who just want to get things done quickly.**

            
