pywebtask

Name	pywebtask
Version	0.2.1
home_page	None
Summary	LLM-powered web automation library with autonomous agents and natural language selectors
upload_time	2025-10-24 00:05:08
maintainer	None
docs_url	None
author	Steve Wang
requires_python	>=3.10
license	MIT
keywords	automation web-automation browser-automation llm ai-agent playwright web-agent natural-language
requirements	No requirements were recorded.

# webtask

LLM-powered web automation library with autonomous agents and natural language selectors.

---

## What it does

Three ways to use it:

- **High-level** - Give it a task and let it figure out the steps
- **Step-by-step** - Execute a task one step at a time for debugging and control
- **Low-level** - Tell it exactly what to do with natural language selectors

It uses LLMs to understand pages, plan actions, and select elements, and Playwright to drive the browser.

---

## Quick look

**Setup:**
```python
from webtask import Webtask
from webtask.integrations.llm import GeminiLLM

# Create Webtask manager (browser launches lazily)
wt = Webtask()

# Choose your LLM (Gemini or OpenAI)
llm = GeminiLLM.create(model="gemini-2.5-flash")

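# Create agent
# (the `await` calls in these snippets must run inside an async function, e.g. one started with asyncio.run)
agent = await wt.create_agent(llm=llm)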
```

**High-level autonomous:**
```python
# Agent figures out the steps
result = await agent.execute("search for cats and click the first result")
print(f"Completed: {result.completed}")
```

**Step-by-step execution:**
```python
# Execute task one step at a time
agent.set_task("add 2 items to cart")

for i in range(10):
    step = await agent.run_step()

    print(f"Step {i+1}: {len(step.proposals)} actions")
    print(f"Verification: {step.verification.message}")

    if step.verification.complete:
        break

# Useful for debugging, progress tracking, or custom control flow
```
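
The loop above can be wrapped in a small reusable helper. The following is a minimal sketch, not part of webtask: `run_steps`, `step_timeout`, and `FakeAgent` are hypothetical names introduced here, and the only assumption about the library is the shape used in the snippet above (a step object with `verification.complete`). The fake agent stands in for a real one so the sketch runs without a browser or an LLM.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Awaitable, Callable


async def run_steps(
    step_fn: Callable[[], Awaitable[Any]],
    max_steps: int = 10,
    step_timeout: float = 60.0,
) -> tuple[bool, list[Any]]:
    """Call step_fn until a step reports completion or the budget runs out."""
    steps: list[Any] = []
    for _ in range(max_steps):
        # Bound each step so a stalled page or LLM call cannot hang the loop.
        step = await asyncio.wait_for(step_fn(), timeout=step_timeout)
        steps.append(step)
        if step.verification.complete:
            return True, steps
    return False, steps


class FakeAgent:
    """Hypothetical stand-in for an agent; it completes on its third step."""

    def __init__(self) -> None:
        self.calls = 0

    async def run_step(self) -> SimpleNamespace:
        self.calls += 1
        done = self.calls >= 3
        return SimpleNamespace(
            proposals=["action"],
            verification=SimpleNamespace(complete=done, message=f"step {self.calls}"),
        )


done, steps = asyncio.run(run_steps(FakeAgent().run_step, max_steps=10))
print(done, len(steps))  # True 3
```

With a real agent, the same helper would presumably be called as `await run_steps(agent.run_step, max_steps=10)` after `agent.set_task(...)`. Passing the step method in as a callable keeps the helper independent of the agent class.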

**Low-level imperative:**
```python
# You control the steps, agent handles the selectors
await agent.navigate("https://google.com")

search_box = await agent.select("search box")
await search_box.fill("cats")

button = await agent.select("search button")
await button.click()

# Wait for page to stabilize
await agent.wait_for_idle()

# Take screenshot
await agent.screenshot("result.png")
```

No CSS selectors. No XPath. Just describe what you want.

---

## How it works

**High-level mode** - The agent loop:
1. Proposer looks at the page and task, decides next action
2. Executer runs it (click, type, navigate, etc.)
3. Verifier checks whether the task is complete
4. Repeat until done
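
The four steps above can be sketched as a plain control loop. This is an illustration of the propose, execute, verify pattern only, not webtask's implementation: `agent_loop`, `LoopResult`, and the three `fake_*` coroutines are hypothetical names, and the stubs replace the LLM and the browser so the example is self-contained.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class LoopResult:
    completed: bool
    steps: int
    history: list[str] = field(default_factory=list)


async def agent_loop(
    propose: Callable[[str, list[str]], Awaitable[str]],
    execute: Callable[[str], Awaitable[None]],
    verify: Callable[[str, list[str]], Awaitable[bool]],
    task: str,
    max_steps: int = 10,
) -> LoopResult:
    """Run propose -> execute -> verify until verified or out of steps."""
    history: list[str] = []
    for step in range(1, max_steps + 1):
        action = await propose(task, history)  # 1. decide the next action
        await execute(action)                  # 2. perform it
        history.append(action)
        if await verify(task, history):        # 3. check for completion
            return LoopResult(True, step, history)
    # 4. the loop repeats until done; here the step budget ran out first
    return LoopResult(False, max_steps, history)


# Stub components standing in for the LLM and the browser (assumptions).
async def fake_propose(task: str, history: list[str]) -> str:
    plan = ["fill search box", "click search button", "click first result"]
    return plan[len(history)]


async def fake_execute(action: str) -> None:
    await asyncio.sleep(0)  # a real executor would drive the browser here


async def fake_verify(task: str, history: list[str]) -> bool:
    return len(history) == 3


result = asyncio.run(
    agent_loop(fake_propose, fake_execute, fake_verify, "search for cats")
)
print(result.completed, result.steps)  # True 3
```

The `max_steps` cap is a design choice of this sketch: an LLM-driven loop has no guaranteed termination, so a budget keeps a confused agent from running forever.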

**Step-by-step mode** - Same as high-level but you control the loop:
- `agent.set_task(description)` - Set the task
- `agent.run_step()` - Execute one step (propose → execute → verify)
- `agent.clear_history()` - Reset for new task

**Low-level mode** - You call methods directly:
- `agent.navigate(url)` - Go to a page
- `agent.select(description)` - Find an element by natural language description
- `element.click()`, `element.fill(text)`, `element.type(text)` - Interact with elements
- `agent.wait(seconds)` - Wait for a specific duration
- `agent.wait_for_idle()` - Wait for network/DOM to stabilize
- `agent.screenshot(path)` - Capture page screenshot

All modes use the same core: the LLM sees a cleaned DOM with element IDs like `button-0` instead of raw HTML. Clean input, clean output.
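
To make the idea of a cleaned DOM concrete, here is a rough sketch of how interactive elements could be given IDs such as `button-0`. It is not webtask's cleaning code: `DomCleaner`, `clean_dom`, the set of interactive tags, and the list of kept attributes are all assumptions chosen for illustration, built on the standard library's `html.parser`.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed choices for this sketch; the real library may keep different things.
INTERACTIVE = {"a", "button", "input", "select", "textarea"}
KEEP_ATTRS = {"type", "name", "placeholder", "aria-label", "href"}


class DomCleaner(HTMLParser):
    """Collect interactive elements and give each an ID like 'button-0'."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = Counter()
        self.elements = {}    # element ID -> {"attrs": {...}, "text": "..."}
        self._current = None  # ID of the open element that is collecting text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE:
            return
        element_id = f"{tag}-{self.counts[tag]}"
        self.counts[tag] += 1
        kept = {k: v for k, v in attrs if k in KEEP_ATTRS and v}
        self.elements[element_id] = {"attrs": kept, "text": ""}
        if tag != "input":  # <input> is a void element and has no text
            self._current = element_id

    def handle_endtag(self, tag):
        if self._current and self._current.startswith(tag + "-"):
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.elements[self._current]["text"] += data.strip()


def clean_dom(html: str) -> list[str]:
    """Return one compact line per interactive element."""
    parser = DomCleaner()
    parser.feed(html)
    parser.close()
    lines = []
    for element_id, info in parser.elements.items():
        attrs = " ".join(f'{k}="{v}"' for k, v in info["attrs"].items())
        lines.append(" ".join(p for p in (element_id, attrs, info["text"]) if p))
    return lines


page = """
<div class="hdr"><script>track()</script>
  <input type="text" name="q" placeholder="Search" class="x1 y2">
  <button class="btn btn-primary" onclick="go()">Search</button>
  <button>Feeling Lucky</button>
</div>
"""
for line in clean_dom(page):
    print(line)
# input-0 type="text" name="q" placeholder="Search"
# button-0 Search
# button-1 Feeling Lucky
```

The LLM then only has to answer with a short ID such as `button-0`, which can be mapped back to the real element, instead of producing a CSS selector or XPath.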

---

## Status

🚧 Work in progress

Core implementation complete. See [TODO](docs/todo.md) for testing plan and future work.

---

## Benchmarks

Evaluate webtask on standard web agent benchmarks:

**[webtask-benchmarks](https://github.com/steve-z-wang/webtask-benchmarks)** - Evaluation framework for Mind2Web and other benchmarks

---

## Install

```bash
pip install pywebtask
playwright install chromium
```
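
Note that the package is installed as `pywebtask` but imported as `webtask` (see the setup snippet). A quick sanity check after installing might look like the sketch below. `check_environment` is a hypothetical helper, not something the package ships, and it only confirms that the Python modules are importable; it does not verify that the Chromium browser binary was downloaded.

```python
import importlib.util
import sys


def check_environment(required=("webtask", "playwright")) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < (3, 10):
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(f"Python >=3.10 is required, found {found}")
    for module in required:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            problems.append(f"module {module!r} is not importable")
    return problems


for problem in check_environment():
    print(problem)
```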

---

## License

MIT

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pywebtask",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "automation, web-automation, browser-automation, llm, ai-agent, playwright, web-agent, natural-language",
    "author": "Steve Wang",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/a2/df/1e9c140b1fd2cbe86b2ffb7af19f7da1d1d9fb71feb53f201d9fd0f502de/pywebtask-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "LLM-powered web automation library with autonomous agents and natural language selectors",
    "version": "0.2.1",
    "project_urls": {
        "Documentation": "https://github.com/steve-z-wang/webtask#readme",
        "Homepage": "https://github.com/steve-z-wang/webtask",
        "Issues": "https://github.com/steve-z-wang/webtask/issues",
        "Repository": "https://github.com/steve-z-wang/webtask"
    },
    "split_keywords": [
        "automation",
        " web-automation",
        " browser-automation",
        " llm",
        " ai-agent",
        " playwright",
        " web-agent",
        " natural-language"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fc45306dc66a56de94137ac2d68ac658aae524ee1e35f6deb2a0d08316d3a3df",
                "md5": "e9a6fdb6dd5b8a1e8ccc04e28c6e4247",
                "sha256": "0e30ca667919a615db06819de723cd2de1e9168873f18ffb26d971fd9d18e1d6"
            },
            "downloads": -1,
            "filename": "pywebtask-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e9a6fdb6dd5b8a1e8ccc04e28c6e4247",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 65820,
            "upload_time": "2025-10-24T00:05:06",
            "upload_time_iso_8601": "2025-10-24T00:05:06.868575Z",
            "url": "https://files.pythonhosted.org/packages/fc/45/306dc66a56de94137ac2d68ac658aae524ee1e35f6deb2a0d08316d3a3df/pywebtask-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a2df1e9c140b1fd2cbe86b2ffb7af19f7da1d1d9fb71feb53f201d9fd0f502de",
                "md5": "9dbb5caf8b5750e80dba6d0d8c79288e",
                "sha256": "b52d52f113fc2127c27d27b0a3f9036a08101fa52bc8f465b3400f9d3b937fb2"
            },
            "downloads": -1,
            "filename": "pywebtask-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "9dbb5caf8b5750e80dba6d0d8c79288e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 42910,
            "upload_time": "2025-10-24T00:05:08",
            "upload_time_iso_8601": "2025-10-24T00:05:08.457285Z",
            "url": "https://files.pythonhosted.org/packages/a2/df/1e9c140b1fd2cbe86b2ffb7af19f7da1d1d9fb71feb53f201d9fd0f502de/pywebtask-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-24 00:05:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "steve-z-wang",
    "github_project": "webtask#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pywebtask"
}
