Screenwise Framework
====================

A Python framework for screen element detection and interaction using computer vision and machine learning.

Overview
--------

Screenwise provides automated detection and interaction with UI elements through:

* Screenshot capture and analysis
* ML-based element detection
* Coordinate-based interaction
* OCR capabilities
* Debug and capture modes
* Cross-platform support

Installation
------------

.. code-block:: bash

   pip install t-screenwise

Basic Usage
-----------

Initialize Framework
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from t_screenwise.screenwise import Framework

   # Initialize with default settings
   framework = Framework()

   # Initialize with custom settings
   framework = Framework(
       mode="CAPTURE",
       model_path="path/to/model.pth",
       labels="path/to/labels.json",
       device="cpu",
   )

Detect Elements
~~~~~~~~~~~~~~~

.. code-block:: python

   # Get all detected elements
   elements = framework.get()

   # Filter for specific element types
   buttons = framework.get(filter=["button"])
   text = framework.get(filter=["text"])

Interact with Elements
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Click the element
   element.click()

   # Click at a specific position within the element
   element.click(coords="up_right")

   # Type text
   element.send_keys("Hello World")

   # Click, then type
   element.click_and_send_keys("Hello World")

Process OCR Elements
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   framework = Framework()
   results = framework.get(image="path/to/image.png", process_ocr=True)

   # Work with both types of elements
   for element in results:
       if isinstance(element, OCRElement):
           print(f"OCR Text: {element.text} (Confidence: {element.confidence})")
       else:
           print(f"Box Label: {element.label}")

OCR Elements
~~~~~~~~~~~~

* Text content extraction
* Confidence scoring
* Spatial relationship analysis
* Text-based element search
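Since OCR elements expose ``text`` and ``confidence`` attributes, a simple text-based search can be built with a plain list comprehension. The ``OCRElement`` class below is a minimal stand-in for illustration only; in practice you would search the elements returned by ``framework.get(..., process_ocr=True)``.

```python
from dataclasses import dataclass


# Minimal stand-in for the framework's OCRElement, for illustration only;
# the real class is provided by the package.
@dataclass
class OCRElement:
    text: str
    confidence: float


elements = [
    OCRElement("Submit", 0.97),
    OCRElement("Cancel", 0.91),
    OCRElement("Subtitle", 0.80),
]

# Case-insensitive search for OCR elements containing a given string
matches = [e for e in elements if "submit" in e.text.lower()]
# matches -> [OCRElement(text='Submit', confidence=0.97)]
```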

OCR Spatial Analysis
~~~~~~~~~~~~~~~~~~~~

The ``OCRElement`` class provides spatial analysis capabilities through the ``get_nearest_boxes`` method:

.. code-block:: python

   # Get OCR elements from an image
   ocr_elements = framework.get(image="screenshot.png", process_ocr=True)

   # For a specific OCR element, find the nearest elements in all directions
   nearest = ocr_element.get_nearest_boxes(ocr_elements, n=1)

   # Access nearest elements by direction
   right_element = nearest["right"][0]  # Nearest element to the right
   left_element = nearest["left"][0]    # Nearest element to the left
   above_element = nearest["above"][0]  # Nearest element above
   below_element = nearest["below"][0]  # Nearest element below

Features:

* Finds the *n* nearest elements in each direction (right, left, above, below)
* Considers spatial overlap when determining nearest elements
* Returns elements sorted by distance
* Useful for understanding layout and relationships between text elements

Features
--------

Screen Elements
~~~~~~~~~~~~~~~

* Coordinate-based positioning
* Margin calculations
* Drawing capabilities

Interaction
~~~~~~~~~~~

* Mouse and keyboard interaction
* Debug visualization

Operating Modes
~~~~~~~~~~~~~~~

* CAPTURE: Live interaction with screen elements
* DEBUG: Visualization and testing without actual interaction

Configuration
-------------

Labels
~~~~~~

Labels are defined in a JSON file mapping element types to numeric IDs:

.. code-block:: json

   {
     "button": 1,
     "text": 2,
     "input": 3
   }
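To interpret model output, the numeric class IDs typically need to be mapped back to label names. A minimal sketch using only the standard library (the inline JSON string stands in for a labels file on disk):

```python
import json

# Inline stand-in for the contents of a labels file such as labels.json
labels_json = '{"button": 1, "text": 2, "input": 3}'

# Name -> ID mapping, as stored in the file
name_to_id = json.loads(labels_json)

# Reverse lookup: model class ID -> label name
id_to_name = {v: k for k, v in name_to_id.items()}
# id_to_name -> {1: 'button', 2: 'text', 3: 'input'}
```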

Model
~~~~~

Supports custom-trained object detection models:

* Default model trained for common UI elements
* Configurable confidence thresholds
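How the framework applies its confidence threshold internally is not shown here; the idea can be sketched as a plain filter over detection results. The ``Detection`` class and threshold value below are illustrative assumptions, not the package's API:

```python
from dataclasses import dataclass


# Illustrative detection result; the field names mirror the attributes
# (label, confidence) used elsewhere in this README.
@dataclass
class Detection:
    label: str
    confidence: float


detections = [
    Detection("button", 0.92),
    Detection("text", 0.41),
    Detection("input", 0.77),
]

# Keep only detections at or above the chosen confidence threshold
CONFIDENCE_THRESHOLD = 0.5
kept = [d for d in detections if d.confidence >= CONFIDENCE_THRESHOLD]
# kept -> detections labeled 'button' and 'input'
```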

Contributing
------------

1. Clone the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Open a Pull Request