# speech2caret
<p align="center">
<img src="https://github.com/asmith26/speech2caret/raw/refs/heads/main/assets/speech2caret_logo.svg" alt="speech2caret logo" width="250"/>
</p>
<p align="center">
Use your speech to write to the current caret position!
</p>
## Goals
- ✅ **Simple**: A minimalist tool that does one thing well.
- ✅ **Local**: Runs entirely on your machine (uses [Hugging Face models](https://huggingface.co/models) for speech recognition).
- ✅ **Efficient**: Optimised for low CPU and memory usage, thanks to an event-driven architecture that responds instantly to key presses without wasting resources.
**Note**: Tested only on Linux (Ubuntu). Other operating systems are currently unsupported.
**Demo (turn volume on):**
[demo video](https://github.com/user-attachments/assets/6de72da8-0aa2-40c4-802d-82772881c862)
## Installation
### 1. System Dependencies
First, install the required system libraries:
```bash
sudo apt update
sudo apt install libportaudio2 ffmpeg
```
### 2. Grant Permissions
To read keyboard events and simulate key presses, [`evdev`](https://python-evdev.readthedocs.io/en/latest/usage.html#listing-accessible-event-devices) needs access to your keyboard input device. Add your user to the `input` group to grant the necessary permissions:
```bash
sudo usermod -aG input $USER
newgrp input # or log out and back in
```
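Group changes only apply to new sessions, so if you skip `newgrp`, log out and back in. As a quick sanity check (standard shell tools, not part of speech2caret), you can confirm the membership is active in your current session:
```bash
# Verify the current session actually has the "input" group.
# usermod changes only take effect in new sessions / after newgrp.
id -nG | grep -w input && echo "OK: input group active"
```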
### 3. Install and Run
You can install and run `speech2caret` using `pip` or `uv`:
```bash
# Install the package
uv add speech2caret # or pip install speech2caret
# Run the application
speech2caret
```
Alternatively, you can run it directly without installation using `uvx` (the `--index pytorch-cpu=...` flag ensures that only CPU builds of PyTorch are downloaded, avoiding heavy GPU-related dependencies):
```bash
uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu --from speech2caret speech2caret
```
## Configuration
The first time you run `speech2caret`, it creates a config file at `~/.config/speech2caret/config.ini`.
You’ll need to manually edit it with the following values:
#### `keyboard_device_path`
This is the path to your keyboard input device. You can find it either by following [this guide](https://python-evdev.readthedocs.io/en/latest/usage.html#listing-accessible-event-devices), or by running the command below and looking for an entry that ends with `-event-kbd`.
```bash
ls /dev/input/by-path/
```
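If you prefer, the same information can be read through `evdev` itself. The sketch below (assuming your user already has access to the input devices, per step 2) lists each accessible device path together with its reported name, which usually makes the keyboard easy to spot:
```bash
uvx --from evdev python -c '
from evdev import InputDevice, list_devices

# Print every readable input device path alongside its human-readable name.
for path in list_devices():
    print(path, "->", InputDevice(path).name)
'
```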
#### `start_stop_key` and `resume_pause_key`
These are the keys you'll use to control the app.
To find the correct name for a key, you can use the short Python snippet below. Paste your `keyboard_device_path` from the previous step into the script, then run:
```bash
uvx --from evdev python -c '
keyboard_device_path = "PASTE_YOUR_KEYBOARD_DEVICE_PATH_HERE"

from evdev import InputDevice, categorize, ecodes, KeyEvent

# Open the keyboard device and print the name of every key as it is pressed.
dev = InputDevice(keyboard_device_path)
print(f"Listening for key presses on {dev.name}...")
for event in dev.read_loop():
    if event.type == ecodes.EV_KEY:
        key_event = categorize(event)
        if key_event.keystate == KeyEvent.key_down:
            print(f"  {key_event.keycode}")
'
```
Press the keys you wish to use, and their names will be printed to the terminal. For a full list of available key names, see [here](https://github.com/torvalds/linux/blob/a79a588fc1761dc12a3064fc2f648ae66cea3c5a/include/uapi/linux/input-event-codes.h#L65).
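Once both values are filled in, the config file might look roughly like the illustrative sketch below (the section header and any defaults come from the file speech2caret generates; the device path and key names shown here are placeholders, so substitute your own):
```ini
# Illustrative values only: keep the section header from the generated file,
# and replace the path and key names with the ones you found above.
[speech2caret]
keyboard_device_path = /dev/input/by-path/platform-i8042-serio-0-event-kbd
start_stop_key = KEY_F9
resume_pause_key = KEY_F10
```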
## How to Use
1. Run the `speech2caret` command in your terminal.
2. Press your configured `start_stop_key` to begin recording.
3. Press the `resume_pause_key` to toggle between pausing and resuming.
4. When you are finished, press the `start_stop_key` again.
5. The recorded audio will be transcribed and typed at your current caret position.