# text_to_action

- **Name:** text_to_action
- **Version:** 2.0.2
- **Summary:** A system that translates natural language queries into programmatic actions
- **Home page:** https://github.com/sri0606/text_to_action
- **Author:** Sriram Seelamneni
- **License:** MIT
- **Requires Python:** <4.0,>=3.8
- **Uploaded:** 2024-10-01 02:41:50
- **Keywords:** text_to_action, function calling, natural language processing, automation
- **Requirements:** deepdiff, groq, h5py, matplotlib, numpy, openai, pydantic, python-dotenv, python_dateutil, scipy, sentence-transformers, spacy, torch, transformers, litellm
# Text-to-Action

## Overview

Text-to-Action is a system that translates natural language commands into programmatic actions. It interprets user input, determines the most appropriate action to execute, extracts the relevant parameters, and performs the corresponding actions.

You can use this to automate tasks, either within your application or for external use, by letting users give natural language commands. For example, if you're building an image editing app, you can use TextToAction to understand what the user wants (like resizing or cropping an image) and even perform the action automatically.

### How to use

```bash
pip install text-to-action
# or clone the repository
git clone https://github.com/sri0606/text_to_action.git
```

Below is a simple example of how to use TextToAction to handle user input and automatically perform actions like simple calculator operations:

```python
import os
from src.text_to_action import TextToAction, LLMClient
from dotenv import load_dotenv
load_dotenv()

llm_client = LLMClient(model="groq/llama3-70b-8192")

# Get the path to the actions folder
current_directory = os.path.dirname(os.path.abspath(__file__))
calculator_actions_folder = os.path.join(current_directory, "src", "text_to_action", "example_actions", "calculator")

# Initialize the TextToAction dispatcher with the actions folder and LLM client
dispatcher = TextToAction(
    actions_folder=calculator_actions_folder,
    llm_client=llm_client,
    verbose_output=True,
    application_context="Calculator",
    filter_input=True
)

user_input = input("Enter your query: ")  # e.g. "multiply 3,4,5" or "add 3,4 and multiply 3,4"
results = dispatcher.run(user_input)
# Example output:
# {'message': 'Detected multiple actions.',
#  'results': [
#     {'action': 'add', 'args': {'values': [3, 4]}, 'output': 7},
#     {'action': 'multiply', 'args': {'values': [3, 4]}, 'output': 12}
#  ]
# }
```

Apart from directly running actions, TextToAction also allows you to extract actions and parameters separately. This can be useful when you want more control over how the system processes user input.

```python
# Extract actions based on the user's query
result1 = dispatcher.extract_actions(query_text="multiply 3,4,5")
# Output: {'actions': ['multiply'], 'message': 'Sure, I can help you with that.'}

# Extract parameters for a specific action (e.g., 'multiply') from the user's query
result2 = dispatcher.extract_parameters(
    query_text="multiply 3,4,5", 
    action_name="multiply", 
    args={"values": {"type": "List[int]", "required": True}}
)
# Output: {'values': [3, 4, 5]}

# Extract both actions and parameters together
result3 = dispatcher.extract_actions_with_args(query_text="multiply 3,4,5")
# Output: {'actions': [{'action': 'multiply', 'args': {'values': [3, 4, 5]}}], 
#          'message': 'Sure I can help you with that. Starting calculation now.'}

```
### Quick Notes:

- Get an API key from a service like Groq (free tier available), OpenAI, or any other supported provider ([check supported services](https://docs.litellm.ai/docs/providers)). Create a `.env` file and set the API key values (e.g. `GROQ_API_KEY`, `OPENAI_API_KEY`).

- If you are using NER (not recommended) for parameter extraction, download the corresponding model from spaCy:

  ```
  python -m spacy download en_core_web_trf
  ```
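The library uses `python-dotenv` (listed in its requirements) for this loading step. If you want to see what that step amounts to, here is a stdlib-only sketch; the `load_env_file` helper below is illustrative, not part of the package:

```python
import os

def load_env_file(path=".env"):
    """Minimal sketch of what python-dotenv's load_dotenv() does:
    read KEY=VALUE lines from a file and export them into os.environ
    without overriding variables that are already set."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

# Example: write a sample .env and load it
with open(".env", "w") as f:
    f.write("# API keys\nGROQ_API_KEY=your-groq-key-here\n")
load_env_file()
print(os.environ["GROQ_API_KEY"])  # your-groq-key-here
```

In practice you would create the `.env` file by hand and simply call `load_dotenv()` as shown in the examples above.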

# Where to start

## Step 1: Describe actions `descriptions.json`

  First, create a JSON file listing the action descriptions, strictly in the following format:

  ```json
  {
      "add": {
          "description": "Add or sum a list of numbers",
          "examples": ["20+50", "add 10, 30, 69", "sum of 1,3,4", "combine numbers", "find the total"],
          "args": {
              "values": {
                  "type": "List[int]",
                  "required": true
              }
          }
      },
      "subtract": {
          "description": "Subtract two numbers",
          "examples": ["10 - 5", "subtract 8 from 20", "what's 50 minus 15?", "deduct 5 from 10"],
          "args": {
              "a": {
                  "type": "int",
                  "required": true
              },
              "b": {
                  "type": "int",
                  "required": true
              }
          }
      }
  }
```
  The better and more diverse the descriptions and examples for each function, the better the matching accuracy.
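Because the format is strict, it can help to sanity-check `descriptions.json` before building embeddings. Below is a small stdlib sketch; the `validate_descriptions` helper is hypothetical, not part of the package:

```python
import json

def validate_descriptions(data):
    """Check that each action entry has the fields the strict format requires."""
    for name, spec in data.items():
        assert "description" in spec, f"{name}: missing 'description'"
        assert isinstance(spec.get("examples"), list) and spec["examples"], \
            f"{name}: 'examples' must be a non-empty list"
        for arg, arg_spec in spec.get("args", {}).items():
            assert "type" in arg_spec and "required" in arg_spec, \
                f"{name}.{arg}: each arg needs 'type' and 'required'"
    return True

# Validate a minimal descriptions document
data = json.loads("""{
    "add": {
        "description": "Add or sum a list of numbers",
        "examples": ["20+50", "add 10, 30, 69"],
        "args": {"values": {"type": "List[int]", "required": true}}
    }
}""")
print(validate_descriptions(data))  # True
```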

## Step 2: Create embeddings `embeddings.h5`

  Next, create embeddings for the actions:

  ```python
  import os
  from text_to_action import create_action_embeddings

  # You can use SBERT or other Hugging Face models to create embeddings
  descriptions_filepath = os.path.join("example_actions", "calculator", "descriptions.json")
  save_to = os.path.join("example_actions", "calculator", "embeddings.h5")

  create_action_embeddings(descriptions_filepath, save_to=save_to, validate_data=True)
  ```

## Step 3: (Optional) Define actions/functions `implementation.py`

Optionally, define the necessary functions and save them to a file. In fact, you can define the functions in any language you want and use TextToAction through a server. Check out [server.py](server.py).

  ```python
  from typing import List

  def add(values: List[int]) -> int:
      """
      Returns the sum of a list of integers.
      """
      return sum(values)

  def subtract(a: int, b: int) -> int:
      """
      Returns the difference between a and b.
      """
      return a - b
  ```
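At run time the dispatcher must map a matched action name back to one of these functions. A minimal sketch of that lookup follows; the `ACTIONS` registry and `dispatch` helper are illustrative only, since the library builds this mapping for you from the actions folder:

```python
from typing import List

def add(values: List[int]) -> int:
    return sum(values)

def subtract(a: int, b: int) -> int:
    return a - b

# Hypothetical registry: action name -> implementation
ACTIONS = {"add": add, "subtract": subtract}

def dispatch(action_name: str, args: dict):
    """Resolve an action name and call it with the extracted arguments."""
    return ACTIONS[action_name](**args)

print(dispatch("add", {"values": [3, 4]}))      # 7
print(dispatch("subtract", {"a": 20, "b": 8}))  # 12
```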

## Use

**Save `descriptions.json`, `embeddings.h5`, and (optionally) `implementation.py` to a single folder.**

```python
from text_to_action import TextToAction, LLMClient
from dotenv import load_dotenv
load_dotenv()

llm_client = LLMClient(model="groq/llama3-70b-8192")

# Use the same embedding model and model source you used when creating the action embeddings
dispatcher = TextToAction(
    actions_folder=calculator_actions_folder,  # folder containing the three files above
    llm_client=llm_client,
    verbose_output=True,
    application_context="Calculator",
    filter_input=True
)
```

## Key Components

1. **Text to Action**: The core component that orchestrates the flow from query to action execution.

2. **Vector Store**: Stores embeddings of function descriptions and associated metadata for efficient similarity search.

3. **Parameter Extractor**: Extracts function arguments from the input text using NER or LLM-based approaches.

## How it works

1. The system receives a natural language query from the user.
2. The query is processed by the Vector Store to identify the most relevant function(s).
3. The Parameter Extractor analyzes the query to extract required function arguments.
4. The Action Dispatcher selects the most appropriate function based on similarity scores and parameter availability.
5. The selected function is executed with the extracted parameters.
6. The result is returned to the user.
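Step 2, the similarity search, can be illustrated with a toy nearest-neighbour lookup. The 3-dimensional vectors below are made up for the example; real embeddings come from SBERT or another sentence-embedding model and are stored in `embeddings.h5`:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy action embeddings (hypothetical values for illustration)
action_embeddings = {
    "add": [0.9, 0.1, 0.0],
    "subtract": [0.1, 0.9, 0.0],
}
query_embedding = [0.8, 0.2, 0.1]  # pretend embedding of "sum of 3 and 4"

best_action = max(action_embeddings,
                  key=lambda name: cosine(query_embedding, action_embeddings[name]))
print(best_action)  # add
```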

## Possible Use Cases

- Natural Language Interfaces for APIs
- Chatbots and Virtual Assistants
- Automated Task Execution Systems
- Voice-Controlled Applications

## Contributions

Contributions are welcome.

            
