# robo_transformers

- **Version:** 0.1.13
- **Summary:** Robotics Transformer Inference in TensorFlow. RT-1, RT-2, RT-X, PaLM-E.
- **Author:** Sebastian Peralta
- **Requires Python:** >=3.9, <3.12
- **License:** MIT
- **Uploaded:** 2023-12-20
            [![Code Coverage](https://codecov.io/gh/sebbyjp/dgl_ros/branch/code_cov/graph/badge.svg?token=9225d677-c4f2-4607-a9dd-8c22446f13bc)](https://codecov.io/gh/sebbyjp/dgl_ros)

# Library for Robotic Transformers: RT-1 and RT-X-1

## Installation:

Requirements:
Python >= 3.9, < 3.12

### Recommended: Using PyPI
`pip install robo-transformers`

### From Source
Clone this repo:

`git clone https://github.com/sebbyjp/robo_transformers.git`

`cd robo_transformers`

Install Poetry and configure it to create the virtual environment inside the project:

`pip install poetry && poetry config virtualenvs.in-project true`

**Install dependencies**

`poetry install`

Poetry installs the dependencies into a project-local virtual environment, which you then need to activate:

`source .venv/bin/activate`
  
## Run RT-1 Inference on Demo Images
`python -m robo_transformers.rt1.rt1_inference`

## See usage:
You can specify a custom checkpoint path or one of the model keys for the three models mentioned in the RT-1 paper, as well as RT-X.

`python -m robo_transformers.rt1.rt1_inference --help`

## Run Inference Server
The inference server takes care of all the internal state, so all you need to supply is an instruction and an image. You may also pass in a reward and a termination signal. Batching is supported as well.
```
from robo_transformers.inference_server import InferenceServer
import numpy as np

# Somewhere in your robot control stack code...

instruction = "pick block"
# Dummy camera frame. The observation spec below expects a
# (256, 320, 3) uint8 image, i.e. height 256, width 320, RGB.
img = np.random.randint(0, 256, size=(256, 320, 3), dtype=np.uint8)
inference = InferenceServer()

action = inference(instruction, img)
```
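
For a longer-running control loop, a rough sketch follows. The `reward` and `terminate` keyword names and the camera helper are illustrative assumptions (the text above only states that a reward and termination signal may be passed); check the `InferenceServer` signature for the actual argument names.

```
import numpy as np
from robo_transformers.inference_server import InferenceServer

def get_camera_frame() -> np.ndarray:
    # Placeholder for your robot's camera; returns a (256, 320, 3) uint8 RGB image.
    return np.zeros((256, 320, 3), dtype=np.uint8)

inference = InferenceServer()
instruction = "pick block"

for _ in range(100):
    # `reward` and `terminate` are hypothetical keyword names; the server
    # optionally accepts a reward and termination signal in some form.
    action = inference(instruction, get_camera_frame(), reward=0.0, terminate=False)
    # Dispatch `action` (see the action spec below) to your robot here.
```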

  
## Data Types
`action, next_policy_state = model.act(time_step, curr_policy_state)`
### policy_state is the internal state of the network:
In this case it is a 6-frame window of past observations and actions, plus the current index in time.
```
{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
 'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
 'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
 't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}
 ```
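
For illustration, a zero-initialized policy state matching these specs can be built with plain NumPy. This is a sketch; whether the model accepts raw arrays in exactly this layout is an assumption based on the printed specs:

```
import numpy as np

# Zero-initialized policy state matching the specs above
# (6-frame history window; layout assumed from the printed specs).
initial_policy_state = {
    'action_tokens': np.zeros((6, 11, 1, 1), dtype=np.int32),
    'image': np.zeros((6, 256, 320, 3), dtype=np.uint8),
    'step_num': np.zeros((1, 1, 1, 1), dtype=np.int32),
    't': np.zeros((1, 1, 1, 1), dtype=np.int32),
}
```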


### time_step is the input from the environment:
```
{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
 'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
                 'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
                 'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
                 'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
                 'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
                 'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
                 'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
                 'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
                 'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
                 'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
                 'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
                 'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
                 'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
                 'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
 'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
 'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}
 ```
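
These spec classes come from tf_agents, so a `time_step` can be assembled with its helpers. A minimal sketch that fills only a few observation fields with placeholder zeros (the remaining fields follow the spec the same way):

```
import numpy as np
from tf_agents.trajectories import time_step as ts

# Placeholder observation: in practice 'image' comes from the robot camera
# and the language fields from the instruction encoder. The remaining
# fields in the spec above are constructed the same way.
observation = {
    'image': np.zeros((256, 320, 3), dtype=np.uint8),
    'natural_language_embedding': np.zeros((512,), dtype=np.float32),
    'gripper_closed': np.zeros((1,), dtype=np.float32),
}

step = ts.transition(observation, reward=np.float32(0.0), discount=np.float32(1.0))
```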

### action:
```
{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
 'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
 'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
 'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
 'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
 'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}
 ```
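
Since every action component is a BoundedArraySpec, it is cheap to defensively clip model output to the documented bounds before dispatching it to a robot. A sketch (the bounds are copied from the spec above; the clipping step is a suggestion, not part of the library):

```
import numpy as np

# Bounds copied from the BoundedArraySpecs above.
ACTION_BOUNDS = {
    'base_displacement_vector': (-1.0, 1.0),
    'base_displacement_vertical_rotation': (-np.pi, np.pi),
    'gripper_closedness_action': (-1.0, 1.0),
    'rotation_delta': (-np.pi / 2, np.pi / 2),
    'terminate_episode': (0, 1),
    'world_vector': (-1.0, 1.0),
}

def clip_action(action: dict) -> dict:
    # Clamp each action component to its spec range before sending it on.
    return {key: np.clip(value, *ACTION_BOUNDS[key]) for key, value in action.items()}
```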

## TODO:
- Render the action, policy_state, and observation specs in something prettier, like a pandas DataFrame.
            
