| Field | Value |
|---|---|
| Name | robo-transformers |
| Version | 1.0.0 |
| Summary | RT-1, RT-1-X, Octo Robotics Transformer Model Inference |
| Author | Sebastian Peralta |
| License | MIT |
| Requires Python | >=3.9,<3.12 |
| Upload time | 2024-01-13 08:22:04 |
# Library for Robotic Transformers: RT-1, RT-1-X, Octo
[![Code Coverage](https://codecov.io/gh/sebbyjp/dgl_ros/branch/code_cov/graph/badge.svg?token=9225d677-c4f2-4607-a9dd-8c22446f13bc)](https://codecov.io/gh/sebbyjp/dgl_ros)
[![ubuntu | python 3.9 | 3.10 | 3.11](https://github.com/sebbyjp/robo_transformers/actions/workflows/ubuntu.yml/badge.svg)](https://github.com/sebbyjp/robo_transformers/actions/workflows/ubuntu.yml)
[![macos | python 3.9 | 3.10 | 3.11](https://github.com/sebbyjp/robo_transformers/actions/workflows/macos.yml/badge.svg)](https://github.com/sebbyjp/robo_transformers/actions/workflows/macos.yml)
## Installation
Requirements:
Python >= 3.9, < 3.12
### Install the TensorFlow version for your OS and hardware
See the [TensorFlow install guide](https://www.tensorflow.org/install).
### Using Octo models
Follow their [installation procedure](https://github.com/octo-models/octo).
**Note**: You may not need conda if you can simply clone their repo and run `pip install -e octo`.
### Recommended: Using PyPI
`pip install robo-transformers`
### From Source
Clone this repo:
`git clone https://github.com/sebbyjp/robo_transformers.git`
`cd robo_transformers`
Use Poetry:
`pip install poetry && poetry config virtualenvs.in-project true`
### Install dependencies
`poetry install`
Poetry installs the dependencies into a project-local virtualenv, so activate it:
`source .venv/bin/activate`
## Run Octo inference on demo images
`python -m robo_transformers.demo`
## Run RT-1 Inference On Demo Images
`python -m robo_transformers.models.rt1.inference`
## See usage
You can specify a custom checkpoint path, or one of the model keys for the three checkpoints mentioned in the RT-1 paper as well as RT-1-X.
`python -m robo_transformers.models.rt1.inference --help`
## Run Inference Server
The inference server keeps track of all internal state (such as the observation history), so all you need to supply is an instruction and an image.
```python
from robo_transformers.inference_server import InferenceServer
import numpy as np
# Somewhere in your robot control stack code...
instruction = "pick block"
img = np.random.randn(256, 320, 3) # Height, Width, RGB
inference = InferenceServer()
action = inference(instruction, img)
```
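In a real control stack you would call the server once per control step. The sketch below shows that loop structure; `FakeServer` is a hypothetical stand-in for `InferenceServer` (it returns random actions with the shapes documented below) so the example runs without downloading model weights.

```python
import numpy as np

class FakeServer:
    """Hypothetical stand-in for InferenceServer: returns a random action dict."""
    def __call__(self, instruction: str, img: np.ndarray) -> dict:
        return {
            "world_vector": np.random.uniform(-1.0, 1.0, size=(3,)).astype(np.float32),
            "rotation_delta": np.random.uniform(-np.pi / 2, np.pi / 2, size=(3,)).astype(np.float32),
            "gripper_closedness_action": np.random.uniform(-1.0, 1.0, size=(1,)).astype(np.float32),
            "terminate_episode": np.array([0, 1, 0], dtype=np.int32),
        }

def control_loop(server, instruction: str, max_steps: int = 5) -> list:
    """Query the server once per control step and collect the actions."""
    actions = []
    for _ in range(max_steps):
        img = np.random.randn(256, 320, 3)  # stand-in camera frame (Height, Width, RGB)
        actions.append(server(instruction, img))
    return actions

actions = control_loop(FakeServer(), "pick block")
```

Swapping `FakeServer()` for a real `InferenceServer()` keeps the same loop shape, since the server manages the observation history internally.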
## Data Types
`action, next_policy_state = model.act(time_step, curr_policy_state)`
### policy state is the internal state of the network
In this case it is a 6-frame window of past observations and actions, plus the index in time.
```python
{'action_tokens': ArraySpec(shape=(6, 11, 1, 1), dtype=dtype('int32'), name='action_tokens'),
'image': ArraySpec(shape=(6, 256, 320, 3), dtype=dtype('uint8'), name='image'),
'step_num': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='step_num'),
't': ArraySpec(shape=(1, 1, 1, 1), dtype=dtype('int32'), name='t')}
```
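Before the first step, this window can be initialized to zeros. A minimal sketch using plain NumPy arrays matching the shapes above (a stand-in for whatever initial-state helper the model itself provides):

```python
import numpy as np

# Zero-filled policy state matching the spec shapes documented above.
initial_policy_state = {
    "action_tokens": np.zeros((6, 11, 1, 1), dtype=np.int32),
    "image": np.zeros((6, 256, 320, 3), dtype=np.uint8),
    "step_num": np.zeros((1, 1, 1, 1), dtype=np.int32),
    "t": np.zeros((1, 1, 1, 1), dtype=np.int32),
}
```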
### time_step is the input from the environment
```python
{'discount': BoundedArraySpec(shape=(), dtype=dtype('float32'), name='discount', minimum=0.0, maximum=1.0),
'observation': {'base_pose_tool_reached': ArraySpec(shape=(7,), dtype=dtype('float32'), name='base_pose_tool_reached'),
'gripper_closed': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closed'),
'gripper_closedness_commanded': ArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_commanded'),
'height_to_bottom': ArraySpec(shape=(1,), dtype=dtype('float32'), name='height_to_bottom'),
'image': ArraySpec(shape=(256, 320, 3), dtype=dtype('uint8'), name='image'),
'natural_language_embedding': ArraySpec(shape=(512,), dtype=dtype('float32'), name='natural_language_embedding'),
'natural_language_instruction': ArraySpec(shape=(), dtype=dtype('O'), name='natural_language_instruction'),
'orientation_box': ArraySpec(shape=(2, 3), dtype=dtype('float32'), name='orientation_box'),
'orientation_start': ArraySpec(shape=(4,), dtype=dtype('float32'), name='orientation_in_camera_space'),
'robot_orientation_positions_box': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='robot_orientation_positions_box'),
'rotation_delta_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta_to_go'),
'src_rotation': ArraySpec(shape=(4,), dtype=dtype('float32'), name='transform_camera_robot'),
'vector_to_go': ArraySpec(shape=(3,), dtype=dtype('float32'), name='vector_to_go'),
'workspace_bounds': ArraySpec(shape=(3, 3), dtype=dtype('float32'), name='workspace_bounds')},
'reward': ArraySpec(shape=(), dtype=dtype('float32'), name='reward'),
'step_type': ArraySpec(shape=(), dtype=dtype('int32'), name='step_type')}
```
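As an illustration, a minimal observation can be zero-filled except for the image and instruction (field names and shapes taken from the spec above; in practice the `natural_language_embedding` would come from a sentence-embedding model rather than zeros):

```python
import numpy as np

# Minimal zero-filled time_step following the spec shapes above
# (remaining observation fields would be zero-filled the same way).
observation = {
    "image": np.zeros((256, 320, 3), dtype=np.uint8),
    "natural_language_instruction": "pick block",
    "natural_language_embedding": np.zeros((512,), dtype=np.float32),
    "gripper_closed": np.zeros((1,), dtype=np.float32),
}
time_step = {
    "discount": np.float32(1.0),
    "observation": observation,
    "reward": np.float32(0.0),
    "step_type": np.int32(0),
}
```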
### action
```python
{'base_displacement_vector': BoundedArraySpec(shape=(2,), dtype=dtype('float32'), name='base_displacement_vector', minimum=-1.0, maximum=1.0),
'base_displacement_vertical_rotation': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='base_displacement_vertical_rotation', minimum=-3.1415927410125732, maximum=3.1415927410125732),
'gripper_closedness_action': BoundedArraySpec(shape=(1,), dtype=dtype('float32'), name='gripper_closedness_action', minimum=-1.0, maximum=1.0),
'rotation_delta': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='rotation_delta', minimum=-1.5707963705062866, maximum=1.5707963705062866),
'terminate_episode': BoundedArraySpec(shape=(3,), dtype=dtype('int32'), name='terminate_episode', minimum=0, maximum=1),
'world_vector': BoundedArraySpec(shape=(3,), dtype=dtype('float32'), name='world_vector', minimum=-1.0, maximum=1.0)}
```
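Before sending an action to a robot it is worth clamping each field to its documented bounds. A minimal sketch (bounds copied from the spec above; `clamp_action` is an illustrative helper, not part of the library):

```python
import numpy as np

# (minimum, maximum) per action field, from the BoundedArraySpecs above.
ACTION_BOUNDS = {
    "base_displacement_vector": (-1.0, 1.0),
    "base_displacement_vertical_rotation": (-np.pi, np.pi),
    "gripper_closedness_action": (-1.0, 1.0),
    "rotation_delta": (-np.pi / 2, np.pi / 2),
    "terminate_episode": (0, 1),
    "world_vector": (-1.0, 1.0),
}

def clamp_action(action: dict) -> dict:
    """Clip every action field to the bounds in its spec."""
    return {k: np.clip(v, *ACTION_BOUNDS[k]) for k, v in action.items()}

raw = {"world_vector": np.array([1.5, -2.0, 0.3], dtype=np.float32)}
safe = clamp_action(raw)  # world_vector clipped into [-1.0, 1.0]
```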
## TODO
- Render the action, policy_state, and observation specs in something prettier, like a pandas DataFrame.
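One possible starting point for that TODO, assuming pandas is available: flatten each spec-like dict into (name, shape, dtype) rows. This works on anything with `.shape` and `.dtype` attributes, whether NumPy arrays or spec objects.

```python
import numpy as np
import pandas as pd

def specs_to_frame(specs: dict) -> pd.DataFrame:
    """Render a dict of arrays (or spec-like objects with .shape/.dtype) as a table."""
    rows = [
        {"name": name, "shape": tuple(arr.shape), "dtype": str(arr.dtype)}
        for name, arr in specs.items()
    ]
    return pd.DataFrame(rows, columns=["name", "shape", "dtype"])

# Example: the policy-state window documented above, as zero arrays.
policy_state = {
    "action_tokens": np.zeros((6, 11, 1, 1), dtype=np.int32),
    "image": np.zeros((6, 256, 320, 3), dtype=np.uint8),
}
df = specs_to_frame(policy_state)
```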