footsies-gym

Name: footsies-gym
Version: 0.2.2
Summary: A reinforcement learning environment for HiFight's Footsies game
Upload time: 2025-08-27 00:27:58
Requires Python: >=3.8
Keywords: reinforcement learning, multi-agent, fighting game, footsies, gymnasium
# FootsiesGym

Implementation of HiFight's [Footsies](https://hifight.github.io/footsies/) game as a reinforcement learning environment. This environment serves as a benchmark for multi-agent reinforcement learning in a (relatively) complex two-player zero-sum game.

The environment is derived from the open-source Unity implementation, which has been augmented to run a gRPC server that can be controlled through a Python harness. Training is implemented using Ray's [RLlib](https://docs.ray.io/en/latest/rllib/index.html).


### System Architecture

```mermaid
sequenceDiagram
    participant RLlib as Ray RLlib
    participant Env as FootsiesEnv
    participant gRPC as gRPC Client
    participant Server as Unity Game Server
    participant Game as Footsies Game

    Note over RLlib,Env: Python Environment
    Note over gRPC: Communication Layer
    Note over Server,Game: Unity Game

    RLlib->>Env: step(action)
    Env->>gRPC: SendAction(action)
    gRPC->>Server: gRPC Request
    Server->>Game: Update Game State
    Game->>Server: Game State
    Server->>gRPC: gRPC Response
    gRPC->>Env: Game State
    Env->>RLlib: (obs., rews., terms., truncs., infos)

    Note over RLlib,Game: Training Loop

```

The diagram above shows how the different components interact during training:
1. RLlib sends actions to the FootsiesEnv
2. The environment converts these actions into gRPC requests
3. The Unity Game Server processes the actions and updates the game state
4. The game state is sent back through gRPC to the environment
5. The environment processes the observation and returns it to RLlib

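The five steps above can be sketched as a minimal Gymnasium-style interaction. The stub below only mimics the multi-agent 5-tuple contract; the real `FootsiesEnv` forwards actions over gRPC to the Unity server, so class and key names here are illustrative placeholders:

```python
# Hypothetical stub mimicking the env's multi-agent step contract
# (the real FootsiesEnv sends action_dict over gRPC and reads back state).
class StubFootsiesEnv:
    def reset(self):
        obs = {"p1": [0.0], "p2": [0.0]}
        infos = {"p1": {}, "p2": {}}
        return obs, infos

    def step(self, action_dict):
        # One entry per agent; "__all__" signals episode-wide termination.
        obs = {"p1": [1.0], "p2": [1.0]}
        rews = {"p1": 0.0, "p2": 0.0}
        terms = {"p1": False, "p2": False, "__all__": False}
        truncs = {"p1": False, "p2": False, "__all__": False}
        infos = {"p1": {}, "p2": {}}
        return obs, rews, terms, truncs, infos

env = StubFootsiesEnv()
obs, infos = env.reset()
obs, rews, terms, truncs, infos = env.step({"p1": 0, "p2": 0})
```

RLlib consumes exactly this `(obs, rews, terms, truncs, infos)` tuple on every call to `step`.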

## Installation

```bash
conda create -n footsiesgym python=3.10
conda activate footsiesgym
pip install -r requirements.txt
```

On a Mac, you may need to ensure you have `cmake` installed. You can install it using Homebrew:

```bash
brew install cmake
```

## Training

### Game Servers
If you are on a Linux system, run `setup.sh` to unpack the binaries, then skip to the training procedure. Otherwise, follow the steps below.


Before training, you'll need to launch the headless game servers. Scripts are provided in `scripts/start_local_{mac, linux}_servers.sh`, but you must first unpack the included binaries into the `binaries/` directory (the launch scripts assume this location). _Important!_ If you are launching game servers manually, be sure to set `launch_binaries` to `False` in the environment configuration.

```bash
./scripts/start_local_{mac, linux}_servers.sh <num-train-servers> <num-eval-servers>
```

The two arguments correspond to `num_env_runners` and `evaluation_num_env_runners`, which can be specified in the experiment configuration. You must launch a corresponding number of servers for each. If you are running local debugging (see below; `python -m experiments.train --debug`), just launch one of each. If you're launching a full experiment, you'll need to match the number specified in the experiment configuration (defaults to 40 training and 5 evaluation env runners).

The scripts will start:
- Training servers from port 50051 (incrementing for each server)
- Evaluation servers from port 40051 (incrementing for each server)

Importantly, each environment runner is mapped to a single port, which means you can only run a single environment per environment runner.
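The port convention above can be sketched as a small helper: each runner connects to its own server, offset from the base port by its worker index. This is an assumption based on the launch scripts' behavior; the actual mapping in the environment configuration may differ.

```python
# Base ports from the launch scripts; the per-runner offset is an assumption.
TRAIN_BASE_PORT = 50051
EVAL_BASE_PORT = 40051

def server_port(worker_index: int, evaluation: bool = False) -> int:
    """Return the game-server port for a given env runner.

    worker_index is 1-based (RLlib convention): worker 1 -> base port.
    """
    base = EVAL_BASE_PORT if evaluation else TRAIN_BASE_PORT
    return base + worker_index - 1

print(server_port(1))        # first training runner
print(server_port(3, True))  # third evaluation runner
```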

### Training Configuration

The default training utilizes the [APPO](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#appo) algorithm (see the corresponding [IMPACT](https://arxiv.org/abs/1912.00167) paper). We also utilize a vanilla LSTM network with parameters described in the respective experiment files.

Training can utilize either the new RLModule stack or the old stack in RLlib. Some functionality has yet to be implemented in the new stack (see open issues).

#### Old Stack
```bash
python -m experiments.train --experiment-name <experiment-name>
```

#### New Stack
```bash
python -m experiments.train_rlmodule --experiment-name <experiment-name>
```

Add the `--debug` flag to use only a single env runner (and a single evaluation env runner) and local mode. This enables breakpoint usage for local debugging.
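A rough sketch of how the `--debug` flag plausibly maps to runner settings (the flag exists in the repo; the variable names and defaults below are illustrative, mirroring the 40/5 runner counts mentioned above):

```python
import argparse

# Illustrative flag handling; the real experiment script's arguments may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--experiment-name", default="debug-run")
parser.add_argument("--debug", action="store_true")
args = parser.parse_args(["--debug"])  # simulate `--debug` on the command line

# Debug mode collapses everything to one runner so breakpoints work locally.
num_env_runners = 1 if args.debug else 40
evaluation_num_env_runners = 1 if args.debug else 5
local_mode = args.debug
```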



## Visualizing a Policy

To visualize gameplay:

1. Unpack the windowed build binaries of your choice (Mac or Linux).

2. Add the trained policy specification to the `ModuleRepository` in `components/module_repository.py`:
```python
FootsiesModuleSpec(
    module_name="<policy-nickname>",
    experiment_name="<experiment-name>",
    trial_id="<trial-id>",  # specify if experiment has multiple trials
    checkpoint_number=-1,  # -1 for latest, otherwise specify checkpoint number
)
```

3. Run the game with:
```bash
./footsies_linux_windowed_021725 --port 80051
```

4. Configure policies in `scripts/local_inference.py` using the `MODULES` variable. Set `"p1"` to `"human"` to play against the AI (must install `pygame`).
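The `MODULES` mapping plausibly takes a shape like the following, pairing each player slot with either a policy nickname registered in the `ModuleRepository` or the literal `"human"`. The exact structure is an assumption; the placeholder nickname is hypothetical:

```python
# Hypothetical shape of MODULES in scripts/local_inference.py.
# Each side is a registered policy nickname, or "human" (requires pygame).
MODULES = {
    "p1": "human",              # keyboard-controlled via pygame
    "p2": "<policy-nickname>",  # trained policy from the ModuleRepository
}
```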

## Project Architecture

### Core Components

- **Environment (`footsies/`)**: The main game environment implementation that interfaces with the Unity game through gRPC
- **Models (`models/`)**: Neural network architectures for the RL agents
- **Experiments (`experiments/`)**: Training configurations and experiment management
- **Callbacks (`callbacks/`)**: Custom RLlib callbacks for monitoring and evaluation
- **Components (`components/`)**: Reusable components like the module repository for policy management
- **Utils (`utils/`)**: Utility functions and helper classes
- **Scripts (`scripts/`)**: Helper scripts for server management and visualization

### Key Features

- Multi-agent reinforcement learning environment
- gRPC-based communication with Unity game server
- Support for both headless and windowed game modes
- Integration with Ray RLlib for distributed training
- Custom LSTM-based policy networks
- Support for self-play training
- Evaluation against baseline policies (random, noop, back)
- Wandb integration for experiment tracking


## Development

### gRPC / Protobuf Updates

If updating the proto definitions:

1. Generate C# files (Windows):
```bash
.\protoc\bin\protoc.exe --csharp_out=.\env\game\proto\ --grpc_out=.\env\game\proto\ --plugin=protoc-gen-grpc=.\plugins\grpc_csharp_plugin.exe .\env\game\proto\footsies_service.proto
```

2. Generate Python files:
```bash
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. .\env\game\proto\footsies_service.proto
```

## Project Structure

```
FootsiesGym/
├── binaries/           # Game server binaries
├── callbacks/          # RLlib callbacks
├── components/         # Reusable components
├── experiments/        # Training configurations
├── footsies/          # Core environment
├── models/            # Neural network architectures
├── protoc/            # Protocol buffer tools
├── scripts/           # Helper scripts
├── testing/           # Test files
└── utils/             # Utility functions
```

## Contributing

1. Install pre-commit hooks to maintain code quality
2. Follow the existing code style and architecture
3. Add tests for new features
4. Update documentation as needed

## License

This project is based on the open-source Footsies game by HiFight. Please refer to the original game's license for more information.

            
