# ML Research Benchmark Tasks
This repository contains the tasks for the ML Research Benchmark (MLRB), a benchmark designed to evaluate the capabilities of AI agents in accelerating ML research and development. The benchmark consists of 9 competition-level tasks that span the spectrum of activities typically undertaken by ML researchers.
## Introduction
The MLRB aims to measure how effectively AI agents can accelerate ML research and development. It focuses on competition-level tasks that reflect the current frontiers of ML research, providing a more nuanced and challenging evaluation environment than existing benchmarks.
[![arXiv](https://img.shields.io/badge/arXiv-2410.22553-b31b1b.svg)](https://arxiv.org/abs/2410.22553)
- [:paperclip: ML Research Benchmark Paper](https://arxiv.org/abs/2410.22553)
- [:robot: ML Research Agent](https://github.com/AlgorithmicResearchGroup/ML-Research-Agent)
- [:white_check_mark: ML Research Tasks](https://github.com/AlgorithmicResearchGroup/ML-Research-Agent-Tasks)
- [:chart_with_upwards_trend: ML Research Evaluation](https://github.com/AlgorithmicResearchGroup/ML-Research-Agent-Evals)
## Installation
```bash
pip install mlrb-agent-tasks
```
## Usage

The library exposes a single function, `get_task`, which takes three arguments:

- `path`: the directory to copy the task into
- `benchmark`: the name of the benchmark
- `task`: the name of the task

The function copies the task files to the specified path and returns a dictionary containing the task name and prompt:
```python
{
    "name": str,    # name of the task
    "prompt": str,  # prompt for the task
}
```
## Example Usage
```python
from mlrb_agent_tasks import get_task
# Example usage
result = get_task("./", "full_benchmark", "llm_efficiency")
print(result['prompt'])
```
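A slightly fuller sketch of the same call is shown below. It assumes the package is installed and reuses the `full_benchmark` / `llm_efficiency` names from the example above; it previews both returned fields and lists the files copied into the target directory.

```python
import os

from mlrb_agent_tasks import get_task

# Copy the task files into the current directory and fetch its metadata.
result = get_task("./", "full_benchmark", "llm_efficiency")

# The returned dictionary exposes the task name and its prompt.
print(result["name"])
print(result["prompt"][:200])  # preview the first 200 characters of the prompt

# get_task also copies the task files to the given path; list what landed there.
print(os.listdir("./"))
```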
## Contributing
We welcome contributions to the ML Research Benchmark! Please read our [CONTRIBUTING.md](CONTRIBUTING.md) file for guidelines on how to submit issues, feature requests, and pull requests.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Contact
For questions or feedback, please open an issue in this repository or contact [matt@algorithmicresearchgroup.com](mailto:matt@algorithmicresearchgroup.com).