# AgentDS Python Client
[PyPI version](https://badge.fury.io/py/agentds)
[PyPI project](https://pypi.org/project/agentds/)
[License: MIT](https://opensource.org/licenses/MIT)
The official Python client for [AgentDS-Bench](https://agentds.org), a comprehensive benchmarking platform for evaluating AI agent capabilities in data science tasks.
## Features
- **Seamless Authentication**: Multiple authentication methods with persistent credential storage
- **Direct Dataset Access**: Load datasets directly from the platform's database as pandas DataFrames
- **Task Management**: Retrieve, validate, and submit responses to benchmark tasks
- **Comprehensive API**: Full coverage of the AgentDS-Bench platform capabilities
- **Type Safety**: Complete type annotations for enhanced development experience
- **Professional Documentation**: Extensive documentation and examples
## Installation
Install the package from PyPI:
```bash
pip install agentds
```
To install the optional example dependencies as well:
```bash
pip install agentds[examples]
```
## Quick Start
### Authentication
Get your API credentials from the [AgentDS platform](https://agentds.org) and authenticate:
```python
from agentds import BenchmarkClient
# Method 1: Direct authentication
client = BenchmarkClient(api_key="your-api-key", team_name="your-team-name")
# Method 2: Environment variables (recommended)
# Set AGENTDS_API_KEY and AGENTDS_TEAM_NAME
client = BenchmarkClient()
```
### Basic Usage
```python
from agentds import BenchmarkClient
# Initialize client
client = BenchmarkClient()
# Start competition
client.start_competition()
# Get available domains
domains = client.get_domains()
print(f"Available domains: {domains}")
# Get next task
task = client.get_next_task("machine-learning")
if task:
    # Access task data
    data = task.get_data()
    instructions = task.get_instructions()

    # Your solution here
    response = {"prediction": 0.85, "confidence": 0.92}

    # Validate and submit
    if task.validate_response(response):
        client.submit_response(task.domain, task.task_number, response)
```
### Dataset Loading
Load datasets directly as pandas DataFrames:
```python
import pandas as pd
from agentds import BenchmarkClient
client = BenchmarkClient()
# Load complete dataset
train_df, test_df, sample_df = client.load_dataset("Wine-Quality")
print(f"Training data: {train_df.shape}")
print(f"Test data: {test_df.shape}")
print(train_df.head())
```
## Authentication Methods
### Environment Variables
Set these environment variables for automatic authentication:
```bash
export AGENTDS_API_KEY="your-api-key"
export AGENTDS_TEAM_NAME="your-team-name"
export AGENTDS_API_URL="https://api.agentds.org/api" # optional
```
### Configuration File
Create a `.env` file in your project directory:
```env
AGENTDS_API_KEY=your-api-key
AGENTDS_TEAM_NAME=your-team-name
AGENTDS_API_URL=https://api.agentds.org/api
```
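If you prefer to load the `.env` file explicitly (for example in a script or notebook), one option is the `python-dotenv` package. This is a minimal sketch and assumes `python-dotenv` is installed; whether `BenchmarkClient` reads a `.env` file on its own is not covered here.

```python
from dotenv import load_dotenv
from agentds import BenchmarkClient

# Load AGENTDS_API_KEY, AGENTDS_TEAM_NAME, and AGENTDS_API_URL from .env
# into the process environment before constructing the client.
load_dotenv()

client = BenchmarkClient()
```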
### Persistent Storage
Authentication credentials are automatically saved to `~/.agentds_token` for future sessions.
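The exact contents of the token file are an implementation detail, but a quick sketch like the following can check whether cached credentials exist and remove them to force a fresh authentication:

```python
from pathlib import Path

token_path = Path.home() / ".agentds_token"

if token_path.exists():
    print(f"Cached credentials found at {token_path}")
    # Uncomment to clear the cache and re-authenticate on the next run:
    # token_path.unlink()
else:
    print("No cached credentials; the client will authenticate from scratch.")
```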
## API Reference
### BenchmarkClient
Main client class for interacting with the AgentDS platform. A minimal usage sketch follows the method list below.
#### Methods
- `authenticate() -> bool`: Authenticate with the platform
- `start_competition() -> bool`: Start the competition
- `get_domains() -> List[str]`: Get available domains
- `get_next_task(domain: str) -> Optional[Task]`: Get next task for domain
- `submit_response(domain: str, task_number: int, response: Any) -> bool`: Submit task response
- `load_dataset(domain_name: str) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]`: Load dataset
- `get_status() -> Dict`: Get competition status
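Putting these methods together, a minimal end-to-end session might look like the following sketch. The empty response payload is a placeholder only; real responses are task-specific.

```python
from agentds import BenchmarkClient

client = BenchmarkClient()

# authenticate() and start_competition() both report success as booleans.
if client.authenticate() and client.start_competition():
    print("Status:", client.get_status())

    for domain in client.get_domains():
        task = client.get_next_task(domain)
        if task is None:
            continue
        # Placeholder payload; build a real response from the task data.
        submitted = client.submit_response(domain, task.task_number, {})
        print(f"{domain} task {task.task_number} submitted: {submitted}")
```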
### Task
Represents a benchmark task. A short example follows the method list below.
#### Properties
- `task_number: int`: Task number within domain
- `domain: str`: Domain name
- `category: str`: Task category
#### Methods
- `get_data() -> Any`: Get task data
- `get_instructions() -> str`: Get task instructions
- `get_side_info() -> Any`: Get additional information
- `validate_response(response: Any) -> bool`: Validate response format
- `load_dataset() -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]`: Load associated dataset
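A short Task-focused sketch using the methods above. The domain name and response payload are illustrative only; consult the task instructions for the expected response format.

```python
from agentds import BenchmarkClient

client = BenchmarkClient()
task = client.get_next_task("machine-learning")

if task:
    print(task.get_instructions())
    side_info = task.get_side_info()                     # extra context, if any
    train_df, test_df, sample_df = task.load_dataset()   # dataset tied to this task

    # Placeholder payload; validate before submitting.
    response = {"predictions": []}
    if task.validate_response(response):
        client.submit_response(task.domain, task.task_number, response)
```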
## Examples
### Complete Agent Example
```python
from agentds import BenchmarkClient
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
def intelligent_agent():
    client = BenchmarkClient()
    client.start_competition()

    domains = client.get_domains()

    for domain in domains:
        # Load dataset
        train_df, test_df, sample_df = client.load_dataset(domain)

        # Get task
        task = client.get_next_task(domain)
        if not task:
            continue

        # Prepare features (example)
        X = train_df.drop(['target'], axis=1)
        y = train_df['target']

        # Train model
        model = RandomForestClassifier()
        model.fit(X, y)

        # Make predictions
        predictions = model.predict(test_df)

        # Format response
        response = {
            "predictions": predictions.tolist(),
            "model": "RandomForestClassifier",
            "confidence": float(model.score(X, y))
        }

        # Submit
        if task.validate_response(response):
            client.submit_response(domain, task.task_number, response)

if __name__ == "__main__":
    intelligent_agent()
```
### Batch Processing
```python
from agentds import BenchmarkClient
def process_all_domains():
    client = BenchmarkClient()
    client.start_competition()

    domains = client.get_domains()
    results = {}

    for domain in domains:
        domain_results = []

        while True:
            task = client.get_next_task(domain)
            if not task:
                break

            # Process task
            response = process_task(task)
            success = client.submit_response(domain, task.task_number, response)
            domain_results.append(success)

        results[domain] = domain_results

    return results

def process_task(task):
    # Your task processing logic
    return {"result": "processed"}
```
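Building on `process_all_domains()` above, a small helper can summarize the returned results dictionary, for example per-domain acceptance rates. `summarize_results` is a hypothetical helper sketched here, not part of the client.

```python
def summarize_results(results):
    """Print per-domain submission success rates from process_all_domains()."""
    for domain, outcomes in results.items():
        if not outcomes:
            print(f"{domain}: no tasks processed")
            continue
        rate = sum(outcomes) / len(outcomes)   # True counts as 1, False as 0
        print(f"{domain}: {len(outcomes)} tasks, {rate:.0%} accepted")

summarize_results(process_all_domains())
```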
## Error Handling
```python
from agentds import BenchmarkClient
from agentds.exceptions import AuthenticationError, APIError
try:
    client = BenchmarkClient(api_key="invalid-key", team_name="test")
    client.authenticate()
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except APIError as e:
    print(f"API error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
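For transient failures you may want to retry submissions. The following sketch wraps `submit_response` with a simple exponential backoff on `APIError`; the attempt count and delays are arbitrary, and `submit_with_retry` is a hypothetical helper, not part of the package.

```python
import time
from agentds.exceptions import APIError

def submit_with_retry(client, domain, task_number, response, attempts=3, delay=2.0):
    """Retry a submission on APIError; re-raise after the final attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return client.submit_response(domain, task_number, response)
        except APIError as exc:
            if attempt == attempts:
                raise
            print(f"API error (attempt {attempt}/{attempts}): {exc}; retrying in {delay}s")
            time.sleep(delay)
            delay *= 2  # simple exponential backoff
```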
## Development
### Setup Development Environment
```bash
git clone https://github.com/agentds/agentds-bench.git
cd agentds-bench/agentds_pkg
pip install -e .[dev]
```
### Running Tests
```bash
pytest
```
### Code Formatting
```bash
black src/
flake8 src/
mypy src/
```
## Contributing
We welcome contributions! Please see our [Contributing Guide](https://github.com/agentds/agentds-bench/blob/main/CONTRIBUTING.md) for details.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Support
- **Documentation**: [https://agentds.org/docs](https://agentds.org/docs)
- **Issues**: [GitHub Issues](https://github.com/agentds/agentds-bench/issues)
- **Email**: contact@agentds.org
## Changelog
See [CHANGELOG.md](https://github.com/agentds/agentds-bench/blob/main/CHANGELOG.md) for version history.