# Exponent-ML
**Prompt + Dataset → Trained + Deployed ML Models in One Line**
Exponent-ML is a CLI tool that lets anyone create, train, and deploy machine learning models by describing their task and uploading a dataset. The tool uses LLMs to generate runnable training pipelines based on both user intent and real dataset structure, with optional deployment to GitHub or cloud platforms.
## 🚀 Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/3d3n-ops/exponent-0.1.git
cd exponent-0.1
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your API keys
```
### Environment Setup
Create a `.env` file with your API keys:
```bash
# Required
ANTHROPIC_API_KEY=your_anthropic_api_key
# Authentication (OAuth)
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
# Optional (for cloud features)
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_REGION=us-east-1
S3_BUCKET=your_s3_bucket_name
MODAL_TOKEN_ID=your_modal_token_id
MODAL_TOKEN_SECRET=your_modal_token_secret
GITHUB_TOKEN=your_github_token
```
**Note**: `ANTHROPIC_API_KEY` is the only key needed for code generation. The OAuth credentials are needed to log in, and the remaining variables are used only by the optional cloud features.
### Authentication Setup
Before using Exponent-ML, you need to set up OAuth authentication:
#### Option 1: Use the setup script
```bash
# Set up Google OAuth
python scripts/setup_oauth.py google
# Set up GitHub OAuth
python scripts/setup_oauth.py github
# Check your configuration
python scripts/setup_oauth.py check
```
#### Option 2: Manual setup
1. **Google OAuth**: Go to [Google Cloud Console](https://console.developers.google.com/)
   - Create a new project
   - Enable Google+ API
   - Create OAuth 2.0 credentials
   - Set redirect URI to `http://localhost:8080`
2. **GitHub OAuth**: Go to [GitHub OAuth Apps](https://github.com/settings/developers)
   - Create a new OAuth App
   - Set callback URL to `http://localhost:8080`
3. Add credentials to your `.env` file:
```bash
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
```
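To make the role of the client ID and the `http://localhost:8080` redirect URI concrete, here is a minimal sketch of how an authorization URL for either provider can be assembled. The endpoints are the providers' public authorization URLs; the function name `build_auth_url` and the example scopes are assumptions for illustration, not Exponent-ML's own code.

```python
from urllib.parse import urlencode

# Public authorization endpoints published by Google and GitHub.
AUTH_ENDPOINTS = {
    "google": "https://accounts.google.com/o/oauth2/v2/auth",
    "github": "https://github.com/login/oauth/authorize",
}

# Must match the redirect/callback URI registered with the provider.
REDIRECT_URI = "http://localhost:8080"


def build_auth_url(provider: str, client_id: str, scope: str, state: str) -> str:
    """Return the URL a browser opens to start the OAuth flow (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # random value echoed back by the provider to guard against CSRF
    }
    if provider == "google":
        params["response_type"] = "code"  # required by Google's endpoint
    return f"{AUTH_ENDPOINTS[provider]}?{urlencode(params)}"


if __name__ == "__main__":
    import secrets

    # Example scopes are assumptions; the scopes Exponent-ML requests are not documented here.
    print(build_auth_url("google", "your_google_client_id", "openid email", secrets.token_urlsafe(16)))
    print(build_auth_url("github", "your_github_client_id", "read:user", secrets.token_urlsafe(16)))
```

If the redirect URI in the URL differs from the one registered in the provider's console, the provider rejects the request, which is why both must be set to `http://localhost:8080`.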
### Basic Usage
```bash
# Login with OAuth
exponent login --provider google
exponent login --provider github
# Check authentication status
exponent status
# Interactive wizard
exponent init
# Quick start with task and dataset
exponent init quick "Predict email spam" --dataset spam.csv
# Train a model
exponent train
# Deploy to GitHub
exponent deploy
```
## 📋 Commands
### `exponent init`
Initialize a new ML project with the interactive wizard:
```bash
exponent init
```
**Options:**
- `--task, -t`: ML task description
- `--dataset, -d`: Path to dataset file
- `--interactive, -i`: Run in interactive mode (default: True)
**Subcommands:**
- `exponent init quick <task>`: Quick initialization without prompts
### `exponent upload-dataset`
Analyze datasets and optionally upload to S3 for cloud training:
```bash
exponent upload-dataset spam.csv
```
**Options:**
- `--project-id, -p`: Project ID for organization
- `--upload, -u`: Upload to S3 for cloud training
**Subcommands:**
- `exponent upload-dataset analyze <file>`: Analyze dataset without uploading
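As a rough idea of what a dataset analysis step can produce, the sketch below summarizes a CSV by column: an inferred type, the number of missing values, and the row count. It uses only the standard library; the function names and the output shape are assumptions, and the actual analysis performed by `exponent upload-dataset analyze` may differ.

```python
import csv
import io
from typing import Dict, List


def infer_type(values: List[str]) -> str:
    """Classify a column as 'int', 'float', 'text', or 'empty' from its non-empty values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for type_name, caster in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue  # at least one value did not parse; try the next, wider type
    return "text"


def summarize_csv(text: str) -> Dict[str, Dict[str, object]]:
    """Return per-column type, missing-value count, and row count for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    columns: Dict[str, List[str]] = {name: [] for name in reader.fieldnames or []}
    for row in reader:
        for name in columns:
            columns[name].append(row.get(name) or "")
    return {
        name: {
            "type": infer_type(vals),
            "missing": sum(1 for v in vals if not v.strip()),
            "rows": len(vals),
        }
        for name, vals in columns.items()
    }


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = "text,length,score,label\nWin a prize now,15,0.9,1\nMeeting at noon,15,,0\n"
    for column, info in summarize_csv(sample).items():
        print(column, info)
```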
### `exponent train`
Train ML models locally or in the cloud:
```bash
exponent train
```
**Options:**
- `--project-id, -p`: Project ID to train
- `--dataset, -d`: Path to dataset file
- `--task, -t`: Task description
- `--cloud, -c`: Use cloud training (Modal)
- `--s3`: Upload dataset to S3 for cloud training
**Subcommands:**
- `exponent train status <job_id>`: Check training job status
- `exponent train list`: List all training jobs
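If you want to drive training from a Python script, the options above can be assembled into an argument list. This is a small sketch that uses only the flags documented in this section; the helper name `build_train_command` is hypothetical, and it assumes `--cloud` and `--s3` are boolean switches.

```python
import shlex
from typing import List, Optional


def build_train_command(
    project_id: Optional[str] = None,
    dataset: Optional[str] = None,
    task: Optional[str] = None,
    cloud: bool = False,
    s3: bool = False,
) -> List[str]:
    """Build an argument list for `exponent train` from the documented options."""
    cmd = ["exponent", "train"]
    if project_id:
        cmd += ["--project-id", project_id]
    if dataset:
        cmd += ["--dataset", dataset]
    if task:
        cmd += ["--task", task]
    if cloud:
        cmd.append("--cloud")  # assumed to be a boolean switch
    if s3:
        cmd.append("--s3")  # assumed to be a boolean switch
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(dataset="spam.csv", task="Predict email spam", cloud=True)
    # Print a shell-safe version of the command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with task descriptions that contain spaces.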
### `exponent deploy`
Deploy projects to GitHub:
```bash
exponent deploy
```
**Options:**
- `--project-id, -p`: Project ID to deploy
- `--name, -n`: GitHub repository name
- `--path`: Path to project directory
**Subcommands:**
- `exponent deploy list`: List GitHub repositories
## 🧠 How It Works
1. **Task Description**: You describe your ML task in natural language
2. **Dataset Analysis**: The tool analyzes your dataset structure (columns, types, etc.)
3. **Code Generation**: An LLM generates production-ready Python code based on your task and dataset
4. **Dynamic Training**: The generated code is executed locally or, with `--cloud`, on Modal's cloud infrastructure
5. **Deployment**: Projects can be deployed to GitHub with automated workflows
**No hard-coded templates** - every model is generated specifically for your task and dataset!
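Steps 1 to 3 can be pictured as combining the task description with the dataset's schema into a single prompt for the LLM. The sketch below shows one plausible way to build such a prompt; the wording, the `build_prompt` name, and the schema format are assumptions, not the prompt Exponent-ML actually sends.

```python
from typing import Dict


def build_prompt(task: str, schema: Dict[str, str], n_rows: int) -> str:
    """Combine a task description and a column schema into an LLM prompt (illustrative)."""
    column_lines = "\n".join(f"- {name}: {dtype}" for name, dtype in schema.items())
    return (
        "Generate a Python training pipeline for the task below.\n"
        f"Task: {task}\n"
        f"Dataset: {n_rows} rows with these columns:\n"
        f"{column_lines}\n"
        "Return model.py, train.py, and predict.py."
    )


if __name__ == "__main__":
    # Hypothetical schema for a spam dataset.
    schema = {"text": "text", "label": "int"}
    print(build_prompt("Predict email spam", schema, n_rows=5000))
```

Because the column names and types come from the real dataset, the generated code can reference them directly instead of relying on a fixed template.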
## 📁 Generated Project Structure
```
~/.exponent/<project-id>/
├── model.py # Model definition and training pipeline
├── train.py # Training script with data loading
├── predict.py # Script for making predictions with the trained model
├── requirements.txt # Python dependencies
└── README.md # Project documentation
```
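A quick way to confirm that a generated project is complete is to check its directory against the layout above. This is a minimal sketch; the helper name `missing_files` is hypothetical, and the demo runs against a temporary directory rather than a real `~/.exponent/<project-id>/` folder.

```python
import tempfile
from pathlib import Path
from typing import List

# File names taken from the generated project layout shown above.
EXPECTED_FILES = ("model.py", "train.py", "predict.py", "requirements.txt", "README.md")


def missing_files(directory: Path) -> List[str]:
    """Return the expected project files that are absent from `directory`."""
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    # Demo against a throwaway directory containing a single placeholder file.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "model.py").write_text("# placeholder\n")
        print(missing_files(project))
```

For a real project, call `missing_files(Path.home() / ".exponent" / project_id)` with your own project ID.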
## 🔧 Configuration
### Required Environment Variables
- `ANTHROPIC_API_KEY`: Your Anthropic API key for code generation
### Optional Environment Variables
- `AWS_ACCESS_KEY_ID`: AWS access key for S3 uploads
- `AWS_SECRET_ACCESS_KEY`: AWS secret key for S3 uploads
- `AWS_REGION`: AWS region (default: us-east-1)
- `S3_BUCKET`: S3 bucket name for dataset storage
- `MODAL_TOKEN_ID`: Modal token for cloud training
- `MODAL_TOKEN_SECRET`: Modal token secret for cloud training
- `GITHUB_TOKEN`: GitHub token for repository creation
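Before running a command, it can help to see which variables are missing and which feature each one affects. The sketch below groups the variables listed above by feature. The grouping and the `check_env` helper are assumptions for illustration, and `AWS_REGION` is left out because it has a default.

```python
import os
from typing import Dict, List, Mapping

REQUIRED = ["ANTHROPIC_API_KEY"]

# Optional variables grouped by the feature they enable (grouping is an assumption).
OPTIONAL_BY_FEATURE = {
    "S3 uploads": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "S3_BUCKET"],
    "Modal cloud training": ["MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"],
    "GitHub deployment": ["GITHUB_TOKEN"],
}


def check_env(env: Mapping[str, str] = os.environ) -> Dict[str, List[str]]:
    """Return missing variable names, keyed by 'required' or by feature name."""
    report = {"required": [name for name in REQUIRED if not env.get(name)]}
    for feature, names in OPTIONAL_BY_FEATURE.items():
        report[feature] = [name for name in names if not env.get(name)]
    return report


if __name__ == "__main__":
    for group, missing in check_env().items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{group}: {status}")
```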
## 🛠️ Development
### Project Structure
```
exponent-ml/
├── exponent/
│ ├── cli/ # CLI interface (Typer)
│ │ └── commands/ # CLI commands
│ ├── core/ # Core logic
│ │ ├── code_gen.py # LLM code generation
│ │ ├── config.py # Configuration management
│ │ ├── s3_utils.py # S3 dataset handling
│ │ ├── modal_runner.py # Modal cloud training
│ │ └── github_utils.py # GitHub deployment
│ └── main.py # CLI entry point
├── requirements.txt # Python dependencies
├── pyproject.toml # Project configuration
└── README.md # This file
```
### Local Development
```bash
# Install in development mode
pip install -e .
# Run tests
pytest
# Format code
black exponent/
isort exponent/
# Lint code
flake8 exponent/
```
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request
## 📄 License
MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- [Anthropic](https://anthropic.com/) for Claude API
- [Modal](https://modal.com/) for cloud infrastructure
- [Typer](https://typer.tiangolo.com/) for CLI framework
- [scikit-learn](https://scikit-learn.org/) for ML algorithms
Raw data
{
"_id": null,
"home_page": null,
"name": "exponent-ml",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Eden Etuk <edens.etuk@gmail.com>",
"keywords": "machine-learning, cli, ai, ml, automation, modal, anthropic",
"author": null,
"author_email": "Eden Etuk <edens.etuk@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/b3/83/857df681e4d7ac97b9947cd0e0cd9ae1140137bd9efb31bc34ef6c642d8e/exponent_ml-0.1.1.tar.gz",
"platform": null,
"description": "# Exponent-ML\n\n**Prompt + Dataset \u2192 Trained + Deployed ML Models in One Line**\n\nExponent-ML is a CLI tool that lets anyone create, train, and deploy machine learning models by describing their task and uploading a dataset. The tool uses LLMs to generate runnable training pipelines based on both user intent and real dataset structure, with optional deployment to GitHub or cloud platforms.\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/yourusername/exponent-ml.git\ncd exponent-ml\n\n# Install dependencies\npip install -r requirements.txt\n\n# Set up environment variables\ncp .env.example .env\n# Edit .env with your API keys\n```\n\n### Environment Setup\n\nCreate a `.env` file with your API keys:\n\n```bash\n# Required\nANTHROPIC_API_KEY=your_anthropic_api_key\n\n# Authentication (OAuth)\nGOOGLE_CLIENT_ID=your_google_client_id\nGOOGLE_CLIENT_SECRET=your_google_client_secret\nGITHUB_CLIENT_ID=your_github_client_id\nGITHUB_CLIENT_SECRET=your_github_client_secret\n\n# Optional (for cloud features)\nAWS_ACCESS_KEY_ID=your_aws_access_key\nAWS_SECRET_ACCESS_KEY=your_aws_secret_key\nAWS_REGION=us-east-1\nS3_BUCKET=your_s3_bucket_name\nMODAL_TOKEN_ID=your_modal_token_id\nMODAL_TOKEN_SECRET=your_modal_token_secret\nGITHUB_TOKEN=your_github_token\n```\n\n**Note**: Only `ANTHROPIC_API_KEY` is required. OAuth credentials are needed for authentication.\n\n### Authentication Setup\n\nBefore using Exponent-ML, you need to set up OAuth authentication:\n\n#### Option 1: Use the setup script\n```bash\n# Set up Google OAuth\npython scripts/setup_oauth.py google\n\n# Set up GitHub OAuth \npython scripts/setup_oauth.py github\n\n# Check your configuration\npython scripts/setup_oauth.py check\n```\n\n#### Option 2: Manual setup\n1. 
**Google OAuth**: Go to [Google Cloud Console](https://console.developers.google.com/)\n - Create a new project\n - Enable Google+ API\n - Create OAuth 2.0 credentials\n - Set redirect URI to `http://localhost:8080`\n\n2. **GitHub OAuth**: Go to [GitHub OAuth Apps](https://github.com/settings/developers)\n - Create a new OAuth App\n - Set callback URL to `http://localhost:8080`\n\n3. Add credentials to your `.env` file:\n```bash\nGOOGLE_CLIENT_ID=your_google_client_id\nGOOGLE_CLIENT_SECRET=your_google_client_secret\nGITHUB_CLIENT_ID=your_github_client_id\nGITHUB_CLIENT_SECRET=your_github_client_secret\n```\n\n### Basic Usage\n\n```bash\n# Login with OAuth\nexponent login --provider google\nexponent login --provider github\n\n# Check authentication status\nexponent status\n\n# Interactive wizard\nexponent init\n\n# Quick start with task and dataset\nexponent init quick \"Predict email spam\" --dataset spam.csv\n\n# Train a model\nexponent train\n\n# Deploy to GitHub\nexponent deploy\n```\n\n## \ud83d\udccb Commands\n\n### `exponent init`\n\nInitialize a new ML project with interactive wizard:\n\n```bash\nexponent init\n```\n\n**Options:**\n- `--task, -t`: ML task description\n- `--dataset, -d`: Path to dataset file\n- `--interactive, -i`: Run in interactive mode (default: True)\n\n**Subcommands:**\n- `exponent init quick <task>`: Quick initialization without prompts\n\n### `exponent upload-dataset`\n\nAnalyze datasets and optionally upload to S3 for cloud training:\n\n```bash\nexponent upload-dataset spam.csv\n```\n\n**Options:**\n- `--project-id, -p`: Project ID for organization\n- `--upload, -u`: Upload to S3 for cloud training\n\n**Subcommands:**\n- `exponent upload-dataset analyze <file>`: Analyze dataset without uploading\n\n### `exponent train`\n\nTrain ML models locally or in the cloud:\n\n```bash\nexponent train\n```\n\n**Options:**\n- `--project-id, -p`: Project ID to train\n- `--dataset, -d`: Path to dataset file\n- `--task, -t`: Task description\n- 
`--cloud, -c`: Use cloud training (Modal)\n- `--s3`: Upload dataset to S3 for cloud training\n\n**Subcommands:**\n- `exponent train status <job_id>`: Check training job status\n- `exponent train list`: List all training jobs\n\n### `exponent deploy`\n\nDeploy projects to GitHub:\n\n```bash\nexponent deploy\n```\n\n**Options:**\n- `--project-id, -p`: Project ID to deploy\n- `--name, -n`: GitHub repository name\n- `--path`: Path to project directory\n\n**Subcommands:**\n- `exponent deploy list`: List GitHub repositories\n\n## \ud83e\udde0 How It Works\n\n1. **Task Description**: You describe your ML task in natural language\n2. **Dataset Analysis**: The tool analyzes your dataset structure (columns, types, etc.)\n3. **Code Generation**: An LLM generates production-ready Python code based on your task and dataset\n4. **Dynamic Training**: The generated code is executed in Modal's cloud infrastructure\n5. **Deployment**: Projects can be deployed to GitHub with automated workflows\n\n**No hard-coded templates** - every model is generated specifically for your task and dataset!\n\n## \ud83d\udcc1 Generated Project Structure\n\n```\n~/.exponent/<project-id>/\n\u251c\u2500\u2500 model.py # Model definition and training pipeline\n\u251c\u2500\u2500 train.py # Training script with data loading\n\u251c\u2500\u2500 predict.py # Prediction script for making predictions\n\u251c\u2500\u2500 requirements.txt # Python dependencies\n\u2514\u2500\u2500 README.md # Project documentation\n```\n\n## \ud83d\udd27 Configuration\n\n### Required Environment Variables\n\n- `ANTHROPIC_API_KEY`: Your Anthropic API key for code generation\n\n### Optional Environment Variables\n\n- `AWS_ACCESS_KEY_ID`: AWS access key for S3 uploads\n- `AWS_SECRET_ACCESS_KEY`: AWS secret key for S3 uploads\n- `AWS_REGION`: AWS region (default: us-east-1)\n- `S3_BUCKET`: S3 bucket name for dataset storage\n- `MODAL_TOKEN_ID`: Modal token for cloud training\n- `MODAL_TOKEN_SECRET`: Modal token secret for cloud 
training\n- `GITHUB_TOKEN`: GitHub token for repository creation\n\n## \ud83d\udee0\ufe0f Development\n\n### Project Structure\n\n```\nexponent-ml/\n\u251c\u2500\u2500 exponent/\n\u2502 \u251c\u2500\u2500 cli/ # CLI interface (Typer)\n\u2502 \u2502 \u2514\u2500\u2500 commands/ # CLI commands\n\u2502 \u251c\u2500\u2500 core/ # Core logic\n\u2502 \u2502 \u251c\u2500\u2500 code_gen.py # LLM code generation\n\u2502 \u2502 \u251c\u2500\u2500 config.py # Configuration management\n\u2502 \u2502 \u251c\u2500\u2500 s3_utils.py # S3 dataset handling\n\u2502 \u2502 \u251c\u2500\u2500 modal_runner.py # Modal cloud training\n\u2502 \u2502 \u2514\u2500\u2500 github_utils.py # GitHub deployment\n\u2502 \u2514\u2500\u2500 main.py # CLI entry point\n\u251c\u2500\u2500 requirements.txt # Python dependencies\n\u251c\u2500\u2500 pyproject.toml # Project configuration\n\u2514\u2500\u2500 README.md # This file\n```\n\n### Local Development\n\n```bash\n# Install in development mode\npip install -e .\n\n# Run tests\npytest\n\n# Format code\nblack exponent/\nisort exponent/\n\n# Lint code\nflake8 exponent/\n```\n\n## \ud83e\udd1d Contributing\n\n1. Fork the repository\n2. Create a feature branch\n3. Make your changes\n4. Add tests\n5. Submit a pull request\n\n## \ud83d\udcc4 License\n\nMIT License - see LICENSE file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- [Anthropic](https://anthropic.com/) for Claude API\n- [Modal](https://modal.com/) for cloud infrastructure\n- [Typer](https://typer.tiangolo.com/) for CLI framework\n- [scikit-learn](https://scikit-learn.org/) for ML algorithms \n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Claude Code for ML developers. Build and deploy ML models from your terminal. Accelerate your ML pipeline with Exponent.",
"version": "0.1.1",
"project_urls": {
"Documentation": "https://exponent-ml.vercel.app/docs",
"Homepage": "https://github.com/3d3n-ops/exponent-0.1",
"Issues": "https://github.com/3d3n-ops/exponent-0.1/issues",
"Repository": "https://github.com/3d3n-ops/exponent-0.1"
},
"split_keywords": [
"machine-learning",
" cli",
" ai",
" ml",
" automation",
" modal",
" anthropic"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "f17e8af90057ad6eb833f2c71980d0282e0a5f0fc9ea50926f74f80372d439f0",
"md5": "f81a9565430ac9d93f78025cfd94b397",
"sha256": "87b8eff9db32208fef1dd7fcd9232b9bc21923e7e178d350e5a95e2706f3c5b7"
},
"downloads": -1,
"filename": "exponent_ml-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f81a9565430ac9d93f78025cfd94b397",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 40940,
"upload_time": "2025-07-16T05:17:53",
"upload_time_iso_8601": "2025-07-16T05:17:53.207428Z",
"url": "https://files.pythonhosted.org/packages/f1/7e/8af90057ad6eb833f2c71980d0282e0a5f0fc9ea50926f74f80372d439f0/exponent_ml-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b383857df681e4d7ac97b9947cd0e0cd9ae1140137bd9efb31bc34ef6c642d8e",
"md5": "c8a8f593a594759eccc00a6a918e4ae2",
"sha256": "92f240fd44c12d05a5c0432100227119b4a8aa500efe07b80f4baddcccd93575"
},
"downloads": -1,
"filename": "exponent_ml-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "c8a8f593a594759eccc00a6a918e4ae2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 37438,
"upload_time": "2025-07-16T05:17:54",
"upload_time_iso_8601": "2025-07-16T05:17:54.821096Z",
"url": "https://files.pythonhosted.org/packages/b3/83/857df681e4d7ac97b9947cd0e0cd9ae1140137bd9efb31bc34ef6c642d8e/exponent_ml-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-16 05:17:54",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "3d3n-ops",
"github_project": "exponent-0.1",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "typer",
"specs": [
[
">=",
"0.9.0"
]
]
},
{
"name": "anthropic",
"specs": [
[
">=",
"0.7.0"
]
]
},
{
"name": "boto3",
"specs": [
[
">=",
"1.26.0"
]
]
},
{
"name": "pandas",
"specs": [
[
">=",
"1.5.0"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.21.0"
]
]
},
{
"name": "modal",
"specs": [
[
">=",
"0.55.0"
]
]
},
{
"name": "PyGithub",
"specs": [
[
">=",
"1.59.0"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "inquirer",
"specs": [
[
">=",
"3.1.0"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
">=",
"1.3.0"
]
]
},
{
"name": "matplotlib",
"specs": [
[
">=",
"3.5.0"
]
]
},
{
"name": "seaborn",
"specs": [
[
">=",
"0.11.0"
]
]
}
],
"lcname": "exponent-ml"
}