intent-service

- Name: intent-service
- Version: 0.1.10
- Summary: Intent classification service
- Author: Eliot Zubkoff <eliot.i.zubkoff@gmail.com>
- Repository: https://github.com/eliotdoesprogramming/intent-service
- License: MIT
- Requires Python: >=3.10
- Keywords: intent, classification, nlp, machine-learning
- Requirements: fastapi>=0.68.0, uvicorn>=0.15.0, pydantic>=1.8.0, typer>=0.9.0, rich>=13.7.0, jinja2>=3.1.2
- Upload time: 2024-12-02 20:39:23

# Intent Service: Simplifying Fine-Tuning of Encoder Models for Classification

intent-service is a tool designed to streamline the fine-tuning of encoder-based models, such as BERT, for classification tasks. Specifically, the project focuses on simplifying the training of models for intent classification, a critical task in natural language processing (NLP) applications such as chatbots, virtual assistants, and other conversational AI systems.

## Background
Encoder models like BERT (Bidirectional Encoder Representations from Transformers) have revolutionized the way we process and understand language. These models are pre-trained on vast amounts of text data and can be fine-tuned to perform a wide range of downstream tasks with minimal effort. One of the most common applications of these models is intent classification—the task of determining the user's intent based on their input text.

Intent classification plays a central role in conversational AI systems, such as Google Assistant, Siri, Alexa, and countless custom chatbot solutions. By understanding the user's intent (e.g., "set an alarm," "get the weather," "play music"), these systems can trigger appropriate actions or provide relevant responses.

However, fine-tuning these models for intent classification can be challenging: it requires a well-organized approach to dataset preparation, hyperparameter tuning, and model optimization. intent-service aims to simplify this process, making it easier for developers to deploy high-performance intent classification models in their applications.


## Installation

This project uses [uv](https://github.com/astral-sh/uv) for dependency management. To get started:

1. Install uv:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

2. Clone the repository:

```bash
git clone https://github.com/eliotdoesprogramming/intent-service.git
cd intent-service
```

3. Create a virtual environment and install dependencies:

```bash
uv venv
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows

uv pip install -r requirements.txt
```

## Development Setup

### Code Quality Tools

We use several tools to maintain code quality:

- **Ruff**: For fast Python linting and formatting
- **Pytest**: For unit testing

Install development dependencies:

```bash
uv pip install -r requirements-dev.txt
```

### Running Code Quality Checks

```bash
# Run linting
ruff check .

# Run tests
pytest
```
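
Ruff also handles formatting. Assuming a recent Ruff release that includes the `format` subcommand, you can apply it across the project:

```bash
# Auto-format the codebase with Ruff
ruff format .
```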

### Pre-commit Hooks

We use pre-commit hooks to ensure code quality before commits. To set up:

```bash
pre-commit install
```
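
Once installed, the hooks run automatically on `git commit`. To run them against the entire codebase on demand:

```bash
pre-commit run --all-files
```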

## API Usage

The service provides a REST API for intent processing. Here are the main endpoints:

### Model Management

#### Get Model Information
```bash
GET /model/{model_id}
```
Retrieves detailed information about a specific model. The `model_id` can be either a registered model name or an MLflow run ID.
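
For example, with the service running locally (port 8000 is an assumption; adjust to your `PORT` setting, and substitute your own model name or run ID):

```bash
# Fetch metadata for a registered model by name
curl http://localhost:8000/model/intent-classifier-v1
```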

#### Search Models
```bash
POST /model/search
```
Search for registered models based on various criteria:
```json
{
  "tags": {"version": "1.0.0"},
  "intents": ["greeting", "farewell"],
  "name_contains": "bert",
  "limit": 10
}
```
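
A minimal invocation with curl, assuming the service is reachable at `http://localhost:8000`:

```bash
curl -X POST http://localhost:8000/model/search \
    -H "Content-Type: application/json" \
    -d '{"name_contains": "bert", "limit": 10}'
```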

#### Register Model
```bash
POST /model/register
```
Register an existing MLflow run as a named model:
```json
{
  "mlflow_run_id": "run_123",
  "name": "intent-classifier-v1",
  "description": "Intent classifier using DistilBERT",
  "tags": {
    "version": "1.0.0",
    "author": "team"
  }
}
```

### Training

#### Train New Model
```bash
POST /model/train
```
Train a new intent classification model:
```json
{
  "intents": ["greeting", "farewell", "help"],
  "dataset_source": {
    "source_type": "url",
    "url": "https://example.com/dataset.csv"
  },
  "model_name": "distilbert-base-uncased",
  "experiment_name": "intent-training",
  "training_config": {
    "num_epochs": 5,
    "batch_size": 32,
    "learning_rate": 5e-5
  }
}
```
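
For example, save the body above to a file (`train_request.json` is just an illustrative name) and submit it to a locally running service:

```bash
# Kick off training using the JSON payload stored in train_request.json
curl -X POST http://localhost:8000/model/train \
    -H "Content-Type: application/json" \
    -d @train_request.json
```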

### Prediction

#### Generate Predictions
```bash
POST /model/{model_id}/predict?text=Hello%20there
```
Generate intent predictions for input text. Returns confidence scores for each intent:
```json
{
  "greeting": 0.85,
  "farewell": 0.10,
  "help": 0.05
}
```
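
For example, using curl against a local instance (URL-encode the query text; the base URL and model name are placeholders):

```bash
curl -X POST "http://localhost:8000/model/intent-classifier-v1/predict?text=Hello%20there"
```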

### API Documentation

Full API documentation is available at `/docs` when running the service. This provides an interactive Swagger UI where you can:
- View detailed endpoint specifications
- Try out API calls directly
- See request/response schemas
- Access example payloads
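
Because the service is built on FastAPI, the raw OpenAPI schema is typically also available at FastAPI's default path alongside the Swagger UI:

```bash
# Download the OpenAPI schema for use with client generators or API tooling
curl http://localhost:8000/openapi.json -o openapi.json
```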

## Development Workflow

1. Create a new branch for your feature/fix:

   ```bash
   git checkout -b feature/your-feature-name
   ```

2. Make your changes and ensure all tests pass:

   ```bash
   pytest
   ```

3. Run code quality checks:

   ```bash
   ruff check .
   mypy .
   ```

4. Commit your changes:

   ```bash
   git commit -m "feat: add your feature description"
   ```

5. Push your changes and create a pull request:
   ```bash
   git push origin feature/your-feature-name
   ```

## Environment Variables

Create a `.env` file in the root directory based on the provided `.env.example`. Here are the available configuration options:

### Application Settings
```env
DEBUG=True
LOG_LEVEL=INFO
API_KEY=your_api_key_here
ENVIRONMENT=dev  # Options: dev, prod
VSCODE_DEBUGGER=False
```

### Server Settings
```env
HOST=0.0.0.0
PORT=8000
```

### MLflow Settings
```env
MLFLOW_TRACKING_URI=http://localhost:5000
MLFLOW_TRACKING_USERNAME=mlflow
MLFLOW_TRACKING_PASSWORD=mlflow123
MLFLOW_S3_ENDPOINT_URL=http://localhost:9000  # For MinIO/S3 artifact storage
MLFLOW_ARTIFACT_ROOT=s3://mlflow/artifacts
AWS_ACCESS_KEY_ID=minioadmin          # For MinIO/S3 access
AWS_SECRET_ACCESS_KEY=minioadmin123   # For MinIO/S3 access
MLFLOW_EXPERIMENT_NAME=intent-service  # Default experiment name
```
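
These values assume an MLflow tracking server is already running at `MLFLOW_TRACKING_URI`. If you do not have one, a minimal local server can be started with the standard MLflow CLI (the SQLite backend below is chosen purely for illustration):

```bash
# Start a local MLflow tracking server matching the settings above
mlflow server \
    --backend-store-uri sqlite:///mlflow.db \
    --default-artifact-root s3://mlflow/artifacts \
    --host 0.0.0.0 \
    --port 5000
```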

### Model Settings
```env
DEFAULT_MODEL_NAME=distilbert-base-uncased
MAX_SEQUENCE_LENGTH=128
BATCH_SIZE=32
```

To get started:
```bash
cp .env.example .env
```
Then edit the `.env` file with your specific configuration values.

## Running the Service

Development mode:

```bash
uvicorn app.main:app --reload
```

Production mode:

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```
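
For production deployments you will usually want more than one worker process; uvicorn supports this via `--workers` (4 below mirrors the CLI example later in this README and is otherwise arbitrary):

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
```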

## CLI Usage

The service provides a command-line interface for model management and server operations:

### Starting the Server

```bash
# Development mode (auto-reload enabled)
intent-cli serve

# Production mode
intent-cli serve --environment prod --workers 4

# Custom configuration
intent-cli serve --port 9000 --host 127.0.0.1
```

### Model Management

Train a new model:

```bash
intent-cli train \
    --dataset-path data/training.csv \
    --experiment-name "my-experiment" \
    --num-epochs 5
```

Register a trained model:

```bash
intent-cli register \
    <run_id> \
    "my-model-name" \
    --description "Description of the model" \
    --tags '{"version": "1.0.0", "author": "team"}'
```

Search for models:

```bash
intent-cli search \
    --name-contains "bert" \
    --tags '{"version": "1.0.0"}' \
    --intents "greeting,farewell"
```

Get model information:

```bash
intent-cli info <model_id>
```

Make predictions:

```bash
intent-cli predict <model_id> "your text here"
```

### CLI Options

Each command supports various options. Use the `--help` flag to see detailed documentation:

```bash
intent-cli --help  # Show all commands
intent-cli serve --help  # Show options for serve command
intent-cli train --help  # Show options for train command
```

## Contributing

1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request

## License

This project is licensed under the MIT License - see the LICENSE file for details.