opendatalabs-query-sdk

Name	opendatalabs-query-sdk
Version	0.1.1
home_page	None
Summary	A Python SDK for interacting with the Query API service
upload_time	2025-02-21 00:09:38
maintainer	None
docs_url	None
author	None
requires_python	>=3.7
license	None
keywords	api query sdk
requirements	No requirements were recorded.

# Query API SDK

A Python SDK for interacting with Vana's Query API service. It lets you query and transform your AI-generated data, including posts, tweets, and other content produced by your AI models.

## Features

- 🐍 Full Python type hints
- 🔒 Built-in authentication
- 🔄 Async-style API (submit a query, then poll or wait for results)
- 📊 Data transformation support
- 🔔 Webhook integration
- 📝 Comprehensive typing
- ⏱️ Polling utilities for long-running queries

## Installation

```bash
pip install opendatalabs-query-sdk
# or
poetry add opendatalabs-query-sdk
```

## Quick Start

```python
from query_api_sdk import create_client, QueryClientConfig

client = create_client(QueryClientConfig(
    api_key='your-api-key',
    base_url='https://api.vana.ai/query'
))

# Submit a query and wait for its results
def get_my_posts():
    try:
        query_id = client.submit_query({
            "query": "SELECT * FROM reddit_posts WHERE file_owner = 'my_user_id'"
        })

        # Wait for the query to finish, then print whatever it returned
        results = client.wait_for_results(query_id)
        print(results)
    except Exception as error:
        print(f"Error: {error}")


get_my_posts()
```

## Available Data Schemas

The Query API provides access to various Vana-generated content types:

### Reddit Posts
```sql
reddit_posts {
  file_owner: string   -- User ID of the post owner
  post_id: integer     -- Unique identifier for the post
  title: string        -- Post title
  content: string      -- Post content
}
```

### Twitter Tweets
```sql
twitter_tweets {
  file_owner: string   -- User ID of the tweet owner
  tweet_id: integer    -- Unique identifier for the tweet
  text: string         -- Tweet content
}
```

## Detailed Usage

### Configuration

```python
from query_api_sdk import create_client, QueryClientConfig

client = create_client(QueryClientConfig(
    api_key='your-api-key',
    base_url='https://api.vana.ai/query',
    timeout=30000  # Optional: default is 10000ms
))
```

### Getting Available Schemas

```python
schemas = client.get_schemas()
print(schemas)
```

### Submitting Queries

Basic query:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM reddit_posts LIMIT 10"
})
```

With data transformation:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM reddit_posts",
    "transform": """
    def transform(rows):
        return [{**row, "word_count": len(row["content"].split())} for row in rows]
    """
})
```
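
Because the transform travels as a source string, it can help to run it locally on a few sample rows before submitting it. The sketch below assumes the service calls a function named `transform` with a list of row dictionaries, as the example above suggests. `load_transform` is a hypothetical helper, not part of the SDK, and `textwrap.dedent` is used only because the README does not say whether the service strips the indentation of the triple-quoted string.

```python
import textwrap

# Transform source as it would be sent in the "transform" field,
# dedented so that `def` starts in column 0.
TRANSFORM_SOURCE = textwrap.dedent("""
    def transform(rows):
        return [{**row, "word_count": len(row["content"].split())} for row in rows]
""")


def load_transform(source):
    """Compile the source locally and return its `transform` function (hypothetical helper)."""
    namespace = {}
    exec(source, namespace)  # only run source you wrote yourself
    return namespace["transform"]


# Sample rows shaped like the reddit_posts schema.
sample_rows = [
    {"file_owner": "u1", "post_id": 1, "title": "Hello", "content": "first post here"},
    {"file_owner": "u1", "post_id": 2, "title": "Empty", "content": ""},
]

transformed = load_transform(TRANSFORM_SOURCE)(sample_rows)
print([row["word_count"] for row in transformed])  # prints [3, 0]
```

Once the local output looks right, the same `TRANSFORM_SOURCE` string can be passed as the `"transform"` value.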

With webhook notification:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM twitter_tweets",
    "webhook_url": "https://your-server.com/webhook"
})
```

### Checking Query Status

```python
status = client.get_query_status(query_id)
print(status["status"])  # 'queued' | 'processing' | 'completed' | 'failed'
```

### Getting Results

With pagination:
```python
results = client.get_query_results(
    query_id,
    limit=100,
    cursor="200"
)
```
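
To collect every page, the cursor has to be followed until the service stops returning one. The README does not document the shape of a results page, so the sketch below assumes a dictionary with `rows` and `next_cursor` keys; both names are hypothetical. The helper takes a fetch callable so that it runs without the service.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_all_rows(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield rows from every page, following the cursor until none is returned.

    Assumes each page looks like {"rows": [...], "next_cursor": "..." or None};
    these key names are hypothetical, not taken from the SDK.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("rows", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break


# Stand-in for the service: three rows served two at a time.
_FAKE_ROWS = [{"post_id": 1}, {"post_id": 2}, {"post_id": 3}]


def fake_fetch(cursor: Optional[str], limit: int = 2) -> Page:
    start = int(cursor) if cursor else 0
    end = start + limit
    return {
        "rows": _FAKE_ROWS[start:end],
        "next_cursor": str(end) if end < len(_FAKE_ROWS) else None,
    }


all_rows = list(iter_all_rows(fake_fetch))
print(all_rows)
```

With the SDK, `fetch_page` would wrap `client.get_query_results(query_id, limit=..., cursor=...)`, adapted to whatever the real response actually contains.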

Waiting for completion:
```python
results = client.wait_for_results(
    query_id,
    timeout=300000,      # Optional: max time to wait (default: 5 minutes)
    poll_interval=1000   # Optional: time between status checks (default: 1 second)
)
```
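
For readers who want to control polling themselves, the loop below sketches what a helper of this kind typically does. It is not the SDK's implementation. `poll_until_done` is a hypothetical name, and the millisecond units simply mirror the parameters shown above.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    get_status: Callable[[], Dict[str, str]],
    timeout_ms: int = 300_000,
    poll_interval_ms: int = 1_000,
) -> str:
    """Poll until the status is 'completed' or 'failed', or raise TimeoutError."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        status = get_status()["status"]
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query still '{status}' after {timeout_ms} ms")
        time.sleep(poll_interval_ms / 1000)


# Stand-in for client.get_query_status(query_id): completes on the third check.
_states = iter(["queued", "processing", "completed"])
final_status = poll_until_done(lambda: {"status": next(_states)}, poll_interval_ms=10)
print(final_status)  # prints completed
```

With a real client, `get_status` could be `lambda: client.get_query_status(query_id)`.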

## Common Query Examples

### Getting Recent Posts

```python
query_id = client.submit_query({
    "query": """
        SELECT *
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
        ORDER BY post_id DESC
        LIMIT 10
    """
})
```
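
The examples hard-code the owner ID. When it comes from a variable, it has to be quoted before being placed in the query text. The sketch below assumes the Query API accepts standard SQL string literals, where an embedded single quote is escaped by doubling it. The README does not confirm this, nor whether the API supports bound parameters. Both helper names are hypothetical.

```python
def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_recent_posts_query(owner_id: str, limit: int = 10) -> str:
    """Build the 'recent posts' query for one owner (hypothetical helper)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT * FROM reddit_posts "
        f"WHERE file_owner = {sql_string_literal(owner_id)} "
        f"ORDER BY post_id DESC LIMIT {int(limit)}"
    )


query = build_recent_posts_query("o'brien", limit=5)
print(query)
```

The resulting string can be passed as the `"query"` value to `client.submit_query`.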

### Analyzing Content Length

```python
query_id = client.submit_query({
    "query": """
        SELECT *
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
    """,
    "transform": """
    def transform(rows):
        return [{
            **row,
            "content_length": len(row["content"]),
            "word_count": len(row["content"].split())
        } for row in rows]
    """
})
```

### Combining Data Sources

```python
query_id = client.submit_query({
    "query": """
        SELECT 
            'reddit' as source,
            title as content,
            post_id as id
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
        UNION ALL
        SELECT 
            'twitter' as source,
            text as content,
            tweet_id as id
        FROM twitter_tweets
        WHERE file_owner = 'your_user_id'
    """
})
```

## Error Handling

The SDK uses a custom `QueryAPIError` class for error handling:

```python
from query_api_sdk import QueryAPIError

try:
    results = client.get_query_results("invalid-id")
except QueryAPIError as error:
    print(f"API Error: {str(error)}")
    print(f"Status Code: {error.status_code}")
    print(f"Response: {error.response}")
```

## Webhook Integration

When providing a webhook URL, your endpoint will receive POST requests with the following format:

```python
{
    "query_id": str,
    "status": str,  # 'completed' | 'failed'
    "error": Optional[str]
}
```

Example webhook handler (Flask):
```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    data = request.json
    query_id = data["query_id"]
    status = data["status"]
    error = data.get("error")
    
    if status == "completed":
        # Handle completion
        pass
    elif status == "failed":
        # Handle failure
        pass
    
    return "", 200
```
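
The handler above trusts whatever arrives in the request body. A minimal sketch of validating the payload against the format shown earlier follows. `parse_webhook_payload` is a hypothetical helper, written in plain Python so it can be called from any web framework.

```python
from typing import Any, Optional, Tuple

VALID_STATUSES = ("completed", "failed")


def parse_webhook_payload(data: Any) -> Tuple[str, str, Optional[str]]:
    """Return (query_id, status, error) or raise ValueError for a malformed payload."""
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    query_id = data.get("query_id")
    status = data.get("status")
    error = data.get("error")
    if not isinstance(query_id, str) or not query_id:
        raise ValueError("missing or invalid 'query_id'")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    if error is not None and not isinstance(error, str):
        raise ValueError("'error' must be a string when present")
    return query_id, status, error


parsed = parse_webhook_payload({"query_id": "abc", "status": "failed", "error": "boom"})
print(parsed)  # prints ('abc', 'failed', 'boom')
```

In the Flask handler, a `ValueError` raised here could be turned into a 400 response instead of a 200.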

## Rate Limiting

The Query API implements rate limiting to ensure fair usage. When a request is rate limited, the SDK raises a `QueryAPIError` carrying the response's status code and message.

## Type Hints Support

The SDK is written with full Python type hints and provides comprehensive type definitions for all features. You can import types directly:

```python
from query_api_sdk import (
    QueryStatusType,
    Schema,
    QueryRequest,
    QueryResults
)
```

## Development

For development, clone the repository and install dependencies:

```bash
git clone https://github.com/vana-com/query-sdk-python.git
cd query-sdk-python
pip install -e ".[dev]"
```

Run tests:
```bash
pytest
```

## License

MIT License - see [LICENSE](LICENSE) for details.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "opendatalabs-query-sdk",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "api, query, sdk",
    "author": null,
    "author_email": "OpenDataLabs <alex@opendatalabs.xyz>",
    "download_url": "https://files.pythonhosted.org/packages/c4/50/d8e8f9410177a226ed963e39d0590985ab09e9ff85ab2f94a6e1b4092a42/opendatalabs_query_sdk-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python SDK for interacting with the Query API service",
    "version": "0.1.1",
    "project_urls": {
        "Documentation": "https://github.com/vana-com/query-sdk-python#readme",
        "Homepage": "https://github.com/vana-com/query-sdk-python",
        "Issues": "https://github.com/vana-com/query-sdk-python/issues"
    },
    "split_keywords": [
        "api",
        " query",
        " sdk"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3a3c899523ab1bb689accd9598600f2368a8f9d4a33e75bea53ab9b5dd44ed8d",
                "md5": "c61836ccd5927c6a65921d329ddf9790",
                "sha256": "738b0d9ff860e821f16d8f7d06d8582ca4a8c55df8ed1282a6245e0dcac4bdf1"
            },
            "downloads": -1,
            "filename": "opendatalabs_query_sdk-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c61836ccd5927c6a65921d329ddf9790",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 5777,
            "upload_time": "2025-02-21T00:09:36",
            "upload_time_iso_8601": "2025-02-21T00:09:36.644942Z",
            "url": "https://files.pythonhosted.org/packages/3a/3c/899523ab1bb689accd9598600f2368a8f9d4a33e75bea53ab9b5dd44ed8d/opendatalabs_query_sdk-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c450d8e8f9410177a226ed963e39d0590985ab09e9ff85ab2f94a6e1b4092a42",
                "md5": "f5b0aa59ac4654d6586a4143cdb788e6",
                "sha256": "7cb395fe12073351e35eff0e399991b4ab210af4832a97be45f621d4cc3c95b7"
            },
            "downloads": -1,
            "filename": "opendatalabs_query_sdk-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "f5b0aa59ac4654d6586a4143cdb788e6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 4907,
            "upload_time": "2025-02-21T00:09:38",
            "upload_time_iso_8601": "2025-02-21T00:09:38.884022Z",
            "url": "https://files.pythonhosted.org/packages/c4/50/d8e8f9410177a226ed963e39d0590985ab09e9ff85ab2f94a6e1b4092a42/opendatalabs_query_sdk-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-21 00:09:38",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "vana-com",
    "github_project": "query-sdk-python#readme",
    "github_not_found": true,
    "lcname": "opendatalabs-query-sdk"
}
