# together 1.5.20

- Summary: Python client for Together's Cloud Platform!
- Author: Together AI
- License: Apache-2.0
- Requires Python: <4.0,>=3.10
- Uploaded: 2025-07-10 18:19:36
<div align="center">
  <a href="https://www.together.ai/">
    <img alt="together.ai" height="100px" src="https://assets-global.website-files.com/64f6f2c0e3f4c5a91c1e823a/654693d569494912cfc0c0d4_favicon.svg">
  </a>
</div>

# Together Python API library

[![PyPI version](https://img.shields.io/pypi/v/together.svg)](https://pypi.org/project/together/)
[![Discord](https://dcbadge.vercel.app/api/server/9Rk6sSeWEG?style=flat&compact=true)](https://discord.com/invite/9Rk6sSeWEG)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/togethercompute.svg?style=social&label=Follow%20%40togethercompute)](https://twitter.com/togethercompute)

The [Together Python API Library](https://pypi.org/project/together/) is the official Python client for Together's API platform. It provides a convenient way to interact with the REST APIs and integrates with Python 3.10+ applications through easy-to-use synchronous and asynchronous clients.



## Installation

> 🚧
> The library was rewritten in v1.0.0, released in April 2024, and introduced significant changes.

To install the Together Python library from PyPI, run:

```shell
pip install --upgrade together
```

### Setting up API Key

> 🚧 You will need to create an account with [Together.ai](https://api.together.xyz/) to obtain a Together API Key.

Once logged in to the Together Playground, you can find available API keys in [this settings page](https://api.together.xyz/settings/api-keys).

#### Setting environment variable

```shell
export TOGETHER_API_KEY=xxxxx
```

#### Using the client

```python
from together import Together

client = Together(api_key="xxxxx")
```

This repo contains both a Python library and a CLI. We'll demonstrate how to use both below.

## Usage – Python Client

### Chat Completions

```python
from together import Together

client = Together()

# Simple text message
response = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "tell me about new york"}],
)
print(response.choices[0].message.content)

# Multi-modal message with text and image
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)

# Multi-modal message with multiple images
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Compare these two images."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
                }
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/slack.png"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)

# Multi-modal message with text and video
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's happening in this video?"
            },
            {
                "type": "video_url",
                "video_url": {
                    "url": "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerFun.mp4"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)
```

The chat completions API supports three types of content:
- Plain text messages using the `content` field directly
- Multi-modal messages with images using `type: "image_url"`
- Multi-modal messages with videos using `type: "video_url"`

When using multi-modal content, the `content` field becomes an array of content objects, each with its own type and corresponding data.
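
For example, a small helper can assemble that array. This is just a convenience sketch, not part of the library; the `user_message` helper is a hypothetical name:

```python
from together import Together

client = Together()

def user_message(text: str, *image_urls: str) -> dict:
    """Build a multi-modal user message: one text part plus any image parts."""
    parts = [{"type": "text", "text": text}]
    parts += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return {"role": "user", "content": parts}

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo",
    messages=[user_message(
        "What's in this image?",
        "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
    )],
)
print(response.choices[0].message.content)
```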

#### Streaming

```python
from together import Together

client = Together()
stream = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "tell me about new york"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

#### Async usage

```python
import asyncio
from together import AsyncTogether

async_client = AsyncTogether()
messages = [
    "What are the top things to do in San Francisco?",
    "What country is Paris in?",
]

async def async_chat_completion(messages):
    tasks = [
        async_client.chat.completions.create(
            model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
            messages=[{"role": "user", "content": message}],
        )
        for message in messages
    ]
    responses = await asyncio.gather(*tasks)

    for response in responses:
        print(response.choices[0].message.content)

asyncio.run(async_chat_completion(messages))
```

#### Fetching logprobs

Logprobs are the logarithms of token-level generation probabilities: they indicate how likely each generated token was, given the preceding tokens in the context. They let us estimate the model's confidence in its output, which can inform how to consume that output (e.g., rejecting low-confidence generations, retrying, or ensembling model outputs).

```python
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "tell me about new york"}],
    logprobs=1
)

response_logprobs = response.choices[0].logprobs

print(dict(zip(response_logprobs.tokens, response_logprobs.token_logprobs)))
# {'New': -2.384e-07, ' York': 0.0, ',': 0.0, ' also': -0.20703125, ' known': -0.20214844, ' as': -8.34465e-07, ... }
```

More details about using logprobs in Together's API can be found [here](https://docs.together.ai/docs/logprobs).
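
As a concrete use, one can collapse the token logprobs into a rough confidence score and reject low-confidence generations. This is an illustrative sketch, not an official recipe; the 0.5 threshold is arbitrary:

```python
import math

from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "tell me about new york"}],
    logprobs=1,
)

token_logprobs = response.choices[0].logprobs.token_logprobs

# Geometric-mean token probability: exp of the mean logprob, a 0-1 score
# where values near 1 indicate higher model confidence.
confidence = math.exp(sum(token_logprobs) / len(token_logprobs))

if confidence < 0.5:  # arbitrary threshold, for illustration only
    print(f"Low confidence ({confidence:.2f}); consider retrying.")
else:
    print(response.choices[0].message.content)
```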


### Completions

The completions API is for the code and language models listed [here](https://docs.together.ai/docs/inference-models). A code model example is shown below.

```python
from together import Together

client = Together()

response = client.completions.create(
    model="codellama/CodeLlama-34b-Python-hf",
    prompt="Write a Next.js component with TailwindCSS for a header component.",
    max_tokens=200,
)
print(response.choices[0].text)
```

#### Streaming

```python
from together import Together

client = Together()
stream = client.completions.create(
    model="codellama/CodeLlama-34b-Python-hf",
    prompt="Write a Next.js component with TailwindCSS for a header component.",
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

#### Async usage

```python
import asyncio
from together import AsyncTogether

async_client = AsyncTogether()
prompts = [
    "Write a Next.js component with TailwindCSS for a header component.",
    "Write a python function for the fibonacci sequence",
]

async def async_completion(prompts):
    tasks = [
        async_client.completions.create(
            model="codellama/CodeLlama-34b-Python-hf",
            prompt=prompt,
        )
        for prompt in prompts
    ]
    responses = await asyncio.gather(*tasks)

    for response in responses:
        print(response.choices[0].text)

asyncio.run(async_completion(prompts))
```

### Image generation

```python
from together import Together

client = Together()

response = client.images.generate(
    prompt="space robots",
    model="stabilityai/stable-diffusion-xl-base-1.0",
    steps=10,
    n=4,
)
print(response.data[0].b64_json)
```
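
The response carries base64-encoded image data, so a typical next step is decoding it to a file. A minimal sketch (the output filename is arbitrary):

```python
import base64

from together import Together

client = Together()

response = client.images.generate(
    prompt="space robots",
    model="stabilityai/stable-diffusion-xl-base-1.0",
    steps=10,
    n=1,
)

# Decode the base64 payload and write it to disk as a PNG.
with open("space_robots.png", "wb") as f:
    f.write(base64.b64decode(response.data[0].b64_json))
```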

### Embeddings

```python
from typing import List
from together import Together

client = Together()

def get_embeddings(texts: List[str], model: str) -> List[List[float]]:
    texts = [text.replace("\n", " ") for text in texts]
    outputs = client.embeddings.create(model=model, input=texts)
    return [outputs.data[i].embedding for i in range(len(texts))]

input_texts = ['Our solar system orbits the Milky Way galaxy at about 515,000 mph']
embeddings = get_embeddings(input_texts, model='togethercomputer/m2-bert-80M-8k-retrieval')

print(embeddings)
```
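
Embeddings are typically compared with cosine similarity. Here is a dependency-free sketch that reuses the `get_embeddings` helper defined above:

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

query = "Tell me about our galaxy."
docs = ["The Milky Way is a barred spiral galaxy.", "Paris is the capital of France."]

[query_emb] = get_embeddings([query], model="togethercomputer/m2-bert-80M-8k-retrieval")
doc_embs = get_embeddings(docs, model="togethercomputer/m2-bert-80M-8k-retrieval")

# Higher scores mean the document is semantically closer to the query.
for doc, emb in zip(docs, doc_embs):
    print(f"{cosine_similarity(query_emb, emb):.3f}  {doc}")
```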

### Reranking

```python
from typing import List
from together import Together

client = Together()

def get_reranked_documents(query: str, documents: List[str], model: str, top_n: int = 3) -> List[str]:
    outputs = client.rerank.create(model=model, query=query, documents=documents, top_n=top_n)
    # Sort results by relevance score (highest first) and return the documents in that order.
    ranked = sorted(outputs.results, key=lambda x: x.relevance_score, reverse=True)
    return [documents[result.index] for result in ranked]

query = "What is the capital of the United States?"
documents = ["New York","Washington, D.C.", "Los Angeles"]

reranked_documents = get_reranked_documents(query, documents, model='Salesforce/Llama-Rank-V1', top_n=1)

print(reranked_documents)
```

Read more about Reranking [here](https://docs.together.ai/docs/rerank-overview).
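
If you want the scores as well as the reordered documents, the `results` entries expose `index` and `relevance_score` directly:

```python
from together import Together

client = Together()

query = "What is the capital of the United States?"
documents = ["New York", "Washington, D.C.", "Los Angeles"]

outputs = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=documents,
    top_n=3,
)

# Each result pairs an index into `documents` with a relevance score.
for result in outputs.results:
    print(f"{result.relevance_score:.3f}  {documents[result.index]}")
```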

### Files

The files API is used for fine-tuning: it lets developers upload the data to fine-tune on, and it provides methods to list, retrieve, and delete files. Please refer to our fine-tuning docs [here](https://docs.together.ai/docs/fine-tuning-python).

```python
from together import Together

client = Together()

client.files.upload(file="somedata.jsonl") # uploads a file
client.files.list() # lists all uploaded files
client.files.retrieve(id="file-d0d318cb-b7d9-493a-bd70-1cfe089d3815") # retrieves a specific file
client.files.retrieve_content(id="file-d0d318cb-b7d9-493a-bd70-1cfe089d3815") # retrieves content of a specific file
client.files.delete(id="file-d0d318cb-b7d9-493a-bd70-1cfe089d3815") # deletes a file
```
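
For context, a training file like `somedata.jsonl` contains one JSON object per line. Below is a minimal sketch of writing one; the `{"text": ...}` record shape is one of the formats described in the fine-tuning docs linked above, which remain the authoritative reference:

```python
import json

# Hypothetical toy examples; see the fine-tuning docs for the full set of
# accepted record formats (e.g. chat-message records).
examples = [
    {"text": "Q: What is the capital of France?\nA: Paris."},
    {"text": "Q: What is the capital of Japan?\nA: Tokyo."},
]

with open("somedata.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```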

### Fine-tunes

The fine-tuning API lets developers create fine-tuning jobs, and it provides methods to list all jobs, retrieve statuses, and get checkpoints. Please refer to our fine-tuning docs [here](https://docs.together.ai/docs/fine-tuning-quickstart).

```python
from together import Together

client = Together()

client.fine_tuning.create(
    training_file="file-d0d318cb-b7d9-493a-bd70-1cfe089d3815",
    model="meta-llama/Llama-3.2-3B-Instruct",
    n_epochs=3,
    n_checkpoints=1,
    batch_size="max",
    learning_rate=1e-5,
    suffix="my-demo-finetune",
    wandb_api_key="1a2b3c4d5e.......",
)
client.fine_tuning.list()  # lists all fine-tuning jobs
client.fine_tuning.retrieve(id="ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b")  # retrieves a fine-tuning job
client.fine_tuning.cancel(id="ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b")  # cancels a fine-tuning job
client.fine_tuning.list_events(id="ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b")  # lists events of a fine-tuning job
client.fine_tuning.download(id="ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b")  # downloads the compressed fine-tuned model or a checkpoint to local disk
```
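
Fine-tuning jobs run asynchronously, so a simple polling loop around `retrieve` is a common pattern. In this sketch the `status` field and its terminal values are assumptions about the returned job object; check the fine-tuning docs for the authoritative states:

```python
import time

from together import Together

client = Together()

job_id = "ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b"

# Poll until the job reaches what looks like a terminal state. The status
# values matched here are assumptions, not an official list.
while True:
    job = client.fine_tuning.retrieve(id=job_id)
    status = str(job.status).lower()
    print(f"status: {status}")
    if any(s in status for s in ("completed", "error", "cancelled")):
        break
    time.sleep(30)  # be gentle on the API
```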

### Models

This lists all the models that Together supports.

```python
from together import Together

client = Together()

models = client.models.list()

for model in models:
    print(model)
```
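
The returned model objects can be filtered client-side. In this sketch the `id` and `type` attributes are assumptions about the model object's fields, hence the defensive `getattr`:

```python
from together import Together

client = Together()

# Keep only chat models; `type` and `id` are assumed attribute names.
chat_models = [m for m in client.models.list() if getattr(m, "type", None) == "chat"]

for model in chat_models:
    print(getattr(model, "id", model))
```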

## Usage – CLI

### Chat Completions

```bash
together chat.completions \
  --message "system" "You are a helpful assistant named Together" \
  --message "user" "What is your name?" \
  --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```

The Chat Completions CLI enables streaming tokens to stdout by default. To disable streaming, use `--no-stream`.

### Completions

```bash
together completions \
  "Large language models are " \
  --model meta-llama/Llama-4-Scout-17B-16E-Instruct \
  --max-tokens 512 \
  --stop "."
```

The Completions CLI enables streaming tokens to stdout by default. To disable streaming, use `--no-stream`.

### Image Generations

```bash
together images generate \
  "space robots" \
  --model stabilityai/stable-diffusion-xl-base-1.0 \
  --n 4
```

By default, the generated image opens in your default image viewer. To disable this, use `--no-show`.

### Files

```bash
# Help
together files --help

# Check file
together files check example.jsonl

# Upload file
together files upload example.jsonl

# List files
together files list

# Retrieve file metadata
together files retrieve file-6f50f9d1-5b95-416c-9040-0799b2b4b894

# Retrieve file content
together files retrieve-content file-6f50f9d1-5b95-416c-9040-0799b2b4b894

# Delete remote file
together files delete file-6f50f9d1-5b95-416c-9040-0799b2b4b894
```

### Fine-tuning

```bash
# Help
together fine-tuning --help

# Create fine-tune job
together fine-tuning create \
  --model togethercomputer/llama-2-7b-chat \
  --training-file file-711d8724-b3e3-4ae2-b516-94841958117d

# List fine-tune jobs
together fine-tuning list

# Retrieve fine-tune job details
together fine-tuning retrieve ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b

# List fine-tune job events
together fine-tuning list-events ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b

# Cancel running job
together fine-tuning cancel ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b

# Download fine-tuned model weights
together fine-tuning download ft-c66a5c18-1d6d-43c9-94bd-32d756425b4b
```

### Models

```bash
# Help
together models --help

# List models
together models list
```

## Contributing

Refer to the [Contributing Guide](CONTRIBUTING.md).


            
