distilabel

Name	distilabel
Version	1.5.2
home_page	None
Summary	Distilabel is an AI Feedback (AIF) framework for building datasets with and for LLMs.
upload_time	2025-01-22 10:49:11
maintainer	None
docs_url	None
author	None
requires_python	>=3.9
license	None
keywords	alignment annotation data llm rlaif synthetic

<div align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://github.com/argilla-io/distilabel/blob/main/docs/assets/distilabel-white.png?raw=true">
    <img alt="Distilabel Logo" src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-black.png">
  </picture>
</div>

<h3 align="center">Synthesize data for AI and add feedback on the fly!</h3>

<p align="center">
  <a  href="https://pypi.org/project/distilabel/">
    <img alt="PyPI version" src="https://img.shields.io/pypi/v/distilabel.svg?style=flat-round&logo=pypi&logoColor=white">
  </a>
  <a href="https://pepy.tech/project/distilabel">
    <img alt="PyPI downloads per month" src="https://static.pepy.tech/personalized-badge/distilabel?period=month&units=international_system&left_color=grey&right_color=blue&left_text=pypi%20downloads/month">
  </a>
</p>

<p align="center">
  <a href="https://twitter.com/argilla_io">
    <img src="https://img.shields.io/badge/twitter-black?logo=x"/>
  </a>
  <a href="https://www.linkedin.com/company/argilla-io">
    <img src="https://img.shields.io/badge/linkedin-blue?logo=linkedin"/>
  </a>
  <a href="http://hf.co/join/discord">
  <img src="https://img.shields.io/badge/Discord-7289DA?&logo=discord&logoColor=white"/>
  </a>
</p>


Distilabel is the framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

If you just want to get started, we recommend you check the [documentation](http://distilabel.argilla.io/). Curious, and want to know more? Keep reading!
<!-- ![overview](https://github.com/argilla-io/distilabel/assets/36760800/360110da-809d-4e24-a29b-1a1a8bc4f9b7)  -->

## Why use distilabel?

Distilabel generates synthetic data and AI feedback for a wide variety of projects, from traditional predictive NLP (classification, extraction, etc.) to generative and large language model scenarios (instruction following, dialogue generation, judging, etc.). Its programmatic approach lets you build scalable pipelines for data generation and AI feedback. The goal of distilabel is to accelerate your AI development by quickly generating high-quality, diverse datasets based on verified research methodologies for generating and judging with AI feedback.

### Improve your AI output quality through data quality

Compute is expensive and output quality is important. We help you **focus on data quality**, which tackles the root cause of both of these problems at once. Distilabel helps you to synthesize and judge data to let you spend your valuable time **achieving and keeping high-quality standards for your data**.

### Take control of your data and models

**Ownership of data for fine-tuning your own LLMs** is not easy, but Distilabel can help you get started. We integrate **AI feedback from any LLM provider out there** using one unified API.

### Improve efficiency by quickly iterating on the right research and LLMs

Synthesize and judge data using methods from the **latest research papers** while ensuring **flexibility, scalability and fault tolerance**, so you can focus on improving your data and training your models.

## Community

We are an open-source, community-driven project, and we love to hear from you. Here are some ways to get involved:

- [Community Meetup](https://lu.ma/embed-checkout/evt-IQtRiSuXZCIW6FB): listen in or present during one of our bi-weekly events.

- [Discord](http://hf.co/join/discord): get direct support from the community in #argilla-general and #argilla-help.

- [Roadmap](https://github.com/orgs/argilla-io/projects/10/views/1): plans change, but we love to discuss them with our community, so feel encouraged to participate.

## What do people build with Distilabel?

The Argilla community uses distilabel to create amazing [datasets](https://huggingface.co/datasets?other=distilabel) and [models](https://huggingface.co/models?other=distilabel).

- The [1M OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences) dataset contains ~1 million AI preferences derived from teknium/OpenHermes-2.5. It shows how Distilabel can be used to **synthesize data on an immense scale**.
- Our [distilabeled Intel Orca DPO dataset](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs) and the [improved OpenHermes model](https://huggingface.co/argilla/distilabeled-OpenHermes-2.5-Mistral-7B) show how we **improve model performance by filtering out 50%** of the original dataset through **AI feedback**.
- The [haiku DPO data](https://github.com/davanstrien/haiku-dpo) outlines how anyone can create a **dataset for a specific task** and use **the latest research papers** to improve the quality of the dataset.
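
The idea of filtering a preference dataset through AI feedback, mentioned in the second item, can be sketched in plain Python. This is only an illustration: the field names (`chosen_rating`, `rejected_rating`), the sample records and the margin threshold are assumptions made for this sketch, and it does not reproduce the criteria actually used to build the dataset linked above.

```python
from typing import Dict, List

# Hypothetical records: each pair carries an AI-assigned rating for the
# chosen and the rejected response. Field names are illustrative only.
pairs: List[Dict] = [
    {"prompt": "Explain DPO.", "chosen_rating": 9, "rejected_rating": 4},
    {"prompt": "Write a haiku.", "chosen_rating": 7, "rejected_rating": 7},
    {"prompt": "Sum 2 and 2.", "chosen_rating": 5, "rejected_rating": 8},
    {"prompt": "Define RLAIF.", "chosen_rating": 8, "rejected_rating": 6},
]


def filter_pairs(records: List[Dict], min_margin: int = 1) -> List[Dict]:
    """Keep pairs whose chosen response beats the rejected one by min_margin."""
    return [
        r for r in records
        if r["chosen_rating"] - r["rejected_rating"] >= min_margin
    ]


kept = filter_pairs(pairs)
print(f"kept {len(kept)} of {len(pairs)} pairs")
```

Ties and pairs where the judge prefers the "rejected" response are dropped, so the remaining data carries a clearer preference signal.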

## Installation

```sh
pip install distilabel --upgrade
```

Requires Python 3.9+.

In addition, the following extras are available:

### LLMs

- `anthropic`: for using models available in [Anthropic API](https://www.anthropic.com/api) via the `AnthropicLLM` integration.
- `cohere`: for using models available in [Cohere](https://cohere.ai/) via the `CohereLLM` integration.
- `argilla`: for exporting the generated datasets to [Argilla](https://argilla.io/).
- `groq`: for using models available in [Groq](https://groq.com/) using [`groq`](https://github.com/groq/groq-python) Python client via the `GroqLLM` integration.
- `hf-inference-endpoints`: for using the [Hugging Face Inference Endpoints](https://huggingface.co/inference-endpoints) via the `InferenceEndpointsLLM` integration.
- `hf-transformers`: for using models available in [transformers](https://github.com/huggingface/transformers) package via the `TransformersLLM` integration.
- `litellm`: for using [`LiteLLM`](https://github.com/BerriAI/litellm) to call any LLM using OpenAI format via the `LiteLLM` integration.
- `llama-cpp`: for using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) Python bindings for `llama.cpp` via the `LlamaCppLLM` integration.
- `mistralai`: for using models available in [Mistral AI API](https://mistral.ai/news/la-plateforme/) via the `MistralAILLM` integration.
- `ollama`: for using [Ollama](https://ollama.com/) and their available models via the `OllamaLLM` integration.
- `openai`: for using [OpenAI API](https://openai.com/blog/openai-api) models via the `OpenAILLM` integration, or the rest of the integrations based on OpenAI and relying on its client, such as `AnyscaleLLM`, `AzureOpenAILLM`, and `TogetherLLM`.
- `vertexai`: for using [Google Vertex AI](https://cloud.google.com/vertex-ai) proprietary models via the `VertexAILLM` integration.
- `vllm`: for using [vllm](https://github.com/vllm-project/vllm) serving engine via the `vLLM` integration.
- `sentence-transformers`: for generating sentence embeddings using [sentence-transformers](https://github.com/UKPLab/sentence-transformers).
- `mlx`: for using [MLX](https://github.com/ml-explore/mlx) models via the `MlxLLM` integration.

### Structured generation

- `outlines`: for using structured generation of LLMs with [outlines](https://github.com/outlines-dev/outlines).
- `instructor`: for using structured generation of LLMs with [Instructor](https://github.com/jxnl/instructor/).

### Data processing

- `ray`: for scaling and distributing a pipeline with [Ray](https://github.com/ray-project/ray).
- `faiss-cpu` and `faiss-gpu`: for indexing and searching sentence embeddings using [faiss](https://github.com/facebookresearch/faiss).
- `text-clustering`: for using text clustering with [UMAP](https://github.com/lmcinnes/umap) and [Scikit-learn](https://github.com/scikit-learn/scikit-learn).
- `minhash`: for using minhash for duplicate detection with [datasketch](https://github.com/datasketch/datasketch) and [nltk](https://github.com/nltk/nltk).
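
To give a feel for what MinHash-based duplicate detection does, here is a minimal, standard-library-only sketch. It does not use `datasketch` or `nltk`, and it is not how distilabel implements the feature; the helper names (`shingles`, `minhash_signature`, `estimated_jaccard`) and the choice of 64 hash functions are assumptions made for illustration.

```python
import hashlib


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}


def minhash_signature(items, num_perm=64):
    """For each of `num_perm` salted hash functions, keep the smallest hash value."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots approximates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = minhash_signature(shingles("the quick brown fox jumps over the lazy dog"))
b = minhash_signature(shingles("the quick brown fox jumps over the lazy dog"))
c = minhash_signature(shingles("synthetic data pipelines need careful deduplication steps"))

print(estimated_jaccard(a, b))  # identical texts produce identical signatures
print(estimated_jaccard(a, c))  # texts sharing no shingles should score near zero
```

Rows whose estimated similarity exceeds a chosen threshold can then be treated as near-duplicates; a production setup would typically add locality-sensitive hashing so that not every pair has to be compared.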

### Example

To run the following example you must install `distilabel` with the `hf-inference-endpoints` extra:

```sh
pip install "distilabel[hf-inference-endpoints]" --upgrade
```

Then run:

```python
from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration

with Pipeline(
    name="simple-text-generation-pipeline",
    description="A simple text generation pipeline",
) as pipeline:
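    # Load rows from a Hugging Face Hub dataset and rename the `prompt`
    # column to `instruction`, the input column `TextGeneration` expects.
    load_dataset = LoadDataFromHub(output_mappings={"prompt": "instruction"})

    # Generate a response for every instruction with a hosted model.
    text_generation = TextGeneration(
        llm=InferenceEndpointsLLM(
            model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
            tokenizer_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
        ),
    )

    # Connect the steps: the loader's output feeds the generation task.
    load_dataset >> text_generation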

if __name__ == "__main__":
    distiset = pipeline.run(
        parameters={
            load_dataset.name: {
                "repo_id": "distilabel-internal-testing/instruction-dataset-mini",
                "split": "test",
            },
            text_generation.name: {
                "llm": {
                    "generation_kwargs": {
                        "temperature": 0.7,
                        "max_new_tokens": 512,
                    }
                }
            },
        },
    )
    distiset.push_to_hub(repo_id="distilabel-example")
```
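
The `load_dataset >> text_generation` line in the example connects two steps. As a rough illustration of how such an operator can be built in plain Python, the sketch below overloads `__rshift__` on a hypothetical `Step` class. The class, its `downstream` attribute and the step names are invented for this sketch; this is not distilabel's implementation.

```python
class Step:
    """Hypothetical stand-in for a pipeline step; not distilabel's Step class."""

    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        # `a >> b` records b as a successor of a and returns b, so that
        # chains such as `a >> b >> c` connect left to right.
        self.downstream.append(other)
        return other


load = Step("load_dataset")
generate = Step("text_generation")
keep = Step("keep_columns")

# Build a three-step chain with the overloaded operator.
load >> generate >> keep

print([s.name for s in load.downstream])
print([s.name for s in generate.downstream])
```

Returning the right-hand operand is the design choice that makes chaining work: Python evaluates `load >> generate` first, and the result (`generate`) becomes the left-hand side of the next `>>`.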

## Badges

If you build something cool with `distilabel`, consider adding one of these badges to your dataset or model card.

    [<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

    [<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

## Contribute

To contribute directly to `distilabel`, check our [good first issues](https://github.com/argilla-io/distilabel/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) or [open a new one](https://github.com/argilla-io/distilabel/issues/new/choose).

## Citation

```bibtex
@misc{distilabel-argilla-2024,
  author = {Álvaro Bartolomé Del Canto and Gabriel Martín Blázquez and Agustín Piqueres Lajarín and Daniel Vila Suero},
  title = {Distilabel: An AI Feedback (AIF) framework for building datasets with and for LLMs},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/argilla-io/distilabel}}
}
```
