synthegrator


- Name: synthegrator
- Version: 0.13.2.2
- Summary: Framework for code synthesis and AI4SE research
- Upload time: 2025-07-18 19:11:18
- Author: David Gros, Claudio Spiess
- Requires Python: >=3.10
- Keywords: code synthesis, llm
# Synthegrator

Synthegrator is a framework for code generation problems. It simplifies
the process of loading common datasets and solving them with language models.

# Installation
```bash
pip install synthegrator
```

To execute generated code, you will also need to [install Docker](https://docs.docker.com/engine/install/).
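
Before starting a long evaluation run, it can help to confirm that Docker is installed and its daemon is reachable. The following is a minimal sketch, not part of Synthegrator: the helper name `docker_is_available` is hypothetical, and it relies only on the standard `docker info` command.

```python
import shutil
import subprocess


def docker_is_available(timeout_seconds: float = 10.0) -> bool:
    """Return True if the docker CLI is on PATH and the daemon responds.

    Hypothetical helper for illustration; not part of Synthegrator.
    """
    if shutil.which("docker") is None:
        return False  # CLI is not installed or not on PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI could not be run, or the daemon did not answer in time
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker available:", docker_is_available())
```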


# Example
The example below runs a solver over the HumanEval dataset,
which contains 164 function synthesis problems.

```python
# Imports
from lmwrapper.openai_wrapper import get_open_ai_lm, OpenAiModelNames
from synthegrator.code_solver import LmCodeSolverAutoRegressive
from synthegrator.execution_threading import solve_and_evaluate_problems
from synthegrator.synthdatasets.human_eval import yield_human_eval
from synthegrator.df_converters import solution_evals_to_df

# Load problems from one of the available AI4SE datasets
problems = list(yield_human_eval())

# Create a solver that can solve a problem
lm = get_open_ai_lm(OpenAiModelNames.gpt_3_5_turbo_instruct)
#    ^ Make sure to add your API key to OPENAI_API_KEY or a file. 
#    See https://github.com/DaiseyCode/lmwrapper for more.
solver = LmCodeSolverAutoRegressive(lm)

# Generate code and execute each problem's test cases
evals = list(solve_and_evaluate_problems(
    solver=solver,
    problems=problems,
    max_threads_eval=4,
))
# Convert to a dataframe
df = solution_evals_to_df(
    evals, 
    pickle_gzip_whole_solution_eval=True
)
print("Fraction Passing", df.main_metric__is_success.mean())
```
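
The last line of the example reports the fraction of problems whose main metric is a success. As a sketch of what that mean computes, the plain-Python snippet below uses made-up rows in place of the real dataframe. Only the column name `main_metric__is_success` comes from the example above; the rows and the `fraction_passing` helper are hypothetical.

```python
def fraction_passing(rows: list[dict]) -> float:
    """Fraction of rows whose main metric is marked successful."""
    if not rows:
        raise ValueError("no evaluation rows to aggregate")
    passed = sum(1 for row in rows if row["main_metric__is_success"])
    return passed / len(rows)


# Made-up stand-in rows, not real HumanEval results.
toy_rows = [
    {"main_metric__is_success": True},
    {"main_metric__is_success": False},
    {"main_metric__is_success": True},
    {"main_metric__is_success": True},
]
print("Fraction Passing", fraction_passing(toy_rows))  # prints: Fraction Passing 0.75
```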

# Architecture
## Guiding Design Requirements
- DR-1 **Support Diverse Datasets and Tasks.** We want an architecture that can
support diverse tasks (including potentially complex, repository-level tasks).
- DR-2 **Consistent & Efficient Execution.** Experiments often involve running LLM-generated code. We want this to be fast, efficient, and reasonably secure.
- DR-3 **Adaptable to State-of-the-Art Models.** This includes models like those from OpenAI or on HuggingFace. It should also be adaptable to models
that might do complex retrieval or reasoning.
- DR-4 **Maintainable.** We try to follow best practices around automated testing and continuous integration.
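
One way to read DR-1 and DR-3 together is that datasets and models should meet only at a narrow interface. The sketch below illustrates that idea with `typing.Protocol`. Every name in it (`Problem`, `Solver`, `EchoSolver`, `solve_all`) is hypothetical and chosen for illustration; it does not reflect Synthegrator's actual classes.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Problem:
    """Hypothetical problem record: an id plus a prompt to complete."""
    problem_id: str
    prompt: str


class Solver(Protocol):
    """Anything that maps a problem to candidate code can act as a solver."""

    def solve(self, problem: Problem) -> str: ...


class EchoSolver:
    """Stand-in solver that returns a fixed stub instead of calling a model."""

    def solve(self, problem: Problem) -> str:
        return problem.prompt + "    pass\n"


def solve_all(solver: Solver, problems: Iterable[Problem]) -> dict[str, str]:
    """Run one solver over any iterable of problems, keyed by problem id."""
    return {p.problem_id: solver.solve(p) for p in problems}


problems = [Problem("toy/0", "def add(a, b):\n")]
solutions = solve_all(EchoSolver(), problems)
print(solutions["toy/0"])
```

Because `solve_all` depends only on the `solve` method, a model-backed solver or a retrieval-based one could be swapped in without touching the dataset side.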

## Diagram
![Synthegrator architecture diagram](https://rb2xb7.s3.amazonaws.com/synthegrator.png)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "synthegrator",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "code synthesis, llm",
    "author": "David Gros, Claudio Spiess",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/8f/be/fe2ac23c7d299032937ac9910143319ec19ae127c7c7cb229f7bdb6e324b/synthegrator-0.13.2.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Framework for code synthesis and AI4SE research",
    "version": "0.13.2.2",
    "project_urls": {
        "Homepage": "https://github.com/DaiseyCode/synthegrator"
    },
    "split_keywords": [
        "code synthesis",
        " llm"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b9848029d0e4141b5354ac3a2290fdb311a69444eb842166b6d27445bc4b3275",
                "md5": "0dad77960470920561bc6ef30d7ac8bd",
                "sha256": "5f8db07e3aa11aec50828d6bb76c3d7d3ca0db6cf10b41bfbb12f168287c37b7"
            },
            "downloads": -1,
            "filename": "synthegrator-0.13.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0dad77960470920561bc6ef30d7ac8bd",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 3250763,
            "upload_time": "2025-07-18T19:11:15",
            "upload_time_iso_8601": "2025-07-18T19:11:15.578079Z",
            "url": "https://files.pythonhosted.org/packages/b9/84/8029d0e4141b5354ac3a2290fdb311a69444eb842166b6d27445bc4b3275/synthegrator-0.13.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8fbefe2ac23c7d299032937ac9910143319ec19ae127c7c7cb229f7bdb6e324b",
                "md5": "619facf4beda05e549ca0e0f9e23f1fe",
                "sha256": "796c33691b910f79308bddb35a099e136ed7322e17b24baaacb4231cb2dea8d3"
            },
            "downloads": -1,
            "filename": "synthegrator-0.13.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "619facf4beda05e549ca0e0f9e23f1fe",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 3212061,
            "upload_time": "2025-07-18T19:11:18",
            "upload_time_iso_8601": "2025-07-18T19:11:18.977482Z",
            "url": "https://files.pythonhosted.org/packages/8f/be/fe2ac23c7d299032937ac9910143319ec19ae127c7c7cb229f7bdb6e324b/synthegrator-0.13.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-18 19:11:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "DaiseyCode",
    "github_project": "synthegrator",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "synthegrator"
}
        