langevaluate


Name: langevaluate
Version: 0.2.10
Home page: None
Summary: LLM-based automatic evaluation system
Upload time: 2025-09-19 05:14:03
Maintainer: None
Docs URL: None
Author: None
Requires Python: ==3.10.*
License: MIT
Keywords: llm, nlp, benchmarks, evaluation, langchain
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.

# LangEvaluate

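LangEvaluate is a Python library for evaluating the performance of large language models (LLMs). It provides a range of evaluation metrics and dataset management features so that LLM performance can be analyzed systematically.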

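## Key Features

- **Support for multiple LLMs**
  - OpenAI (GPT-4, GPT-3.5)
  - Anthropic (Claude)
  - Naver (Clova)
  - DeepSeek
  - Local GPU models

- **Multiple evaluation types**
  - Multiple-choice questions (MCQ)
  - Binary-choice questions
  - Open-ended questions
  - Multi-turn conversations

- **Dataset management**
  - Hugging Face dataset integration
  - Custom dataset support
  - Dataset conversion and preprocessing

- **Evaluation metrics**
  - Accuracy
  - BLEU and ROUGE scores
  - LLM-based evaluation
  - User-defined metrics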
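
As an illustration of the simplest metric in the list above, accuracy on multiple-choice or binary questions can be computed as the share of predictions that match the reference answers. The following is a minimal standalone sketch in plain Python. It does not use the LangEvaluate API, and the function name `accuracy` and the sample answers are assumptions made for this example only.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the reference answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty dataset")
    # Normalize case and surrounding whitespace so " b" and "B" count as the same choice.
    correct = sum(
        p.strip().upper() == r.strip().upper()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical model outputs for four multiple-choice questions.
model_answers = ["A", "c", "B ", "D"]
gold_answers = ["A", "C", "D", "D"]
score = accuracy(model_answers, gold_answers)
print(f"accuracy = {score:.2f}")  # 3 of the 4 answers match
```

Normalizing case and whitespace before comparing is a design choice for this sketch; a stricter evaluation might require exact string equality.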

## Installation

This library depends on sglang. To install it, first install the dependencies listed in requirements.txt.
If you are not on Linux, also run `pip install sglang`.

```bash
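# Install the dependencies listed in requirements.txt
pip install -r requirements.txt
# Install langevaluate in editable mode from the current directory
pip install -e .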
```
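
The package metadata declares `requires_python` as `==3.10.*`, so pip will refuse to install the package on other interpreter versions. A quick way to check the active interpreter before installing could look like the sketch below. The helper name `is_supported_python` is hypothetical and is not part of LangEvaluate.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter satisfies the ==3.10.* constraint."""
    # Only the major and minor components matter for a 3.10.* specifier.
    return tuple(version_info[:2]) == (3, 10)


current = ".".join(str(part) for part in sys.version_info[:3])
if is_supported_python():
    print(f"Python {current} satisfies ==3.10.*")
else:
    print(f"Python {current} does not satisfy ==3.10.*; use a 3.10 environment")
```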

## License

This project is licensed under the MIT License.

## todo

- Make it possible to run multiple metrics at once with evaluate
- Add benchmark datasets and write the supporting code
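
The first item is not implemented yet, so the following is only a rough sketch of what running several metrics in one call might look like. Every name here (`run_metrics`, `exact_match`, `unigram_precision`) is hypothetical and none of them is taken from the LangEvaluate code base.

```python
from typing import Callable, Dict, Sequence

# A metric takes predictions and references and returns a single score.
Metric = Callable[[Sequence[str], Sequence[str]], float]


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Share of predictions that are identical to their reference."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def unigram_precision(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Average share of predicted tokens that also occur in the reference."""
    scores = []
    for p, r in zip(predictions, references):
        pred_tokens = p.split()
        ref_tokens = set(r.split())
        if not pred_tokens:
            scores.append(0.0)
            continue
        scores.append(sum(t in ref_tokens for t in pred_tokens) / len(pred_tokens))
    return sum(scores) / len(scores)


def run_metrics(
    predictions: Sequence[str],
    references: Sequence[str],
    metrics: Dict[str, Metric],
) -> Dict[str, float]:
    """Apply every metric to the same data and collect the scores by name."""
    return {name: fn(predictions, references) for name, fn in metrics.items()}


results = run_metrics(
    ["the cat sat", "hello world"],
    ["the cat sat", "hello there"],
    {"exact_match": exact_match, "unigram_precision": unigram_precision},
)
print(results)
```

Passing the metrics as a name-to-function mapping keeps user-defined metrics on the same footing as built-in ones, which is one possible way to approach this item.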

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "langevaluate",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "==3.10.*",
    "maintainer_email": null,
    "keywords": "LLM, NLP, benchmarks, evaluation, langchain",
    "author": null,
    "author_email": "JIN PARK <nwirandx@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/bb/73/10aafe41fe8f8eb485994aa2d47d7e3a176a52b97bda0e79c00cc3143cfe/langevaluate-0.2.10.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "LLM \uae30\ubc18\uc758 \uc790\ub3d9 \ud3c9\uac00 \uc2dc\uc2a4\ud15c",
    "version": "0.2.10",
    "project_urls": {
        "Bug Tracker": "https://github.com/JINAILAB/langmetrics/issues",
        "Homepage": "https://github.com/JINAILAB/langmetrics"
    },
    "split_keywords": [
        "llm",
        " nlp",
        " benchmarks",
        " evaluation",
        " langchain"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6e12510710fda11676e5f9f4006c8a2f3500980178ee5f025952ce35a6e5002d",
                "md5": "1fd6b1ad4c9b27cdbdf173734f2a6afe",
                "sha256": "861226d0cde0c793c102bba6d363e09f190d1be471588de8c0f1feca159035c8"
            },
            "downloads": -1,
            "filename": "langevaluate-0.2.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1fd6b1ad4c9b27cdbdf173734f2a6afe",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "==3.10.*",
            "size": 102627,
            "upload_time": "2025-09-19T05:13:47",
            "upload_time_iso_8601": "2025-09-19T05:13:47.970420Z",
            "url": "https://files.pythonhosted.org/packages/6e/12/510710fda11676e5f9f4006c8a2f3500980178ee5f025952ce35a6e5002d/langevaluate-0.2.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bb7310aafe41fe8f8eb485994aa2d47d7e3a176a52b97bda0e79c00cc3143cfe",
                "md5": "eccc711358f59c7efcf53fd067f7abec",
                "sha256": "596408544dbe997239354f91d18ed50f56efe4d7cb30df153ac3e7da1cb89092"
            },
            "downloads": -1,
            "filename": "langevaluate-0.2.10.tar.gz",
            "has_sig": false,
            "md5_digest": "eccc711358f59c7efcf53fd067f7abec",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "==3.10.*",
            "size": 1514337,
            "upload_time": "2025-09-19T05:14:03",
            "upload_time_iso_8601": "2025-09-19T05:14:03.585562Z",
            "url": "https://files.pythonhosted.org/packages/bb/73/10aafe41fe8f8eb485994aa2d47d7e3a176a52b97bda0e79c00cc3143cfe/langevaluate-0.2.10.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-19 05:14:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "JINAILAB",
    "github_project": "langmetrics",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "langevaluate"
}
        