| Name | Version | Summary | Date |
| --- | --- | --- | --- |
| ntqr | 0.4.2.3 | Tools for the logic of evaluation using unlabeled data | 2024-10-31 13:59:07 |
| DAindex | 0.1.0 | Deterioration Allocation Index Framework | 2024-09-12 17:06:46 |
| simple-smatch | 0.1.3.1 | Simple Smatch | 2024-08-10 21:39:47 |
| nutcracker-py | 0.0.2a2 | Streamline LLM evaluation | 2024-08-03 10:09:01 |
| semevalplatform | 0.0.11.post1 | Semantic Evaluation Platform | 2024-07-15 09:12:37 |
| ragtime | 0.0.43 | Ragtime 🎹 is an LLMOps framework to automatically evaluate Retrieval Augmented Generation (RAG) systems and compare different RAGs / LLMs | 2024-06-10 15:20:30 |
| indic-eval | 0.1.0 | A package to make LLM evaluation easier | 2024-06-01 07:07:20 |
| sed-scores-eval | 0.0.4 | (Threshold-Independent) Evaluation of Sound Event Detection Scores | 2024-05-23 19:52:56 |
| rag-eval | 0.1.3 | A RAG evaluation framework | 2024-03-19 17:16:27 |
| synthesized-datasets | 1.7 | Publicly available datasets for benchmarking and evaluation | 2024-03-13 14:54:22 |
| easy-evaluator | 0.0.0 | A library for easy evaluation of language models | 2024-03-03 15:30:15 |
| reseval | 0.1.6 | Reproducible Subjective Evaluation | 2024-03-03 05:32:16 |
| lighthouz | 0.0.5 | Lighthouz AI Python SDK | 2024-02-12 07:27:52 |
| v-stream | 0.1.2 | STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models | 2024-01-25 08:02:46 |
| pyEvalData | 1.6.0 | Python module to evaluate experimental data | 2023-12-27 21:03:34 |
| tiger-eval | 0.0.2 | Text Generation Evaluation Toolkit | 2023-12-19 08:45:39 |
| guardrails-ai-unbabel-comet | 2.2.1 | High-quality Machine Translation Evaluation | 2023-12-13 00:47:54 |
| simuleval | 1.1.4 | SimulEval: A Flexible Toolkit for Automated Machine Translation Evaluation | 2023-11-30 19:50:58 |
| ragstack-ai-langsmith | 0.0.1a1 | Client library to connect to the LangSmith LLM Tracing and Evaluation Platform | 2023-11-09 15:11:25 |
| metric-eval | 1.0.2 | A Python package for evaluating evaluation metrics | 2023-11-07 01:22:58 |