Name | Version | Summary | Date |
sed-scores-eval | 0.0.3 | (Threshold-Independent) Evaluation of Sound Event Detection Scores | 2024-04-21 02:33:50 |
ragtime | 0.0.29 | Ragtime 🎹 is an LLMOps framework to automatically evaluate Retrieval Augmented Generation (RAG) systems and compare different RAGs / LLMs | 2024-04-20 23:32:33 |
ntqr | 0.3.2 | Tools for the logic of evaluation using unlabeled data | 2024-04-18 12:30:10 |
chainforge | 0.3.1.2 | A Visual Programming Environment for Prompt Engineering | 2024-04-17 23:21:07 |
llama-index-packs-llama-dataset-metadata | 0.1.4 | llama-index packs llama_dataset_metadata integration | 2024-04-08 19:39:22 |
enoslib | 9.2.0 | None | 2024-04-03 11:34:17 |
lighteval | 0.3.0 | A lightweight and configurable evaluation package | 2024-03-29 16:52:04 |
rag-eval | 0.1.3 | A RAG evaluation framework | 2024-03-19 17:16:27 |
nutcracker-py | 0.0.1a37 | streamline LLM evaluation | 2024-03-17 15:42:49 |
synthesized-datasets | 1.7 | Publically available datasets for benchmarking and evaluation. | 2024-03-13 14:54:22 |
unbabel-comet | 2.2.2 | High-quality Machine Translation Evaluation | 2024-03-13 11:27:34 |
easy-evaluator | 0.0.0 | A library for easy evaluation of language models | 2024-03-03 15:30:15 |
reseval | 0.1.6 | Reproducible Subjective Evaluation | 2024-03-03 05:32:16 |
coconut | 3.1.0 | Simple, elegant, Pythonic functional programming. | 2024-03-02 09:18:28 |
codebleu | 0.6.0 | Unofficial CodeBLEU implementation that supports Linux, MacOS and Windows available on PyPI. | 2024-03-01 16:17:22 |
tno.sdg.tabular.eval.utility-metrics | 0.3.0 | Utility metrics for tabular data | 2024-02-28 13:23:02 |
llama-index-packs-rag-evaluator | 0.1.3 | llama-index packs rag_evaluator integration | 2024-02-22 01:29:47 |
phasellm | 0.0.21 | Wrappers for common large language models (LLMs) with support for evaluation. | 2024-02-20 23:31:31 |
promptmodel | 0.1.18 | Prompt & model versioning on the cloud, built for developers. | 2024-02-15 06:35:40 |
mt-thresholds | 0.0.4 | Tool to check how metric deltas for machine translation reflect on system-level human accuracies. | 2024-02-12 20:40:24 |