| Name | Version | Summary | Date |
|------|---------|---------|------|
| mt-thresholds | 0.0.4 | Tool to check how metric deltas for machine translation reflect on system-level human accuracies. | 2024-02-12 20:40:24 |
| lighthouz | 0.0.5 | Lighthouz AI Python SDK | 2024-02-12 07:27:52 |
| v-stream | 0.1.2 | STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models | 2024-01-25 08:02:46 |
| pyEvalData | 1.6.0 | Python module to evaluate experimental data | 2023-12-27 21:03:34 |
| tiger-eval | 0.0.2 | Text Generation Evaluation Toolkit | 2023-12-19 08:45:39 |
| tieval | 0.1.1 | A framework for evaluation and development of temporally aware models. | 2023-12-18 19:31:18 |
| promptbench | 0.0.2 | PromptBench is a powerful tool designed to scrutinize and analyze the interaction of large language models with various prompts. It provides a convenient infrastructure to simulate **black-box** adversarial **prompt attacks** on the models and evaluate their performance. | 2023-12-16 00:40:10 |
| guardrails-ai-unbabel-comet | 2.2.1 | High-quality Machine Translation Evaluation | 2023-12-13 00:47:54 |
| simuleval | 1.1.4 | SimulEval: A Flexible Toolkit for Automated Machine Translation Evaluation | 2023-11-30 19:50:58 |
| ranx | 0.3.19 | ranx: A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion | 2023-11-28 08:19:40 |
| ragstack-ai-langsmith | 0.0.1a1 | Client library to connect to the LangSmith LLM Tracing and Evaluation Platform. | 2023-11-09 15:11:25 |
| metric-eval | 1.0.2 | A Python package for evaluating evaluation metrics | 2023-11-07 01:22:58 |
| simplifiedbert | 0.0.3 | SimplifiedBert is a Python package that simplifies the training and evaluation process for BERT models. | 2023-10-19 15:33:03 |
| pytrec-eval-terrier | 0.5.6 | Provides Python bindings for popular Information Retrieval measures implemented within trec_eval. | 2023-10-10 14:49:24 |
| jury | 2.3 | Evaluation toolkit for neural language generation. | 2023-10-08 19:02:42 |
| pixanalyzer | 0.2.0 | Analyzing pixel changes in movies to evaluate deformation of objects | 2023-10-06 10:02:18 |
| SpiralEval | 0.1.2 | Evaluation for characteristics | 2023-10-05 03:20:45 |
| ml3m | 0.0.20 | Evaluating your LLM performance | 2023-09-19 08:13:55 |
| metaquantus | 0.0.5 | MetaQuantus is an XAI performance tool for identifying reliable metrics. | 2023-09-13 09:32:57 |
| xturing | 0.1.8 | Fine-tuning, evaluation and data generation for LLMs | 2023-09-06 18:26:17 |