PyDigger - unearthing stuff about Python


| Name | Version | Summary | Date |
|------|---------|---------|------|
| ragtime | 0.0.22 | Ragtime 🎹 is an LLMOps framework to automatically evaluate Retrieval Augmented Generation (RAG) systems and compare different RAGs / LLMs | 2024-03-28 14:37:20 |
| chainforge | 0.3.0.6 | A Visual Programming Environment for Prompt Engineering | 2024-03-22 01:34:15 |
| rag-eval | 0.1.3 | A RAG evaluation framework | 2024-03-19 17:16:27 |
| nutcracker-py | 0.0.1a37 | streamline LLM evaluation | 2024-03-17 15:42:49 |
| dyff-schema | 0.2.2 | Data models for the Dyff AI auditing platform. | 2024-03-14 05:03:22 |
| synthesized-datasets | 1.7 | Publicly available datasets for benchmarking and evaluation. | 2024-03-13 14:54:22 |
| unbabel-comet | 2.2.2 | High-quality Machine Translation Evaluation | 2024-03-13 11:27:34 |
| opencompass | 0.2.3 | A comprehensive toolkit for large model evaluation | 2024-03-12 03:53:54 |
| evo | 1.26.2 | Python package for the evaluation of odometry and SLAM | 2024-03-08 10:30:29 |
| maihem | 1.4.0 | LLM evaluations and synthetic data generation with the MAIHEM models | 2024-03-07 01:34:31 |
| dyff-client | 0.2.0 | Python client for the Dyff AI auditing platform. | 2024-03-05 04:12:57 |
| easy-evaluator | 0.0.0 | A library for easy evaluation of language models | 2024-03-03 15:30:15 |
| reseval | 0.1.6 | Reproducible Subjective Evaluation | 2024-03-03 05:32:16 |
| coconut | 3.1.0 | Simple, elegant, Pythonic functional programming. | 2024-03-02 09:18:28 |
| codebleu | 0.6.0 | Unofficial CodeBLEU implementation that supports Linux, MacOS, and Windows, available on PyPI. | 2024-03-01 16:17:22 |
| lighteval | 0.2.0 | A lightweight and configurable evaluation package | 2024-03-01 11:09:27 |
| tno.sdg.tabular.eval.utility-metrics | 0.3.0 | Utility metrics for tabular data | 2024-02-28 13:23:02 |
| ntqr | 0.2 | Tools for the logic of evaluation using unlabeled data | 2024-02-28 12:11:10 |
| llama-index-packs-rag-evaluator | 0.1.3 | llama-index packs rag_evaluator integration | 2024-02-22 01:29:47 |
| phasellm | 0.0.21 | Wrappers for common large language models (LLMs) with support for evaluation. | 2024-02-20 23:31:31 |
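
Each row's Name, Version, Summary, and Date correspond to metadata that PyPI exposes through its public JSON API (`https://pypi.org/pypi/<name>/json`). As a minimal sketch, assuming the third-party `requests` library is installed, the same fields can be fetched for a single package like this:

```python
import requests

def package_metadata(name: str) -> dict:
    """Fetch name, version, summary, and upload date for a package
    from PyPI's public JSON API (https://pypi.org/pypi/<name>/json)."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    resp.raise_for_status()
    data = resp.json()
    info = data["info"]
    # Upload time of the first distribution file of the latest release,
    # if any files were uploaded for it.
    uploaded = data["urls"][0]["upload_time"] if data["urls"] else None
    return {
        "name": info["name"],
        "version": info["version"],
        "summary": info["summary"],
        "date": uploaded,
    }

if __name__ == "__main__":
    print(package_metadata("lighteval"))
```

PyDigger aggregates this kind of metadata across PyPI; the sketch above only covers a single package lookup.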