Name | Version | Summary | Date |
mteb | 1.38.54 | Massive Text Embedding Benchmark | 2025-09-08 07:17:34 |
ssrjson-benchmark | 0.0.3 | benchmark of ssrJSON | 2025-09-07 06:44:44 |
filehasher | 1.0.2 | Modern file hashing utility with parallel processing, multiple algorithms, and benchmarking | 2025-09-07 06:13:48 |
ml-agents-reasoning | 0.0.19a0 | ML Agents Reasoning Research Platform | 2025-09-05 18:35:49 |
fusion-bench | 0.2.24 | A Comprehensive Benchmark of Deep Model Fusion | 2025-09-05 14:38:17 |
twevals | 0.0.0.dev20250904233630 | A lightweight, code-first evaluation framework for testing AI agents and LLM applications | 2025-09-04 23:36:47 |
gpu-benchmark-tool | 0.5.5 | Multi-vendor GPU health monitoring supporting old GPUs for e-waste reduction | 2025-09-04 01:07:41 |
swebench | 4.0.5 | The official SWE-bench package - a benchmark for evaluating LMs on software engineering | 2025-09-03 01:10:34 |
opencompass | 0.5.0 | A comprehensive toolkit for large model evaluation | 2025-09-01 06:21:25 |
haerae-evaluation-toolkit | 0.1.0 | A comprehensive, standardized validation toolkit for Korean Large Language Models (LLMs). | 2025-08-31 12:00:25 |
praisonaibench | 0.0.3 | Simple LLM Benchmarking Tool using PraisonAI Agents | 2025-08-30 07:28:30 |
pypply | 1.0.0 | Makes it easier to use the Upply API | 2025-08-27 10:30:04 |
HASARD | 0.2.0 | Egocentric 3D Safe Reinforcement Learning Benchmark | 2025-08-26 21:39:27 |
knows | 2.0.2 | Powerful and user-friendly property graph benchmark that creates graphs with specified node and edge numbers, supporting multiple output formats and visualization | 2025-08-21 19:15:38 |
cli-arena | 1.1.6 | The definitive AI coding agent evaluation platform | 2025-08-20 15:31:01 |
gpu-benchmark-linux | 0.4.4 | GPU benchmarking tool - a comprehensive toolkit for evaluating NVIDIA GPU performance | 2025-08-19 09:55:09 |
bocode | 0.1.6.dev0 | BOCoDe is a Python library that contains optimization benchmark problems. | 2025-08-17 17:47:16 |
gramps-bench | 1.0.1 | Performance benchmarking tools for Gramps genealogy software | 2025-08-14 12:44:13 |
dlcomm | 0.3.4 | Distributed GPU Communication Benchmarking Framework for Deep Learning | 2025-08-13 17:28:24 |
pydftracer | 1.0.14 | I/O profiler for deep learning Python apps, specifically for dlio_benchmark. | 2025-08-08 21:58:43 |