| Name | Version | Summary | Date |
| --- | --- | --- | --- |
| gembench | 1.0.7 | First comprehensive benchmark for Generative Engine Marketing (GEM), an emerging field that focuses on monetizing generative AI by seamlessly integrating advertisements into Large Language Model (LLM) responses. Our work addresses the core problem of ad-injected response (AIR) generation and provides a framework for its evaluation. | 2025-10-13 06:46:21 |
| glue3d | 0.2.5 | Official project for GLUE3D. | 2025-10-12 21:04:49 |
| pybenchx | 1.2.0 | A tiny, precise microbenchmarking framework for Python | 2025-10-07 20:54:46 |
| polybench | 0.3.4 | Multivariate polynomial arithmetic benchmark tests. | 2025-10-07 10:24:22 |
| tacho | 0.8.6 | CLI tool for measuring and comparing LLM inference speeds | 2025-10-07 08:28:58 |
| zpp-beta | 0.1.4 | ZPP: Zero-hassle C++ build/run, metrics, and optimization hints with a real-time terminal UI | 2025-09-14 15:41:51 |
| nsfr750-pybench | 1.3.0 | A comprehensive benchmarking tool with a modern GUI | 2025-09-03 11:54:43 |
| BenchMatcha | 0.0.1 | Google Benchmark Suite Runner and Regression Analyzer. | 2025-08-31 17:55:50 |
| optimum-benchmark | 0.6.0 | Optimum-Benchmark is a unified multi-backend utility for benchmarking Transformers, Timm, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes. | 2025-08-19 22:40:14 |
| http-benchmarker | 0.1.2 | Advanced HTTP load testing tool with real-time progress monitoring and detailed performance reports. | 2025-08-18 14:18:29 |
| srb | 0.0.5 | Space Robotics Bench | 2025-08-17 15:32:45 |
| looptick | 0.1.1 | A simple loop execution time measurement tool. | 2025-08-12 03:58:00 |
| spectrumlab | 0.1.2 | A pioneering unified platform designed to systematize and accelerate deep learning research in spectroscopy. | 2025-08-07 12:50:50 |
| benchmake | 1.1.2 | A lightweight benchmarking toolkit for Python, helping you measure and compare code performance with ease. | 2025-07-23 05:23:23 |
| causalbench-asu | 0.1.7 | Spatio Temporal Causal Benchmarking Platform | 2025-07-22 01:25:53 |
| agentdojo | 0.1.26 | A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents | 2025-02-12 08:29:46 |
| robobench | 0.0.2 | A benchmarking tool for AI models and Hardware. | 2025-02-09 07:59:11 |
| localbench | 0.0.2 | A benchmarking tool for Local LLMs. | 2025-02-09 01:39:35 |
| rl4co | 0.5.2 | RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark | 2025-01-26 07:48:28 |
| gtrbench | 0.0.1 | A benchmark to evaluate implicit reasoning in LLMs using guess-the-rule games | 2025-01-19 01:58:11 |