modelscope/MCPBench
An evaluation benchmark for MCP servers
Overview
modelscope/MCPBench is a Python MCP server licensed under Apache-2.0: an evaluation benchmark for MCP servers. Topics: benchmark, database, mcp, mcp-server, websearch.
Ranked #10062 out of 25632 indexed tools.
Ecosystem
Python · Apache-2.0
Tags: benchmark, database, mcp, mcp-server, websearch
Signal Breakdown
Stars: 241
Freshness: 6 mo ago
Issue Health: 14%
Contributors: 3
Dependents: 0
Forks: 15
Description: Brief
License: Apache-2.0
How to Improve
Description: low impact
Freshness: high impact
Issue Health: high impact
From the README
<h1 align="center"> 🦊 MCPBench: A Benchmark for Evaluating MCP Servers </h1>

<div align="center">

[![Documentation][docs-image]][docs-url]
[![Package License][package-license-image]][package-license-url]

</div>

<div align="center">
<h4 align="center">

[中文](https://github.com/modelscope/MCPBench/blob/main/README_zh.md) | [English](https://github.com/modelscope/MCPBench/blob/main/README.md)

</h4>
</div>

MCPBench is an evaluation framework for MCP Servers. It supports the evaluation of three types of servers: Web Search, Database Query, and GAIA, and is compatible with both local and remote MCP Servers. The framework primarily evaluates different MCP Servers (such as Brave Search, DuckDuckGo, etc.) in terms of task completion accuracy, latency, and token consumption under the same LLM and Agent configurations. Here is the [evaluation report](https://arxiv.org/abs/2504.11094).

> The implementation refers to [LangProBe: a Language Programs Benchmark](https://arxiv.org/abs/2502.2031)
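To make the three headline metrics concrete, here is a minimal sketch of aggregating per-task evaluation records into accuracy, mean latency, and mean token consumption. The record format and field names (`correct`, `latency_s`, `tokens`) are assumptions for illustration, not MCPBench's actual data model.

```python
from statistics import mean

# Hypothetical per-task records from evaluating one MCP server;
# the field names here are illustrative, not MCPBench's schema.
results = [
    {"correct": True,  "latency_s": 1.2, "tokens": 850},
    {"correct": False, "latency_s": 2.9, "tokens": 1400},
    {"correct": True,  "latency_s": 0.8, "tokens": 620},
]

def summarize(records):
    """Aggregate per-task records into the three headline metrics:
    task completion accuracy, average latency, average token usage."""
    return {
        "accuracy": sum(r["correct"] for r in records) / len(records),
        "avg_latency_s": mean(r["latency_s"] for r in records),
        "avg_tokens": mean(r["tokens"] for r in records),
    }

summary = summarize(results)
```

Running the same summary over records produced under an identical LLM and Agent configuration is what makes the comparison between servers (e.g. Brave Search vs. DuckDuckGo) fair.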