The reputation layer for AI skills, tools & agents

modelscope/MCPBench

Score: 22.4 · Rank #10062

The evaluation benchmark on MCP servers

Overview

modelscope/MCPBench is a Python MCP server licensed under Apache-2.0, described as "The evaluation benchmark on MCP servers." Topics: benchmark, database, mcp, mcp-server, websearch.

Ranked #10062 out of 25632 indexed tools.

Ecosystem

Python · Apache-2.0
benchmark, database, mcp, mcp-server, websearch

Signal Breakdown

Stars: 241
Freshness: 6mo ago
Issue Health: 14%
Contributors: 3
Dependents: 0
Forks: 15
Description: Brief
License: Apache-2.0

How to Improve

Description (low impact)

Expand your description to 150+ characters for better discoverability

Freshness (high impact)

Last commit was 194 days ago — a recent commit would boost your freshness score

Issue Health (high impact)

You have 6 open vs 1 closed issues — triaging stale issues improves health
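The reported 14% is consistent with a simple resolved-issue ratio: 1 closed out of 7 total issues ≈ 14%. A minimal sketch of that assumed formula (AgentRank's actual scoring is not published here, so treat this as an illustration only):

```python
def issue_health(open_issues: int, closed_issues: int) -> float:
    """Fraction of issues resolved.

    Assumed formula for illustration; not AgentRank's documented metric.
    """
    total = open_issues + closed_issues
    if total == 0:
        return 1.0  # no issues filed: nothing unresolved, treat as healthy
    return closed_issues / total

# MCPBench's current counts: 6 open, 1 closed
print(round(issue_health(6, 1) * 100))  # -> 14
```

Under this reading, closing even a couple of stale issues moves the score quickly: closing 3 of the 6 open issues would lift the ratio to 4/7 ≈ 57%.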

Badge

Embed the AgentRank score badge for modelscope/MCPBench (Markdown and HTML variants):
[![AgentRank](https://agentrank-ai.com/api/badge/tool/modelscope--MCPBench)](https://agentrank-ai.com/tool/modelscope--MCPBench)
<a href="https://agentrank-ai.com/tool/modelscope--MCPBench"><img src="https://agentrank-ai.com/api/badge/tool/modelscope--MCPBench" alt="AgentRank"></a>

Matched Queries

"mcp server""mcp-server"

From the README

<h1 align="center">
	🦊 MCPBench: A Benchmark for Evaluating MCP Servers
</h1>

<div align="center">

[![Documentation][docs-image]][docs-url]
[![Package License][package-license-image]][package-license-url]

</div>

<div align="center">
<h4 align="center">

[中文](https://github.com/modelscope/MCPBench/blob/main/README_zh.md) |
[English](https://github.com/modelscope/MCPBench/blob/main/README.md)

</h4>
</div>

MCPBench is an evaluation framework for MCP Servers. It supports the evaluation of three types of servers: Web Search, Database Query and GAIA, and is compatible with both local and remote MCP Servers. The framework primarily evaluates different MCP Servers (such as Brave Search, DuckDuckGo, etc.) in terms of task completion accuracy, latency, and token consumption under the same LLM and Agent configurations. Here is the [evaluation report](https://arxiv.org/abs/2504.11094).
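The three axes MCPBench reports (task accuracy, latency, token consumption) can be illustrated with a generic harness sketch. This is not MCPBench's actual API: `call_server`, the task format, and the token counts are hypothetical placeholders standing in for a real MCP server call under a fixed LLM/Agent configuration.

```python
import time
from dataclasses import dataclass


@dataclass
class Result:
    correct: bool
    latency_s: float
    tokens: int


def evaluate(call_server, tasks):
    """Run each task through a server call and aggregate the three metrics.

    `call_server(question)` is a hypothetical stand-in returning
    (answer, tokens_used); a real harness would wire this to a local
    or remote MCP server.
    """
    results = []
    for task in tasks:
        start = time.perf_counter()
        answer, tokens = call_server(task["question"])
        elapsed = time.perf_counter() - start
        results.append(Result(answer == task["expected"], elapsed, tokens))
    n = len(results)
    return {
        "accuracy": sum(r.correct for r in results) / n,
        "avg_latency_s": sum(r.latency_s for r in results) / n,
        "avg_tokens": sum(r.tokens for r in results) / n,
    }


# Toy usage with a mock server: one right answer, one wrong
tasks = [
    {"question": "2+2", "expected": "4"},
    {"question": "capital of France", "expected": "Paris"},
]
mock = lambda q: ("4" if q == "2+2" else "Lyon", 120)
metrics = evaluate(mock, tasks)
print(metrics["accuracy"])  # -> 0.5 with this mock
```

Holding the LLM and Agent fixed, as MCPBench does, means differences in these aggregates can be attributed to the MCP server under test rather than the model.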

> The implementation refers to [LangProBe: a Language Programs Benchmark](https://arxiv.org/abs/2502.2031
Read full README on GitHub →