Google PageRank for AI agents. 25,000+ tools indexed.

Top 10 MCP Servers for Data Science and Analytics in 2026

Data science workflows are a natural fit for MCP. An agent that can run a Jupyter cell, query DuckDB, trigger a dbt model, stream from ClickHouse, fire an Airflow DAG, and read Spark job history can replace hours of manual pipeline inspection. We scored 368 data science tools in the AgentRank index. These are the ten worth depending on.

Top 10 data science MCP servers

Ranked by the composite AgentRank score — a weighted blend of stars (15%), freshness (25%), issue health (25%), contributors (10%), and inbound dependents (25%). Average score across all 368 data science tools is 32.5. Only 2.3% of the 25,000+ tool index scores above 60 — every tool in this list clears that bar.

# Repository Description Score Stars Use Case Lang
1 motherduckdb/mcp-server-motherduck Local MCP server for DuckDB and MotherDuck 88.31 439 Analytics / OLAP Python
2 datalayer/jupyter-mcp-server MCP server for Jupyter — run cells, read outputs, manage kernels 83.45 940 Jupyter Notebooks Python
3 kubeflow/mcp-apache-spark-history-server MCP server for Apache Spark History Server — bridge between agentic AI and Spark 78.08 136 Spark / Big Data Python
4 ClickHouse/mcp-clickhouse Official ClickHouse MCP server — connect AI assistants to ClickHouse databases 74.52 710 Real-time Analytics Python
5 dbt-labs/dbt-mcp Official dbt MCP server for running models, tests, and queries 72.46 507 Data Transformation Python
6 surendranb/google-analytics-mcp Google Analytics 4 MCP server for Claude, Cursor, Windsurf and more 68.14 187 Web Analytics Python
7 astronomer/astro-airflow-mcp MCP server for Apache Airflow — runs standalone or as an Airflow plugin 64.53 312 Pipeline Orchestration Python
8 Snowflake-Labs/mcp MCP server for Snowflake including Cortex AI, object management, SQL orchestration 61.24 255 Data Warehouse Python
9 e2b-dev/mcp-server MCP server for E2B — run code in secure cloud sandboxes from any AI agent 59.18 398 Code Execution TypeScript
10 LucasHild/mcp-server-bigquery MCP server for Google BigQuery — schema inspection and query execution 53.74 145 Cloud Data Warehouse Python

Choosing by workflow

The ten tools cover distinct workflow stages. Pick based on where you spend time, not by score alone.

Workflow need Best pick Score Why
Local analytical SQL motherduckdb/mcp-server-motherduck 88.31 Zero round-trips, in-process Parquet/CSV queries
Notebook-first analysis datalayer/jupyter-mcp-server 83.45 Run cells, inspect outputs, manage kernels
Big data / Spark jobs kubeflow/mcp-apache-spark-history-server 78.08 Query Spark job history and application logs
Real-time event analytics ClickHouse/mcp-clickhouse 74.52 Official server, 710 stars, sub-second OLAP queries
Data transformation dbt-labs/dbt-mcp 72.46 Run dbt models, tests, and lineage inspection
Web / product analytics surendranb/google-analytics-mcp 68.14 GA4 data through natural language queries
Pipeline orchestration astronomer/astro-airflow-mcp 64.53 Inspect DAGs, trigger runs, query task logs
Cloud data warehouse (Snowflake) Snowflake-Labs/mcp 61.24 Official Snowflake server with Cortex AI support
Sandboxed code execution e2b-dev/mcp-server 59.18 Run Python data analysis code safely in E2B sandboxes
Google BigQuery LucasHild/mcp-server-bigquery 53.74 Best-maintained BigQuery MCP — schema inspection and SQL

Detailed breakdown

1. Analytical queries and local data (DuckDB)

motherduckdb/mcp-server-motherduck is the top-scoring data science tool in the index at 88.31. DuckDB runs in-process with zero round-trips for local data — fast analytical SQL over Parquet files, CSVs, and in-memory DataFrames. MotherDuck support extends this to cloud scale. If your agent needs to query a dataset without spinning up an external database, this is the right tool. The 16-contributor base and active commit history (last commit March 4) make this one of the most reliable analytics MCPs in the ecosystem.

Typical workflow: point the agent at a directory of Parquet files, ask it to profile the data, compute summary statistics, and identify outliers — all without leaving the conversation. DuckDB's columnar execution engine makes this genuinely fast for datasets in the hundreds of millions of rows.

2. Jupyter notebooks

datalayer/jupyter-mcp-server scores 83.45 with 940 stars and 18 contributors. It's the canonical MCP integration for Jupyter — lets agents execute notebook cells, read cell outputs, manage kernels, and inspect variable state. If your data science workflow is notebook-first, this is the integration that closes the loop between an AI agent and interactive computation.

The key capability is bidirectional: the agent can write a cell, execute it, read the output, then write the next cell based on what it saw. This enables multi-step exploratory analysis where the agent can adapt its approach based on intermediate results — not just generate a notebook template and hand it back to you.
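That loop is easier to see in code. A schematic sketch of the execute-read-adapt cycle — `run_cell` below is a hypothetical stand-in for the server's cell-execution tool (it just evaluates an expression locally so the loop is runnable), not the server's actual API:

```python
def run_cell(source: str) -> str:
    """Hypothetical stub for the MCP tool that executes a notebook cell
    and returns its text output. A real kernel would hold state between
    calls; here we simply eval one expression."""
    return str(eval(source))

# Step 1: the agent probes the data before committing to an approach.
rows = int(run_cell("len([1, 2, 3, 4, 5])"))

# Step 2: it writes the next cell based on what it just read back.
if rows < 1000:
    summary = run_cell("sum([1, 2, 3, 4, 5]) / 5")  # small data: exact stats
else:
    summary = run_cell("'sample first'")             # large data: sample first

print(rows, summary)  # output the agent reads to plan the next cell
```

The point is the branch: the second cell the agent writes depends on the first cell's output, which a generate-a-whole-notebook-upfront approach cannot do.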

3. Big data and Spark pipelines

kubeflow/mcp-apache-spark-history-server is an official Kubeflow project, scoring 78.08. It connects agents directly to the Spark History Server — query job history, application logs, stage metrics, and executor details through natural language. If your team runs Spark jobs on Kubernetes or any cluster, this is how you give agents visibility into pipeline execution without custom tooling.

Use case: an agent monitoring a nightly batch job can query the history server for job duration, failed stages, and executor memory usage — then surface anomalies in a summary report, or trigger follow-up queries to diagnose root causes. The Kubeflow backing means this server will track Spark version changes and History Server API updates reliably.
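The History Server's monitoring API returns application attempts as JSON. A sketch of the anomaly check described above, run against a sample shaped like the `/api/v1/applications` response (fields abbreviated; the threshold and app names are made up for illustration):

```python
import json

# Sample shaped like Spark History Server's /api/v1/applications output.
sample = json.loads("""[
  {"id": "app-001", "name": "nightly_batch",
   "attempts": [{"completed": true,  "duration": 5400000}]},
  {"id": "app-002", "name": "nightly_batch",
   "attempts": [{"completed": false, "duration": 1200000}]}
]""")

def flag_anomalies(apps, max_duration_ms=3_600_000):
    """Return (app id, reason) pairs for runs that failed or ran long."""
    flags = []
    for app in apps:
        last = app["attempts"][-1]           # most recent attempt
        if not last["completed"]:
            flags.append((app["id"], "incomplete"))
        elif last["duration"] > max_duration_ms:
            flags.append((app["id"], "slow"))
    return flags

print(flag_anomalies(sample))
```

An agent talking to the MCP server does this reasoning in natural language, but the underlying data it sees is exactly this shape.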

4. Real-time analytics (ClickHouse)

ClickHouse/mcp-clickhouse is the official MCP server from the ClickHouse team, scoring 74.52 with 710 stars — the second most-starred server in this list, behind only the Jupyter integration. ClickHouse is purpose-built for real-time analytical queries at scale: event tracking, user behavior analytics, time-series data, and log analysis. The MCP server exposes SQL query execution and schema inspection across all databases and tables.

ClickHouse's sub-second query performance on billion-row datasets makes it the right database for agents that need to analyze production-scale event data. If your data team uses ClickHouse for product analytics, observability data, or financial time-series, this is the direct MCP integration. The official vendor backing means this server will track ClickHouse SQL dialect changes and authentication updates without community lag.

5. Data transformation (dbt)

dbt-labs/dbt-mcp is the official dbt MCP server from dbt Labs, scoring 72.46 with 35 contributors — the highest contributor count in this list. Agents can run dbt models, execute tests, compile SQL, and inspect lineage directly. If your data team runs dbt, this server closes the gap between the transformation layer and any LLM-driven workflow.

Practical applications: ask an agent to run a specific dbt model and report test failures, inspect the compiled SQL for a model you're debugging, or trace the lineage of a broken dashboard metric back to the source table. The 35-contributor base is the strongest signal of community health in this category — dbt-mcp won't go stale.

6. Web analytics (Google Analytics 4)

surendranb/google-analytics-mcp gives agents natural language access to GA4 data, scoring 68.14, with active commits through March 2026 and support for Claude, Cursor, and Windsurf. One caveat: it's a solo-contributor project, which means bus factor risk. Verify the license and consider forking for production use. If you need GA4 query access in an agent workflow, this is the best-maintained option in the index.

Typical workflow: an agent querying GA4 for session metrics, conversion rates, and traffic source breakdowns can surface weekly performance summaries, flag anomalies, and correlate spikes with specific events or campaigns — without requiring a data analyst to pull reports manually.

7. Data pipeline orchestration (Apache Airflow)

astronomer/astro-airflow-mcp is the official MCP server from Astronomer — the company that runs the managed Airflow cloud and contributes heavily to the open-source project. Score of 64.53 with 312 stars and 10 contributors. Agents can inspect DAG structures, list available pipelines, trigger DAG runs, monitor task state, and query execution logs through natural language.

Astronomer's official backing is the key signal here. Airflow's REST API changes with each major version — a community-maintained server would lag; the Astronomer server tracks these changes directly. Use case: an agent that monitors pipeline health and surfaces failed tasks, slow runs, and SLA misses in a daily briefing, with the ability to re-trigger failed tasks directly.
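Triggering a run goes through Airflow's stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). A sketch of the request the MCP server would issue on the agent's behalf — the base URL, token, and DAG name are placeholders, and the auth scheme varies by deployment:

```python
import json
from urllib import request

def trigger_dag_run(base_url: str, dag_id: str, token: str, conf=None):
    """Build (but do not send) a dagRuns request against Airflow's
    stable REST API. Caller runs request.urlopen(req) against a live
    instance; bearer-token auth is one of several possible schemes."""
    url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
    body = json.dumps({"conf": conf or {}}).encode()
    return request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
    )

req = trigger_dag_run("http://localhost:8080", "nightly_batch", "TOKEN")
print(req.full_url, req.get_method())
```

The MCP layer's job is precisely to hide this plumbing: the agent says "re-run the nightly batch" and the server assembles the call.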

8. Cloud data warehouse (Snowflake)

Snowflake-Labs/mcp is the official Snowflake server, covering Cortex AI, object management, and SQL orchestration. It scores 61.24 — lower than expected for an official server because the last commit was December 19, 2025, triggering a freshness penalty in the score. It's still the right pick if your stack runs on Snowflake, but watch the repo for resumed activity. The Cortex AI integration is unique — it lets agents call Snowflake's built-in ML functions directly through the same MCP interface.

9. Sandboxed code execution (E2B)

e2b-dev/mcp-server gives agents the ability to run Python and JavaScript code in secure E2B cloud sandboxes, scoring 59.18 with 398 stars and 11 contributors. E2B (e2b.dev) is purpose-built for AI agent code execution — sandboxed environments with full Python data science libraries (pandas, NumPy, scikit-learn, matplotlib, Polars) available out of the box.

The sandbox model solves a real problem: agents that generate and execute data analysis code need a safe execution environment that won't touch your production systems or local filesystem. E2B's remote sandboxes handle this cleanly. The agent iterates on analysis code, inspects outputs, and refines the approach — without any risk to your local environment. For ad-hoc data analysis, this is a strong complement to the notebook-oriented Jupyter integration.
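The iterate-inspect-refine cycle looks like this in miniature. `run_in_sandbox` below is a hypothetical local stand-in (a real E2B sandbox executes remotely and in isolation); the two code drafts are invented to show the agent recovering from its own error:

```python
def run_in_sandbox(code: str) -> tuple[bool, str]:
    """Hypothetical stub for remote sandbox execution: returns
    (ok, output). The real value of a sandbox is that this exec
    happens in an isolated environment, not your machine."""
    try:
        scope: dict = {}
        exec(code, scope)
        return True, str(scope.get("result"))
    except Exception as exc:
        return False, repr(exc)

# The agent's retry loop: run a draft, read the error, patch, rerun.
attempts = [
    "result = sum([1, 2, 'x'])",  # first draft: type bug
    "result = sum([1, 2, 3])",    # revised after reading the TypeError
]
for code in attempts:
    ok, out = run_in_sandbox(code)
    if ok:
        break
print(ok, out)
```

Because every attempt runs inside the sandbox, the buggy first draft costs nothing: no production table touched, no local file clobbered.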

10. Google BigQuery

LucasHild/mcp-server-bigquery provides BigQuery access through MCP — dataset listing, schema inspection, and SQL query execution. Score of 53.74 with 145 stars and 6 contributors. This is a community-maintained server rather than an official Google product, which explains the lower score. Its last commit, in January 2026, introduces a freshness penalty.

BigQuery is the default cloud data warehouse for many data science teams, and this server fills a real gap — there is no official Google BigQuery MCP server yet. The ergut/mcp-bigquery-server (a similar community alternative with 130 stars) offers comparable functionality with a slightly different architecture. For production use, evaluate both and choose based on which better fits your auth setup and BigQuery project structure.

Browse all data science tools: Full AgentRank index — 368 data science tools indexed, updated daily.

Browse the full category: All data processing MCP servers ranked — sorted by AgentRank score.

Built a data science MCP server? Submit it to get indexed and scored.

What to watch for

Official vs community servers

Six of the ten servers are official: motherduckdb, Kubeflow, ClickHouse, dbt Labs, Astronomer, and Snowflake Labs. Official servers track breaking API changes and protocol upgrades — they're safer long-term bets. The community servers (Jupyter, GA4, E2B, BigQuery) are excellent but require more due diligence. Check the last commit date and open-issue backlog before depending on any community server in production.

Freshness matters more in data tooling

Data tooling APIs change fast — new Spark versions, GA4 property changes, Snowflake protocol updates, Airflow REST API revisions. A server that hasn't been committed to in 90+ days may be broken against the current API version. That's why freshness is weighted at 25% in the AgentRank score. The Snowflake Labs server and LucasHild BigQuery server are the clearest examples: legitimate quality tools with freshness penalties dragging the score. Watch those repos for resumed commits before putting them on the critical path.

Bus factor risk in solo projects

surendranb/google-analytics-mcp is maintained by a single contributor. For anything production-critical, evaluate whether you could maintain a fork if the author goes inactive. The contributor signal in the AgentRank score (10% weight) specifically flags this risk class. For GA4 specifically, this is the only well-scored option in the index — a gap the ecosystem hasn't filled with an official server.

Score vs stars divergence

ClickHouse/mcp-clickhouse has 710 stars — the most in this list — but scores 74.52, not #1. dbt-labs/dbt-mcp has only 507 stars but has 35 contributors, pushing its health signals up. motherduckdb/mcp-server-motherduck leads at 88.31 despite 439 stars because its freshness, issue health, and dependent signals are all strong. Stars measure historical attention; the AgentRank score measures current maintenance quality. Use the score to decide what to depend on in production.

The missing pieces: pandas and real-time ML serving

There is no well-maintained, high-scoring MCP server for direct pandas or NumPy interaction in the index as of March 2026. The closest substitutes are motherduckdb/mcp-server-motherduck for in-memory analytical queries and e2b-dev/mcp-server for running arbitrary Python. Similarly, there is no MCP server for ML model serving (e.g., calling a production scikit-learn or PyTorch model through MCP). Both are genuine ecosystem gaps — well-maintained servers in these categories would immediately rank in the top tier.

Pipeline coverage is now complete

With Airflow covering orchestration, dbt covering transformation, and DuckDB/ClickHouse/BigQuery covering query, a full modern data stack is now reachable through MCP. A single agent can inspect a pipeline failure in Airflow, trace it to a broken dbt model, query the source table in BigQuery for anomalies, and summarize the root cause — without a human switching between four tools. That's the real value unlock.

Methodology

Tools classified as "data science" via keyword matching on description, topics, and repo name — covering Jupyter, DuckDB, ClickHouse, Spark, dbt, Snowflake, Airflow, BigQuery, analytics, dataframes, notebooks, machine learning pipelines, and related terms. Only non-archived repositories with a computed score are included. Data from the AgentRank index crawled March 2026.

Score weights: stars 15%, freshness (days since last commit) 25%, issue health (closed/total ratio) 25%, contributors 10%, inbound dependents 25%. Only 589 tools in the 25,000+ index score above 60. Every tool in this list clears that threshold, placing them in the top 2.3% of the full index.
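The blend itself is a plain weighted sum. A minimal sketch with the published weights; how each raw signal is normalized into [0, 1] is not published, so the inputs here are hypothetical pre-normalized values:

```python
def agentrank_score(stars, freshness, issue_health, contributors, dependents):
    """Weighted blend of normalized signals (each assumed in [0, 1]),
    scaled to 0-100. Weights match the published blend; the
    normalization of raw GitHub metrics is not published."""
    weights = {
        "stars": 0.15,
        "freshness": 0.25,
        "issue_health": 0.25,
        "contributors": 0.10,
        "dependents": 0.25,
    }
    signals = {
        "stars": stars,
        "freshness": freshness,
        "issue_health": issue_health,
        "contributors": contributors,
        "dependents": dependents,
    }
    return 100 * sum(weights[k] * signals[k] for k in weights)
```

This structure explains the star/score divergence noted earlier: a repo with perfect stars but zero freshness caps out at 15, while a modest-star repo with strong freshness, issue health, and dependents can clear 60.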

FAQ

What are the best MCP servers for data science workflows?

The top-scoring data science MCP servers in the AgentRank index are: motherduckdb/mcp-server-motherduck (88.31) for local SQL analytics, datalayer/jupyter-mcp-server (83.45) for notebook execution, kubeflow/mcp-apache-spark-history-server (78.08) for Spark, and ClickHouse/mcp-clickhouse (74.52) for real-time analytics.

Can Claude run Python data analysis code?

Yes — two options. e2b-dev/mcp-server runs arbitrary Python in a secure cloud sandbox. datalayer/jupyter-mcp-server executes code in an existing Jupyter kernel and reads back cell outputs. E2B is better for disposable analysis; Jupyter is better if you're working in an existing notebook and want stateful variable persistence between executions.

Can I give Claude access to my data warehouse?

Yes. MCP servers exist for Snowflake (official), BigQuery (community), ClickHouse (official), and DuckDB (official). Each server handles query routing and result serialization; Claude handles the SQL reasoning and analysis. Read-only mode is the safe default — most of these servers restrict to read access by design.

Is there an MCP server for Apache Airflow?

Yes — astronomer/astro-airflow-mcp from the Astronomer team. It supports both standalone deployment and running as an Airflow plugin. Agents can inspect DAGs, trigger runs, and query execution logs.

What is the best MCP server for real-time event analytics?

ClickHouse/mcp-clickhouse (score 74.52, 710 stars). It is the official server from the ClickHouse team and the second most-starred MCP in this list, behind only the Jupyter integration.
