---
title: "Natural Language Data Analysis with ClickHouse and Claude MCP — Is the Era of Writing SQL Coming to an End?"
description: "Building an environment where Claude Desktop can operate ClickHouse through natural language using the ClickHouse MCP."
lang: "en"
canonical: "https://llm-lab.dev/en/posts/clickhouse-001-claude-mcp/"
source: "https://llm-lab.dev/en/posts/clickhouse-001-claude-mcp.md"
publishedAt: "2026-06-12"
updatedAt: "2026-06-12"
category: "Clickhouse"
tags:
  - "Clickhouse"
  - "MCP"
  - "Claude"
---

# Natural Language Data Analysis with ClickHouse and Claude MCP — Is the Era of Writing SQL Coming to an End?

## Introduction

For a long time, SQL has been the common language of data analysis.

Extracting the information you need from a database, aggregating it, and detecting anomalies. To do this, analysts had to understand the table structure and be able to write appropriate SQL.

On the other hand, the evolution of generative AI is beginning to change this assumption significantly.

"Aggregate yesterday's errors by service"
"Show me requests that spiked in the last hour"
"Investigate endpoints with higher-than-usual response times"

Mechanisms that generate appropriate SQL from natural language instructions like these, query the database, interpret the results, and return them are becoming realistic.

One of the technologies at the center of this is MCP (Model Context Protocol).

Using MCP, LLMs like Claude can behave not just as text generation tools, but as agents that operate external systems.

In this article, I will build an environment where Claude Desktop can operate ClickHouse through natural language using the ClickHouse MCP.

This is not just a setup guide. I will verify query generation accuracy from natural language, failure cases, and security concerns for production use using actual log data.

To state the conclusion upfront, this does not mean that "there is no longer any need to write SQL." However, the initial steps of data exploration and incident investigation could change dramatically.

---

# Why Connect AI to Your Database Now?

Traditionally, data analysis had a major bottleneck.

It was the translation work between "what you want to know" and the form of "SQL."

For example, suppose an operator wonders, "Are there any errors that have suddenly increased since yesterday?"

To translate this into SQL, the following tasks are required:

* Check which table the logs are stored in
* Identify the column that indicates errors
* Decide the aggregation unit
* Write GROUP BY and WHERE clauses
* Infer possible causes from the results

A skilled data engineer can do this in minutes.

But for product managers, SREs, customer support, and other roles that do not necessarily write SQL on a daily basis, this is a high barrier.

By inserting an LLM here, you can keep your intent in natural language and have the AI handle only the database operation translation.

What is important is that this is not a technology to "replace SQL."

Rather, it should be considered a technology that converts SQL, a powerful language, into an interface that more people can use.

---

# What Is MCP?

MCP (Model Context Protocol) is a standard protocol for LLMs to safely connect to external tools and data sources.

Until now, when having LLMs operate external systems, it was necessary to implement proprietary API integrations for each tool.

For example,

* Operating GitHub
* Searching databases
* Retrieving information from Slack
* Searching internal documents

Each required a dedicated implementation.

MCP provides this "common interface for AI to use tools."

In other words, Claude can understand what functions a tool has and autonomously invoke it as needed.

In the case of the ClickHouse MCP, Claude operates in the following flow:

```
User
  ↓
Natural language
  ↓
Claude Desktop
  ↓
ClickHouse MCP Server
  ↓
ClickHouse Database
  ↓
Result retrieval and interpretation
  ↓
Natural language response
```

What is important here is that Claude does more than just write SQL.

If necessary, it can check the table list, understand the schema, and build answers while making multiple queries.

This is a significant difference from traditional "natural language → SQL conversion tools."

---

# Trying It Out

The verification environment for this experiment is as follows.

| Item | Details |
| ---------- | ---------------- |
| LLM | Claude Desktop |
| MCP Server | ClickHouse MCP |
| Database | ClickHouse on local Docker |
| Data | Sample access logs |

The log data schema is as follows.

```sql
CREATE TABLE access_logs
(
    timestamp DateTime,
    service String,
    endpoint String,
    status_code UInt16,
    response_time_ms UInt32
)
ENGINE = MergeTree()
ORDER BY timestamp;
```

For this technical verification, I created an `access_logs` table on ClickHouse running in local Docker according to this schema and inserted 20 rows of sample access logs.

First, I verified that the initial data was inserted correctly.

```sql
SELECT count() FROM access_logs;
```

I confirmed that this result was `20`.

When connecting from Claude Desktop via MCP, I also confirmed that `access_logs` and `error_summary` existed in the `analytics` database.

![Checking the table list and schema in ClickHouse from Claude Desktop](/images/posts/clickhouse-001-claude-mcp/claude-table-list.webp)

Against this data, I first asked the following question.

---

**Question**

"Tell me which APIs had the most errors in the sample data. Also show the SQL you executed."

---

Claude internally performed the following process:

1. Retrieve the table list
2. Check the schema of access_logs
3. Infer that status_code can be used for error determination
4. Generate an aggregation SQL query
5. Summarize the results in natural language

The SQL initially generated was as follows.

```sql
SELECT
    endpoint,
    service,
    count() AS total_requests,
    countIf(status_code >= 400) AS error_count,
    round(countIf(status_code >= 400) / count() * 100, 2) AS error_rate_pct
FROM analytics.access_logs
GROUP BY endpoint, service
ORDER BY error_count DESC
```

The result showed that `/api/search` had the most errors with 3 out of 6 requests.

However, what is important here is that Claude interpreted "error" as `status_code >= 400`. Therefore, `/login` with 401 and `/docs/mcp` with 404 were also counted as errors.

![Asking which APIs had the most errors in natural language, with 4xx errors included in the aggregation](/images/posts/clickhouse-001-claude-mcp/claude-error-400.webp)

This is less of a mistake and more an example where the ambiguity of natural language directly appeared in the SQL condition. When investigating incidents and you want to see only server-caused errors, you need to explicitly say "5xx errors" rather than just "errors."

So, I changed the question next.

---

**Question**

"Limiting to 5xx errors, tell me which APIs had the most errors in the sample data. Also show the SQL you executed."

---

In this case, the generated SQL was closer to what I expected.

```sql
SELECT
    endpoint,
    service,
    count() AS total_requests,
    countIf(status_code >= 500) AS server_error_count,
    round(countIf(status_code >= 500) / count() * 100, 2) AS server_error_rate_pct
FROM analytics.access_logs
GROUP BY endpoint, service
ORDER BY server_error_count DESC, server_error_rate_pct DESC
```

The result was 3 server errors for `/api/search` and 1 for `/checkout`. This matched the expected values from the sample data.

![Asking with a focus on 5xx errors, showing the expected aggregation results](/images/posts/clickhouse-001-claude-mcp/claude-error-500.webp)

From this verification, I confirmed the following with Claude Desktop + ClickHouse MCP:

* It can read the table list and schema in ClickHouse
* It can generate ClickHouse SQL from natural language
* It can execute SQL and summarize results in natural language
* Ambiguous terms are complemented by Claude, so business definitions must be explicit

In other words, the experience of operating a database in natural language works. However, to obtain correct aggregation results, instead of writing SQL, you need to clearly communicate in natural language "what counts as an error," "what period to look at," and "what unit to aggregate by."

---

# But It Is Not a Silver Bullet

On the other hand, natural language data operations also have limitations.

For example,

"Tell me which APIs are responding slower than usual"

is an ambiguous question.

In this verification as well, when I simply asked "which APIs had the most errors," Claude treated both 4xx and 5xx as errors. On the other hand, when I explicitly stated the condition as "5xx errors," it could aggregate only server errors as expected.

"Than usual" could mean:

* Comparing with yesterday
* Comparing with the past 7-day average
* Comparing with the same day of the week and time slot

The SQL changes depending on this.

Even context that humans implicitly understand can have multiple possible interpretations for AI.

This problem cannot be completely solved by AI performance improvements alone.

What matters is how much analysis purpose and business definitions are passed to the AI as data models and metadata.

---

# What to Consider for Production Use

When connecting a production database to AI via MCP, the most important consideration is permission design.

AI does not have malicious intent.

However, within the scope of given permissions, it operates according to the user's intent.

Therefore, data governance design such as the following is necessary:

* Use a read-only user
* Limit accessible tables
* Separate data containing personal information
* Audit query execution logs

You should think of it not as "granting AI database access," but as "adding a new user called AI."

---

# Conclusion

With MCP, LLMs have evolved from mere chat tools to interfaces that operate databases.

Especially when combined with a fast analytical database like ClickHouse, it has the potential to dramatically change the experience of exploratory analysis such as log analysis, incident investigation, and KPI verification.

On the other hand, SQL itself is not becoming unnecessary.

Complex analysis logic, ambiguous business definitions, and permission design remain areas that humans should design.

In future data analysis, rather than distinguishing between "people who can write SQL" and "people who cannot," the value of "people who can provide appropriate context to AI and guide analysis" may increase.

As it becomes possible to operate databases in natural language, now it is the human side that is tested on the ability to accurately verbalize "what you want to ask."
