How I Built a Self-Hosted ClickHouse Cluster from Scratch

A practical guide to deploying a production-ready ClickHouse cluster using Terraform and Ansible on Hetzner. Covers cluster architecture, sharding vs replication, Keeper quorum, and why I skipped Kubernetes.

Seerat Awan
January 10, 2026
7 min read

If you've made it this far, you're probably following my journey building Userloom.io. Things are going well so far. In the last post, I explained why ClickHouse was the obvious choice for analytics storage. Now it's time to actually deploy the thing.

I decided to go self-hosted for two reasons:

  1. Control. I want full control over cluster topology, configuration tuning, and scaling decisions.
  2. Learning. I want to understand how to architect and deploy a distributed database cluster myself. No black boxes.

So the goal for the week was clear: build a production-ready ClickHouse cluster with complete operational control.

Then the R&D rabbit hole began.

Why I Skipped ClickHouse Cloud

Let me get this out of the way first. ClickHouse Cloud is a great product. But I ruled it out for this project.

As covered above, I want full control over topology, tuning, and scaling, and I don't want to wait behind support tickets to change settings or debug issues.

Bare metal (or VMs) it is.

Why Terraform + Ansible Over Kubernetes

Here's where I'll be completely honest: I don't have deep experience with clusters or advanced DevOps. So I turned to everyone's best friends, Claude and ChatGPT, to help me think through options.

My first instinct was to learn Kubernetes. I have plenty of experience with Docker and containers, so K8s felt like the natural next step. But after some research, I realized it wasn't the right choice for this stage.

Info

At small scale (5 nodes), Kubernetes adds significant overhead without proportional benefits. The 2-4GB RAM per node for K8s components alone would eat into resources better spent on ClickHouse itself.

Here's the comparison that convinced me:

Factor | Kubernetes | Terraform + Ansible
Operational complexity | High (need to manage K8s itself) | Low (direct server management)
Resource overhead | 2-4 GB RAM per node for K8s components | Near zero
ClickHouse I/O performance | Tricky with PVCs | Native disk access, optimal
Debugging | More layers to troubleshoot | SSH in, check logs, done
Learning curve | Steep if not already using K8s | Moderate
Time to production | Longer | Faster

The decision became obvious. Terraform + Ansible gives me infrastructure-as-code discipline without the Kubernetes tax.

Why Both Terraform and Ansible?

Terraform is declarative for cloud resources. Ansible is procedural for server configuration. They complement each other, though there's some overlap.

I could use Ansible alone. If you're comfortable creating servers manually in Hetzner's UI (takes 10 minutes), Ansible handles the rest just fine.

But Terraform adds value when you want to:

  • Tear down and recreate environments reliably
  • Manage DNS, firewalls, private networks, and load balancers as code
  • Maintain multiple environments (staging, production)
  • Review infrastructure changes in pull requests

I'm planning to replicate this environment later, so investing in Terraform now reduces future headaches. If we're doing this, let's do it right.

The Cluster Architecture

After a week of research, here's the architecture I landed on:


Five nodes total:

  • Nodes 1-4: ClickHouse servers (2 shards × 2 replicas)
  • Nodes 1-3: Also run ClickHouse Keeper for consensus
  • Node 5: Dedicated monitoring stack (Prometheus, Grafana, Alertmanager, backups)

Everything lives inside a private network. The only public exposure is through the load balancer with SSL termination.
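
Once the cluster config is deployed, you can sanity-check the topology from any node with a query against system.clusters. A minimal check, assuming the cluster is named 'analytics' as in the table DDL later in this post:

-- Confirm ClickHouse sees 2 shards x 2 replicas in the 'analytics' cluster
SELECT shard_num, replica_num, host_name, is_local
FROM system.clusters
WHERE cluster = 'analytics'
ORDER BY shard_num, replica_num;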

SSH Access

SSH (port 22) is restricted to your IP only. Never expose SSH to the public internet on production infrastructure.

Understanding ClickHouse Keeper and Quorum

Info

ClickHouse Keeper is the modern replacement for ZooKeeper, built directly into ClickHouse. It uses the Raft consensus algorithm (ZooKeeper uses its own protocol, ZAB) and eliminates the need to run a separate Java-based ZooKeeper cluster.

Raft only makes progress when a quorum (a majority of the Keeper nodes) agrees on each decision.

Here's the math:

Nodes | Quorum | Survives N failures
1 | 1 | 0 (no fault tolerance)
2 | 2 | 0 (both needed, pointless)
3 | 2 | 1 node can fail ✓
4 | 3 | 1 node can fail (same as 3)
5 | 3 | 2 nodes can fail

Why 3 Keeper Nodes?

Three is the minimum for fault tolerance (it survives one failure), and an odd number avoids split-brain scenarios. Raft's quorum is floor(N/2) + 1, so a fourth node still needs three for quorum and still tolerates only one failure, no better than three. For my scale, five Keeper nodes would be overkill.

What Keeper actually does:

  • Coordinates which replica is the leader for writes
  • Tracks replication state across nodes
  • Manages distributed DDL operations (CREATE/ALTER TABLE)
  • Stores merge scheduling metadata
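
To confirm a node can actually reach the Keeper quorum, you can query the system.zookeeper table (the name is historical; it is served by Keeper too). A minimal check:

-- Lists znodes under the coordination root; errors out if no quorum is reachable
SELECT name
FROM system.zookeeper
WHERE path = '/';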

Sharding vs Replication: What's the Difference?

This confused me at first. Let me break it down simply.


Sharding = Horizontal Scaling (Split Your Data)

Sharding distributes data across multiple servers. Each shard holds a subset of the total data, partitioned by a shard key (like user_id, tenant_id, or date).

When to add shards:

  • Single node can't hold all your data
  • Query performance degrades due to data volume
  • Write throughput exceeds single node capacity

Replication = High Availability (Copy Your Data)

Replication creates identical copies of each shard on multiple nodes. Writes go to any replica, then sync automatically. If one replica dies, others keep serving.

Benefits:

  • Zero data loss if a node fails
  • Read scaling (distribute reads across replicas)
  • Online maintenance (take one replica offline without downtime)
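
On a running cluster, replication health is visible through the system.replicas table. A minimal health check (the exact column set varies slightly between ClickHouse versions):

-- Per-table replication status on this node
SELECT database, table, is_leader, is_readonly, absolute_delay, queue_size
FROM system.replicas;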

My Setup: 2 Shards × 2 Replicas

Here's what this configuration provides:

  • Can lose any 1 node without data loss
  • Can lose 1 node per shard and stay online
  • 2x write throughput (parallel writes to shards)
  • 4x read throughput (all 4 nodes serve reads)
  • 2x storage capacity (data split across shards)
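
Each node identifies itself through macros, at minimum {shard} and {replica}, which the table DDL in the next section relies on. To see what a given node will substitute:

-- Per-node macro values (e.g. shard=1, replica=2), set in that node's config
SELECT macro, substitution
FROM system.macros;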

How Distributed Queries Work

The table setup uses two layers:

-- Local table: lives on each shard, holds that shard's data only
CREATE TABLE events ON CLUSTER 'analytics' (
    tenant_id UInt32,
    event_time DateTime,
    event_type String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
PARTITION BY toYYYYMM(event_time)
ORDER BY (tenant_id, event_time);

-- Distributed table: virtual table that queries ALL shards
CREATE TABLE events_distributed ON CLUSTER 'analytics' AS events
ENGINE = Distributed('analytics', 'default', 'events', tenant_id);

When you query events_distributed, here's what happens:

  1. Query hits any node through the load balancer
  2. That node fans out to one replica per shard
  3. Each shard processes its portion in parallel
  4. Results merge and return to client

The client doesn't need to know anything about shards or replicas. It just queries events_distributed and ClickHouse handles the rest.
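
For example, a routine analytics query against the schema above (tenant 42 is just a placeholder value) looks no different from a single-node query:

-- Daily event counts for one tenant; the fan-out to both shards is transparent
SELECT
    toDate(event_time) AS day,
    event_type,
    count() AS events
FROM events_distributed
WHERE tenant_id = 42
GROUP BY day, event_type
ORDER BY day, event_type;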

Key Takeaways for Self-Hosted ClickHouse

If you're considering self-hosted ClickHouse:

  1. Skip Kubernetes at small scale. Terraform + Ansible gives you infrastructure-as-code without the overhead.
  2. Use 3 Keeper nodes. It's the minimum for fault tolerance, and 4 nodes doesn't improve anything.
  3. Understand the difference between shards and replicas. Shards split data for scale. Replicas copy data for availability.
  4. Start with 2 shards × 2 replicas. It's a solid foundation that survives node failures and scales reads/writes.
  5. Keep everything in a private network. Expose only what you need through a load balancer.

The whole setup took about a week of learning and iteration. Most of that time was understanding concepts, not writing code. Once the architecture clicked, the Terraform and Ansible configs came together quickly.


Next up: Building the JavaScript SDK for client-side event tracking and batching.

