Securing Multi-Cloud Kubernetes: Talos, KubeSpan, and Tailscale
Deploy a production-ready multi-cloud Kubernetes cluster using Talos OS kexec hot-swap, KubeSpan encrypted mesh, and Tailscale-secured management.
Krishna C · September 23, 2025 · 5 min read
Running Kubernetes across different cloud providers usually means dealing with incompatible networks, manual OS installations, and exposed management APIs. You need a way to connect nodes securely, encrypt pod traffic across clouds, and lock down administrative access—all without drowning in VPN complexity.
Here's how to do it right.
The Stack
| Component | Purpose |
|---|---|
| Talos OS | Immutable Kubernetes OS deployed via kexec hot-swap |
| KubeSpan | WireGuard mesh for encrypted pod-to-pod communication |
| Flannel | Lightweight CNI that works everywhere |
| Tailscale | Zero-trust network access for cluster management |
| Traefik | Gateway API controller for public traffic |
Infrastructure managed with OpenTofu and Terragrunt, using Supabase PostgreSQL for remote state.
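As a sketch of how that remote state can be wired up (the connection string and schema name here are placeholders, not the author's actual values), OpenTofu's built-in `pg` backend points straight at a Supabase Postgres database:

```hcl
# terragrunt.hcl — hypothetical sketch of remote state in Supabase PostgreSQL
# via OpenTofu's "pg" backend. conn_str and schema_name are assumptions.
remote_state {
  backend = "pg"
  config = {
    conn_str    = "postgres://user:pass@db.project-ref.supabase.co:5432/postgres"
    schema_name = "tofu_state"
  }
}
```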
Network Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                           Internet                              │
└────────────┬────────────────────────────────────┬───────────────┘
             │                                    │
     ┌───────▼─────────┐                  ┌───────▼────────┐
     │ VPS Provider A  │                  │ VPS Provider B │
     │  (Public IPs)   │                  │  (Public IPs)  │
     └───────┬─────────┘                  └───────┬────────┘
             │                                    │
     ┌───────▼────────────────────────────────────▼───────┐
     │         KubeSpan WireGuard Mesh (51820)            │
     │       Encrypted pod-to-pod across all nodes        │
     │       10.244.0.0/16 pod network via Flannel        │
     └───────┬────────────────────────────────────────────┘
             │
     ┌───────▼────────────────────────────────────────────┐
     │   Kubernetes Cluster (Control Plane + Workers)     │
     │   - DNS Round-Robin to CP nodes (6443)             │
     │   - Traefik on host network (80/443)               │
     │   - Tailscale operator for internal routes         │
     └───────┬────────────────────────────────────────────┘
             │
     ┌───────▼────────────────────────────────────────────┐
     │          Tailscale Management Network              │
     │   API access (50000, 6443) restricted to:          │
     │   - Tailscale network (100.64.0.0/10)              │
     │   - Firewall blocks public API access              │
     └────────────────────────────────────────────────────┘

Public Traffic: Internet → Public IP:80/443 → Traefik → Pods
Management:     Admin → Tailscale VPN → CP Tailscale IP:6443 → API
Pod-to-Pod:     Pod A → Flannel → KubeSpan → Internet → KubeSpan → Pod B
```
Talos OS: Hot-Swap Any VPS
Talos is an immutable, API-only OS built for Kubernetes. No SSH, no shell, no package manager. The killer feature: deploy via kexec without reinstalling the OS.
Your VPS boots whatever it came with. SSH in once, run a deployment script, and minutes later you're running Talos. The OS hot-swaps itself while running.
Works on any provider—Hetzner, DigitalOcean, Vultr. The deployment script auto-detects network configuration and handles the kexec boot.
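The core of that hot-swap fits in a few commands. Here's a dry-run sketch (commands are printed, not executed); the release version, asset names, and kernel arguments are my assumptions, not the author's actual script:

```sh
#!/bin/sh
# Dry-run sketch of the kexec hot-swap. Swap the echo for "$@" to run it.
TALOS_VERSION="v1.8.0"
ASSETS="https://github.com/siderolabs/talos/releases/download/${TALOS_VERSION}"

run() { echo "+ $*"; }

# Fetch the Talos kernel and initramfs onto the still-running stock VPS
run curl -fLO "${ASSETS}/vmlinuz-amd64"
run curl -fLO "${ASSETS}/initramfs-amd64.xz"

# Stage the Talos kernel; talos.platform=metal treats the VPS as bare metal
run kexec -l vmlinuz-amd64 --initrd=initramfs-amd64.xz \
  --command-line="talos.platform=metal console=ttyS0"

# Jump straight into Talos without a firmware reboot
run kexec -e
```

The dry-run wrapper is deliberate: on a real box, `kexec -e` replaces the running kernel immediately, so you want to see the staged command line before pulling the trigger.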
The infrastructure code configures:
- LUKS2 full disk encryption (state and ephemeral partitions)
- Tailscale system extension for management access
- KubeSpan WireGuard mesh for pod networking
- Flannel CNI overlay
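A sketch of the machine-config fragment behind the first and third items, assuming Talos's standard `nodeID`-keyed LUKS2 settings (the Tailscale system extension itself is baked into the install image rather than configured here):

```yaml
machine:
  systemDiskEncryption:
    state:
      provider: luks2
      keys:
        - nodeID: {}   # key derived from the node's own identity
          slot: 0
    ephemeral:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
  network:
    kubespan:
      enabled: true
```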
KubeSpan: Encrypted Pod Networking
KubeSpan is Talos's built-in WireGuard mesh connecting all cluster nodes. It creates an encrypted overlay for pod-to-pod communication across clouds.
Configuration is minimal:
```yaml
machine:
  network:
    kubespan:
      enabled: true
cluster:
  discovery:
    enabled: true
```
Nodes discover each other automatically and establish WireGuard tunnels. All pod traffic flows encrypted, even crossing the public internet between providers.
Flannel provides the CNI layer with VXLAN overlay on top of KubeSpan. The pod subnet (10.244.0.0/16) works seamlessly across all nodes regardless of location.
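For reference, the Flannel side of that is equally small. Talos manages its default Flannel deployment itself, but the backend configuration amounts to a ConfigMap along these lines (names here are the conventional kube-flannel ones, assumed rather than taken from this cluster):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": { "Type": "vxlan" }
    }
```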
Why KubeSpan Over Tailscale for Pod Traffic?
I initially tried running pod networking over Tailscale IPs. Problems:
| Issue | Details |
|---|---|
| MTU Limitations | Tailscale's 1280 MTU causes fragmentation with standard pod traffic |
| User-Space Overhead | Tailscale runs in user space, adding latency. KubeSpan uses kernel-space WireGuard |
| Routing Complexity | Pod subnet routing through Tailscale requires additional configuration |
| Network Stability | Different providers have varying configs—KubeSpan handles this transparently |
For management access (talosctl, kubectl), Tailscale is perfect. For pod networking at scale, KubeSpan's kernel-level mesh is the right tool.
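The MTU point is worth making concrete. A rough byte budget (overhead figures are approximate, assuming IPv4 and standard header sizes) shows why VXLAN pod traffic fits inside KubeSpan's tunnel but fragments inside Tailscale's:

```sh
#!/bin/sh
# Rough MTU budget for VXLAN pod traffic over each tunnel (IPv4, approximate).
ETHERNET_MTU=1500
TAILSCALE_MTU=1280      # Tailscale's default tunnel MTU
WIREGUARD_OVERHEAD=60   # outer IPv4 + UDP + WireGuard framing
VXLAN_OVERHEAD=50       # outer Ethernet + IPv4 + UDP + VXLAN headers

# KubeSpan: kernel WireGuard rides the provider's 1500-byte path,
# and Flannel's VXLAN rides inside that.
KUBESPAN_POD_MTU=$((ETHERNET_MTU - WIREGUARD_OVERHEAD - VXLAN_OVERHEAD))

# Tailscale: the tunnel is already clamped to 1280,
# so VXLAN leaves even less for the pod payload.
TAILSCALE_POD_MTU=$((TAILSCALE_MTU - VXLAN_OVERHEAD))

echo "KubeSpan path MTU:  ${KUBESPAN_POD_MTU}"   # 1390
echo "Tailscale path MTU: ${TAILSCALE_POD_MTU}"  # 1230
```

Any pod packet larger than ~1230 bytes fragments on the Tailscale path, while the KubeSpan path leaves ~1390 bytes of headroom.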
Why Flannel Over Cilium?
I spent considerable time trying to make Cilium work as the CNI, L2 LoadBalancer, and Gateway API provider. The promise of an all-in-one solution was attractive.
The reality: debugging network issues across different cloud providers became a time sink. Each provider has different configurations—OVH uses /32 point-to-point, Hetzner uses standard subnets, some have strict MAC filtering.
Cilium's L2 announcements and socket-based load balancing kept breaking in subtle ways. I spent days troubleshooting why pods couldn't reach services on one provider while the same manifests worked fine on another.
Flannel keeps it simple. VXLAN overlay, standard configuration, works the same everywhere. Combined with KubeSpan's encrypted mesh, it provides reliable networking across any VPS provider.
Sometimes boring technology wins.
Tailscale: Securing Management Access
Nodes join the Tailscale network as devices, but only for management access.
After Talos deploys, a post-deployment script:
- Waits for Tailscale to initialize on all nodes
- Applies firewall rules blocking API ports (50000, 6443) from public internet
- Allows these ports only from Tailscale network (100.64.0.0/10)
- Updates kubeconfig and talosconfig to use Tailscale endpoints
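The last step boils down to pointing both CLIs at Tailscale addresses. A dry-run sketch (commands are printed, not executed); the Tailscale IP and cluster name are illustrative placeholders:

```sh
#!/bin/sh
# Dry-run sketch of the endpoint switch. Swap the echo for "$@" to run it.
TS_IP="100.64.0.12"   # assumed example address from the 100.64.0.0/10 range

run() { echo "+ $*"; }

# Point talosctl at the node's Tailscale address instead of its public IP
run talosctl config endpoint "${TS_IP}"
run talosctl config node "${TS_IP}"

# Rewrite the kubeconfig server entry the same way
run kubectl config set-cluster my-cluster \
  --server="https://${TS_IP}:6443"
```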
Control plane firewall configuration:
```yaml
machine:
  network:
    firewall:
      defaultAction: block
      rules:
        - protocol: udp
          port: 51820
          ingress: allow
        - protocol: tcp
          port: 50000
          sources: ["100.64.0.0/10"]
          ingress: allow
        - protocol: tcp
          port: 6443
          sources: ["100.64.0.0/10"]
          ingress: allow
```
Now kubectl and talosctl only work through Tailscale:
```sh
kubectl cluster-info           # Uses Tailscale endpoints
talosctl version               # Uses Tailscale endpoints

curl https://public-ip:6443    # Connection refused
```
Public Ingress with Traefik Gateway API
Traefik runs on host network mode to accept public traffic on ports 80 and 443. It implements Kubernetes Gateway API instead of traditional Ingress.
Gateway API provides:
- Better separation between infrastructure and routing configuration
- More expressive routing rules (header matching, path rewrites)
- Cleaner multi-tenant support
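To make the split concrete, here's a minimal sketch of the two halves, assuming Traefik's default `traefik` GatewayClass (hostnames, names, and ports are placeholders): the Gateway is infrastructure, the HTTPRoute is per-app routing.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: public
  namespace: traefik
spec:
  gatewayClassName: traefik
  listeners:
    - name: web
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app
spec:
  parentRefs:
    - name: public
      namespace: traefik
  hostnames: ["app.example.com"]
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: app-svc
          port: 8080
```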
Host network means direct binding to public IPs—no NodePort, no LoadBalancer services, no extra NAT layer.
What This Gives You
Security by Default
- Disk encryption with LUKS2
- Pod traffic encrypted via KubeSpan WireGuard
- Management APIs locked to Tailscale VPN only
Multi-Cloud Freedom
- Deploy on any VPS provider
- KubeSpan connects nodes across clouds transparently
- Add capacity anywhere in minutes
Operational Simplicity
- Infrastructure as code with OpenTofu
- GitOps for all applications via ArgoCD
- DNS round-robin for control plane HA
Cost Efficiency
- Use cheap VPS instances from any provider
- No expensive load balancers or cloud networking services
In Practice
When you need more capacity, provision a VPS anywhere, run the deployment script, and it joins automatically:
- OpenTofu hot-swaps the OS to Talos via kexec
- Node joins KubeSpan mesh and gets encrypted pod networking
- Joins Tailscale for management access
- Firewall rules lock down public API access
- Ready to serve traffic in under 10 minutes
No VPN certificates to manage, no complex networking configs, no exposed management APIs. Just secure, simple, multi-cloud Kubernetes.
---
Interested in learning more? Reach out at [email protected]