K3s No Route to Host

Troubleshooting K3s Network Issues: When Cloudflare Reports an “Invalid SSL Certificate”

It is every developer’s nightmare: you suddenly discover your website is down. Recently, I encountered exactly this scenario. My site was inaccessible, and Cloudflare was throwing an “invalid SSL certificate” error.

At first glance, an SSL error points to a certificate expiration or misconfiguration. However, in proxy setups like Cloudflare, this error can sometimes be a red herring indicating that Cloudflare simply cannot communicate with the origin server. Here is how I tracked down the issue and fixed my Kubernetes cluster.
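
A quick way to separate a genuine certificate problem from a connectivity problem is to bypass Cloudflare and talk to the origin directly. The domain and IP below are placeholders, not values from my setup:

# Resolve the domain to the origin IP, skipping Cloudflare's proxy entirely
# (-k skips certificate validation so only reachability is being tested)
curl -skv --resolve example.com:443:203.0.113.10 https://example.com/ -o /dev/null

If this hangs or cannot connect at all, the problem is reachability, not the certificate.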

The Investigation: Losing Connection to the Master Node

My first instinct was to check the status of my Kubernetes nodes from my local machine. I ran the standard kubectl command, but it failed immediately:

λ kubectl get node
E0426 11:20:25.820157  499941 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://[master-node-ip]:6443/api?timeout=32s\": dial tcp [master-node-ip]:6443: connect: no route to host"

To isolate whether this was a Kubernetes API issue or a general network problem, I tried a simple telnet to the master node’s API port (6443):

λ telnet [master-node-ip] 6443
Trying [master-node-ip]...
telnet: Unable to connect to remote host: No route to host

Since external access was completely blocked, I had to SSH directly into the master node (master-node) to see what was happening on the inside.

Diving into the Cluster

Once inside the master node, the API server was responding locally, but the cluster state was unhealthy. Two of my worker nodes were marked as NotReady:

ubuntu@master-node:~$ sudo kubectl get node
NAME          STATUS     ROLES                  AGE      VERSION
node-alpha    NotReady   <none>                 3y159d   v1.25.3+k3s1
node-beta     NotReady   <none>                 3y159d   v1.28.3+k3s2
master-node   Ready      control-plane,master   3y159d   v1.28.3+k3s2

Describing one of the NotReady nodes revealed that the Kubelet had simply stopped communicating:

NodeStatusUnknown   Kubelet stopped posting node status.
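
For reference, that condition appears under the node’s Conditions when you describe it (node name taken from the listing above):

ubuntu@master-node:~$ sudo kubectl describe node node-alpha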

Checking the core system pods in the kube-system namespace showed widespread failures, with several components stuck in a CrashLoopBackOff state:

ubuntu@master-node:~$ sudo kubectl -n kube-system get pod -w
NAME                                      READY   STATUS             RESTARTS         AGE
metrics-server-84cfb4895b-54chl           0/1     CrashLoopBackOff   459 (111s ago)   39h
local-path-provisioner-84db5d44d9-9ksgk   0/1     CrashLoopBackOff   852 (40s ago)    2y158d

The breakthrough came when I checked the logs for my ingress controller, Traefik. It was constantly throwing network errors when trying to reach the internal Kubernetes API:

ubuntu@master-node:~$ sudo kubectl -n kube-system logs traefik-5c5745cdcd-j6wxm
time="2026-04-26T03:39:19Z" level=info msg="Configuration loaded from flags."
W0426 03:39:19.706944       1 reflector.go:424] k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1alpha1.IngressRouteUDP: Get "https://10.43.0.1:443/apis/traefik.containo.us/v1alpha1/ingressrouteudps?limit=500&resourceVersion=0": dial tcp 10.43.0.1:443: connect: no route to host

Identifying the Root Cause

Traefik couldn’t route to 10.43.0.1 (the ClusterIP that pods use to reach the Kubernetes API from inside the cluster). I ran sudo k3s check-config to verify the K3s installation, but all checks passed. This pointed to a lower-level networking issue rather than a broken K3s install.
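
That address belongs to the default kubernetes Service. If you want to reproduce the failure outside of Traefik, a throwaway pod works as a probe; the pod name and image below are arbitrary examples, not something the cluster ships with:

# Confirm what 10.43.0.1 is: the ClusterIP of the default kubernetes Service
sudo kubectl get svc kubernetes

# Try the same endpoint Traefik keeps failing to reach, from a disposable pod.
# Any HTTP response (even 401/403) proves connectivity; "no route to host"
# reproduces the failure.
sudo kubectl run net-probe --rm -it --restart=Never --image=curlimages/curl \
  --command -- curl -sk -m 5 https://10.43.0.1:443/version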

I decided to inspect the server’s firewall rules using sudo iptables -L -n. Scanning through the output, I found the culprits at the very end of both the INPUT and FORWARD chains:

REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
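
To see exactly where these rules sit in each chain, and to get rule numbers to work with, add --line-numbers:

# List each chain with rule numbers; the REJECT rules show up at the bottom
sudo iptables -L INPUT -n --line-numbers
sudo iptables -L FORWARD -n --line-numbers

(A rule number also gives you an alternative way to delete a rule later, e.g. sudo iptables -D INPUT <number>, instead of matching the full rule specification.)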

The Problem: These rules are default security configurations that come pre-packaged with Oracle Cloud Ubuntu images. Because K3s relies heavily on iptables for traffic forwarding and inter-container communication (including ClusterIP services), these trailing REJECT rules were rejecting any necessary traffic that didn’t match one of the earlier, dynamically managed rules. And since the rejection uses icmp-host-prohibited rather than silently dropping packets, clients fail with exactly the “no route to host” errors seen throughout the logs above. Two blocks in particular caused the outage (the packet-counter check after the list is one way to confirm these rules are the culprits):

  1. The INPUT Block: It rejected inbound connections that didn’t match an earlier rule. Cloudflare could no longer reach the origin on ports 80 and 443, which produced the misleading “invalid SSL certificate” error on the frontend, and the worker kubelets could no longer reach the API server on port 6443, which is why they stopped posting status and went NotReady.
  2. The FORWARD Block: It rejected forwarded pod traffic, so pods could no longer talk to the API Server’s ClusterIP (10.43.0.1). This internal routing failure is what caused Traefik to log no route to host and left the kube-system pods stuck in CrashLoopBackOff.
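
If you want to confirm that these trailing rules are the ones eating the traffic, the -v flag adds per-rule packet and byte counters; re-running the command and watching the REJECT counters climb makes it obvious:

# Per-rule packet/byte counters; the REJECT counters grow as traffic is refused
sudo iptables -L INPUT -n -v
sudo iptables -L FORWARD -n -v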

The Solution

The fix was straightforward. I manually deleted the restrictive rules from the server’s iptables:

# Remove the reject rule from the INPUT chain
sudo iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited

# Remove the reject rule from the FORWARD chain
sudo iptables -D FORWARD -j REJECT --reject-with icmp-host-prohibited

Almost instantly after running these two commands, Traefik stopped throwing errors, the worker nodes reported back as Ready, and Cloudflare successfully reconnected to the origin server. All services were fully recovered.
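
One caveat: iptables -D only changes the running ruleset. Oracle Cloud’s Ubuntu images typically load their default rules at boot from /etc/iptables/rules.v4 via the iptables-persistent/netfilter-persistent package, so if that is how your instance is set up, the REJECT rules will reappear after a reboot unless you also save the cleaned-up ruleset:

# Persist the current rules if netfilter-persistent is installed
sudo netfilter-persistent save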

If you are running K3s on an Oracle Cloud Ubuntu instance and suddenly experience mysterious network drops or Cloudflare errors, checking your iptables for these default reject rules is a great place to start!
