Kubernetes Networking

Pods

Each pod gets its own IP address. Containers within a pod share its network namespace, so they communicate with each other over the loopback interface (localhost), while pods communicate with each other using their assigned IP addresses.

By default, all pods can communicate with each other across the cluster.
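As a sketch of the shared-namespace model, here is a hypothetical two-container pod (names and images are illustrative): the sidecar reaches the web server on localhost because both containers share the pod's network namespace.

```yaml
# Illustrative pod: the sidecar can reach the web container on
# localhost:80 because they share the pod's network namespace.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
    - name: sidecar
      image: curlimages/curl:8.9.1
      # Polls the web container over the shared loopback interface.
      command: ["sh", "-c", "while true; do curl -s http://localhost:80 >/dev/null; sleep 5; done"]
```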

How is this implemented? How do pods get their IP addresses, virtual network interfaces, and network namespaces?

Container Networking Interface (CNI)

This is handled by the CNI: a specification for plugins that provide networking to containers. When Kubernetes creates a pod, the container runtime invokes the configured CNI plugin, which configures (virtual) network interfaces, assigns IP addresses, and sets up routes.

Some examples of CNI plugins:

  • Flannel - easy/lightweight, but doesn’t support network policies

  • Cilium - more complex, based on eBPF, but supports network policies and advanced features

  • Calico - supports network policies and advanced features, but can be more complex to set up
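Network policies, which the plugins above differ on supporting, restrict the default all-pods-can-talk behavior. A hedged sketch with illustrative names: only pods labeled app=frontend may reach pods labeled app=backend on TCP 8080.

```yaml
# Illustrative NetworkPolicy: allow ingress to app=backend pods only
# from app=frontend pods on TCP 8080; all other ingress is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```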

eBPF

Extended Berkeley Packet Filter. This is a technology that lets you run sandboxed programs inside the Linux kernel without changing kernel source code or loading kernel modules. Cilium uses eBPF to implement its networking and security features.

It provides a way to extend kernel functionality safely: you supply isolated functions that get triggered by certain events (like a packet arriving at a network interface). This lets Cilium implement features such as load balancing, network policies, and observability without modifying the kernel or relying on iptables.

eBPF programs run at the logical junction between the kernel and hardware devices, so they can make decisions early when receiving input from those devices. The functions you attach to eBPF hooks can also exchange data with running user-space processes (for example, via eBPF maps), so eBPF programs can trigger events in user space and user-space processes can trigger events in eBPF programs.

Some use cases:

  • Network Filtering: eBPF can enforce simple to complex rules very early in the receive path, scoped to individual network namespaces. It can also handle egress filtering for use cases like content filtering and data-loss prevention.

  • Observability: eBPF can see traffic from all devices on the server, and can provide detailed telemetry on that traffic, which is useful for monitoring and debugging.

  • Security: eBPF can be used to implement security features such as intrusion detection, or to have the kernel terminate misbehaving processes (e.g., by sending them a signal from an eBPF program).
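As one concrete example of eBPF-backed filtering, Cilium extends standard network policies with L7 (HTTP-aware) rules via its own CRD. A hedged sketch using Cilium's cilium.io/v2 API, with illustrative labels and paths:

```yaml
# Illustrative CiliumNetworkPolicy: allow only GET /health from pods
# labeled app=probe; enforced in-kernel by eBPF rather than iptables.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-l7-policy
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: probe
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: /health
```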

Gateway API

This is a set of modular APIs for controlling North/South traffic (traffic entering and leaving the cluster), as well as East/West traffic (traffic between services within the cluster, aka service mesh). It is a more flexible and extensible alternative to Ingress.
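A minimal sketch of the two core Gateway API resources, with illustrative names (the gatewayClassName depends on which controller is installed, e.g., a cloud or Envoy-based implementation): a Gateway accepts HTTP traffic, and an HTTPRoute attaches to it and routes a hostname to a backend Service.

```yaml
# Illustrative Gateway + HTTPRoute: route HTTP traffic for example.com
# to the Service "web" on port 80.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: demo-gateway
spec:
  gatewayClassName: example-gateway-class
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-route
spec:
  parentRefs:
    - name: demo-gateway
  hostnames:
    - example.com
  rules:
    - backendRefs:
        - name: web
          port: 80
```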

Services

A Kubernetes Service is an abstraction that exposes an application running on a set of Pods as a network service. It solves the problems of pod ephemerality and load balancing: when pods come and go, they get new IP addresses, which makes it difficult for clients to reach them reliably. A Service provides a stable IP address and DNS name that clients can use to access the service, regardless of the underlying pod IPs.

ClusterIP

This is an internal-only service that is only accessible from within the cluster, and is the default type of service.

This creates an internal, private IP address within the cluster, for communication between services and the backing pods. It cannot be accessed externally.

Traffic goes through a proxy (for example, kube-proxy), which forwards it to the appropriate backing pods of the service based on the service's load-balancing strategy.

Kubernetes assigns a stable, virtual IP to the service, which remains constant regardless of the backing pods' IP addresses. This allows clients to reliably access the service without needing to track changes in the pod IPs. The kube-proxy component maintains a set of network rules (under the hood, using iptables, ipvs, or nftables) to route traffic.

So when a request is made to the service’s ClusterIP, kube-proxy intercepts the traffic and forwards it to one of the backing pods based on the defined load-balancing strategy (e.g., round-robin, least connections).

Kubernetes keeps the service's list of ready pod IPs (its endpoints) up to date as pods are scaled up and down.
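A minimal ClusterIP Service manifest (labels and ports are illustrative): the selector picks the backing pods, and the Service's virtual IP and DNS name stay stable as those pods change.

```yaml
# Illustrative ClusterIP Service: a stable virtual IP and DNS name
# (web.default.svc.cluster.local) in front of pods labeled app=web.
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: default
spec:
  type: ClusterIP     # the default type; shown explicitly here
  selector:
    app: web
  ports:
    - port: 80        # port exposed on the Service's virtual IP
      targetPort: 8080 # port the backing pods listen on
```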

LoadBalancer

This is similar to ClusterIP, but exposes your application externally using a cloud provider’s load balancer. When you create this type of service, Kubernetes tries to use the specified cloud load balancer API to provision a load balancer for your service. This external load balancer directs traffic to the appropriate nodes on a NodePort that Kubernetes opens on each node. From there, kube-proxy routes the traffic to the appropriate backing pods of the service based on the service’s load-balancing strategy. Each LoadBalancer service typically gets its own public IP address, which can be costly if many services are exposed this way.

This is only available if your cluster is running in a supported cloud environment. For on-premises or bare-metal clusters, implementations such as MetalLB (or Cilium's built-in load-balancer support) can provide LoadBalancer services instead.

Some providers support health checks or cross-zone load balancing.

Best for public-facing services that external clients need to reach.
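The manifest differs from ClusterIP only in the type field (names and ports are illustrative); once the cloud provider provisions the load balancer, its address appears under the Service's status.

```yaml
# Illustrative LoadBalancer Service: the cloud provider provisions an
# external load balancer whose address shows up in status.loadBalancer.
apiVersion: v1
kind: Service
metadata:
  name: web-public
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```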

MetalLB

See MetalLB Concepts for more details.

Acts as a LoadBalancer implementation for on-premises or bare-metal Kubernetes clusters.
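A hedged sketch of a common MetalLB layer-2 setup (address range and names are illustrative, using the metallb.io/v1beta1 CRDs): an IPAddressPool defines which local IPs MetalLB may hand out to LoadBalancer services, and an L2Advertisement announces them via ARP.

```yaml
# Illustrative MetalLB config: hand out LoadBalancer IPs from a local
# range and announce them on the LAN via layer 2.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: demo-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: demo-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - demo-pool
```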

NodePort

The NodePort service type exposes the service on a static port on each node’s IP address. Traffic can be sent to the NodePort on any node (every node in the cluster exposes this port), and the request is then forwarded to an appropriate backing pod.

Because no cloud provider load balancer is required, the NodePort type works well for on-premises or bare-metal clusters.

When you create a NodePort service, Kubernetes allocates a port from a predefined range (default is 30000-32767) and opens that port on all nodes in the cluster. The kube-proxy component then routes incoming traffic on that port to the appropriate backing pods of the service based on the service’s load-balancing strategy.
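A minimal NodePort Service (labels and the chosen port are illustrative): if nodePort is omitted, Kubernetes picks one from the 30000-32767 range automatically.

```yaml
# Illustrative NodePort Service: reachable at <any-node-IP>:30080.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80         # ClusterIP port (still created internally)
      targetPort: 8080 # port the backing pods listen on
      nodePort: 30080  # static port opened on every node
```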