Calico Zero Trust Networking on Platform9


What Is Zero Trust Networking and Why is it Important?

What is a Zero Trust Network? Zero Trust Networking is an approach to network security built on the assumption that the network, including a company's internal network, is always a hostile place. This contrasts with the older perimeter model of DMZ and non-DMZ networks, an approach that focuses on separating the world into trusted and untrusted network segments.

Why is the network a hostile place? In many attack scenarios, it is.

  • Attackers may compromise “trusted” parts of your network infrastructure: routers, switches, links, etc. The attacker could be an insider or an outsider.
  • A misconfiguration applied to a device can route sensitive traffic over untrusted networks, like the public Internet.
  • One application could be compromised and become the starting point for escalating privileges within the network. The compromised application may share a network with other servers or containers, putting critical company assets at risk.

Introduction

The following document describes how to enable Zero Trust Networking in a Platform9 cluster running Calico as the CNI. We will cover different scenarios in which the network needs to be protected.

Runtime Environment

The environment where this was tested involves the following:

  • Cluster version: PMK 1.20
  • DU version: 5.4
  • Calico Client version: 3.18.1
  • Calico Cluster version: 3.18.1
  • MetalLB version: 0.9.6

Applications

Deploying Test Applications

Let’s create an NGINX deployment and a BusyBox pod within the ‘advanced-policy-demo’ namespace. Then we will validate that the pods are able to talk to each other.

kubectl create ns advanced-policy-demo
kubectl create deployment --namespace=advanced-policy-demo nginx --image=nginx
kubectl expose --namespace=advanced-policy-demo deployment nginx --port=80
kubectl run --namespace=advanced-policy-demo access --rm -ti --image busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
/ # wget -q --timeout=5 google.com -O -

Closing Down All App’s Communications In Your Kubernetes Cluster

Locking Down The Cluster

Let’s apply the following GlobalNetworkPolicy. It is applied at the cluster level, which means it applies to all namespaces except the ones excluded in the manifest. The only traffic it explicitly allows is egress UDP traffic to the DNS pods on port 53. Go ahead and apply the manifest, then confirm that the policy is in place.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-app-policy
spec:
  namespaceSelector: has(projectcalico.org/name) && projectcalico.org/name not in {"kube-system", "calico-system", "kube-node-lease", "kube-public", "kubernetes-dashboard", "metallb-system", "pf9-addons", "pf9-monitoring", "pf9-olm", "pf9-operators", "platform9-system"}
  types:
  - Ingress
  - Egress
  egress:
  # allow all namespaces to communicate to DNS pods
  - action: Allow
    protocol: UDP
    destination:
      selector: 'k8s-app == "kube-dns"'
      ports:
      - 53
calicoctl apply -f global-default-deny.yaml
calicoctl get globalnetworkpolicies
NAME
deny-app-policy
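
To make the namespaceSelector in the manifest above easier to reason about, here is a small Python sketch of the same expression. It only illustrates the selector logic and is not how Calico evaluates selectors internally. The function name is_locked_down is made up for this example, and the sketch assumes every namespace carries a projectcalico.org/name label set to its own name.

```python
# Namespaces excluded by deny-app-policy, copied from the manifest above.
EXCLUDED_NAMESPACES = {
    "kube-system", "calico-system", "kube-node-lease", "kube-public",
    "kubernetes-dashboard", "metallb-system", "pf9-addons", "pf9-monitoring",
    "pf9-olm", "pf9-operators", "platform9-system",
}

LABEL = "projectcalico.org/name"


def is_locked_down(namespace_labels: dict) -> bool:
    """Mimic: has(projectcalico.org/name) && projectcalico.org/name not in {...}."""
    name = namespace_labels.get(LABEL)
    return name is not None and name not in EXCLUDED_NAMESPACES


# Hypothetical helper usage: which namespaces does the policy select?
for ns in ("advanced-policy-demo", "kube-system", "metallb-system"):
    print(ns, is_locked_down({LABEL: ns}))
```

Under these assumptions only advanced-policy-demo is selected, which matches the behaviour we expect: application namespaces are locked down while the system namespaces keep working.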

Testing Communication Between Your Applications

Let’s try to reach the NGINX pod from the busybox pod. Confirm that the communication is blocked.

Failing To Communicate

A timeout here indicates that the application is being targeted by our GlobalNetworkPolicy and that the policy is properly applied and working.

kubectl run --namespace=advanced-policy-demo access --rm -ti --image busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
wget: download timed out
/ # wget -q --timeout=5 google.com -O -
wget: download timed out

Opening Communication Between Your Applications

In order to start opening communication flows, we need to understand in which direction (Ingress/Egress) traffic should flow. We want to create a policy that is open, but only to the extent that is necessary.

Enable Egress Traffic From The BusyBox Pod

Now we will apply the following NetworkPolicy allow-busybox-egress in the advanced-policy-demo namespace.

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-busybox-egress
  namespace: advanced-policy-demo
spec:
  selector: run == 'access'
  types:
  - Egress
  egress:
  - action: Allow
calicoctl apply -f allow-busybox-egress.yaml

Next, verify that the BusyBox pod is able to reach the internet, but not the NGINX pod.

kubectl run --namespace=advanced-policy-demo access --rm -ti --image busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 google.com -O -
.................SNIPPET.......................
...............................................
</script> </body></html>
/ # wget -q --timeout=5 nginx -O -
wget: download timed out

Enable Ingress Traffic To The Nginx Pods

Apply the following NetworkPolicy, allow-nginx-ingress.yaml, in the advanced-policy-demo namespace to allow ingress traffic to the NGINX pods. Then we can test traffic from our BusyBox pod.

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-nginx-ingress
  namespace: advanced-policy-demo
spec:
  selector: app == 'nginx'
  types:
  - Ingress
  ingress:
  - action: Allow
    source:
      selector: run == 'access'
calicoctl apply -f allow-nginx-ingress.yaml

The BusyBox pod should now be able to reach the NGINX pod. Test this out on your cluster with the commands below.

kubectl run --namespace=advanced-policy-demo access --rm -ti --image busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Exposing Application Over NodePort Service

As you may know, exposing applications over NodePort is a common use case for on-premises Kubernetes Clusters when a LoadBalancer service is not available.

Patch ClusterIP Service and Expose NGINX Pod

Let’s patch the service of our application:

kubectl get svc -o wide -n advanced-policy-demo
NAME    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE   SELECTOR
nginx   ClusterIP   10.21.250.63   <none>        80/TCP    17h   app=nginx
kubectl patch svc nginx -n advanced-policy-demo --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'
service/nginx patched
kubectl get svc -o wide -n advanced-policy-demo
NAME    TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE   SELECTOR
nginx   NodePort   10.21.250.63   <none>        80:32127/TCP   17h   app=nginx

Unable to reach NodePort From External Source

When we try to reach the NodePort service on port 32127 from an external source, the traffic should be blocked.

curl http://10.128.147.68:32127
curl: (52) Empty reply from server

Allow cluster ingress traffic but deny general ingress traffic

In the following example, we create a global network policy (allow-cluster-internal-ingress-only) that allows cluster ingress traffic from the nodes’ IP addresses (10.128.146.0/23) and from the pod IP addresses assigned by Kubernetes (10.20.0.0/16), and denies everything else. Because the policy sets the preDNAT field, Calico applies it before the regular DNAT performed on the Kubernetes cluster. For this global policy to apply, you will need to label the nodes; in this case we label them with nodeport-external-ingress=true.

kubectl label node 10.128.146.177 nodeport-external-ingress=true
kubectl label node 10.128.147.102 nodeport-external-ingress=true
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-cluster-internal-ingress-only
spec:
  order: 20
  preDNAT: true
  applyOnForward: true
  ingress:
    - action: Allow
      source:
        nets: [10.128.146.0/23, 10.20.0.0/16]
    - action: Deny
  selector: has(nodeport-external-ingress)
calicoctl apply -f allow-cluster-internal-ingress-only.yaml
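
The rules inside a Calico policy are evaluated in order, and the first matching Allow or Deny rule decides the outcome. The following Python sketch mimics that behaviour for the two ingress rules of the policy above. It looks only at the source address and ignores every other policy, the failsafe rules and the host endpoint selector, so treat it as an aid for reasoning rather than a model of Felix. The function name decide and the external test address 192.0.2.10 are assumptions made for this example.

```python
import ipaddress

# Ingress rules of allow-cluster-internal-ingress-only, in manifest order.
# A rule with nets=None has no source match, so it applies to every packet.
RULES = [
    ("Allow", ["10.128.146.0/23", "10.20.0.0/16"]),  # node and pod networks
    ("Deny", None),                                   # everything else
]


def decide(source_ip: str) -> str:
    """Return the action of the first rule whose source nets match."""
    addr = ipaddress.ip_address(source_ip)
    for action, nets in RULES:
        if nets is None or any(addr in ipaddress.ip_network(n) for n in nets):
            return action
    return "NoMatch"


print(decide("10.128.147.102"))  # a node address inside 10.128.146.0/23
print(decide("10.20.112.90"))    # a pod address inside 10.20.0.0/16
print(decide("192.0.2.10"))      # hypothetical external client
```

The node and pod addresses hit the Allow rule, while the external address falls through to the final Deny. This is why external clients need their own explicit rule before they can reach a NodePort.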

Allow localhost Egress Traffic

We also need a global network policy to allow egress traffic through each node’s external interface. Otherwise, when we define host endpoints for those interfaces, no egress traffic will be allowed from local processes (except for traffic that is allowed by the failsafe rules).

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-outbound-external
spec:
  order: 10
  egress:
    - action: Allow
  selector: has(nodeport-external-ingress)
calicoctl apply -f allow-outbound-external.yaml

Create host endpoints with appropriate network policy

In this example, we assume that you have already defined Calico host endpoints with a network policy that is appropriate for the cluster. (For example, you wouldn’t want a host endpoint with a “default deny all traffic to/from this host” network policy because that is counter to the goal of allowing/denying specific traffic.) For help, see host endpoints.

All of our previously-defined global network policies have a selector that makes them applicable to any endpoint with a nodeport-external-ingress label; so we will include that label in our definitions.

Let’s create the following HostEndpoint objects. We will describe each of the nodes we want to receive ingress traffic in the manifest below. In this example we are defining two workers.

apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: 10.128.146.177-eth0
  labels:
    nodeport-external-ingress: "true"
spec:
  interfaceName: eth0
  node: 10.128.146.177
  expectedIPs:
  - 10.128.146.177
---
apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: 10.128.147.102-eth0
  labels:
    nodeport-external-ingress: "true"
spec:
  interfaceName: eth0
  node: 10.128.147.102
  expectedIPs:
  - 10.128.147.102
calicoctl apply -f host-endpoints.yaml

Allow ingress traffic to specific node ports

Now we can allow external access to the node ports by creating a global network policy with the preDNAT field. In this example, ingress traffic is allowed for any host endpoint with port: 32127.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-nodeport
spec:
  preDNAT: true
  applyOnForward: true
  order: 10
  ingress:
    - action: Allow
      protocol: TCP
      destination:
        selector: has(nodeport-external-ingress)
        ports: [32127]
  selector: has(nodeport-external-ingress)
calicoctl apply -f allow-nodeport.yaml

Let’s test connectivity

curl  http://10.128.147.102:32127 --max-time 5
curl: (28) Operation timed out after 5004 milliseconds with 0 bytes received
curl  http://10.128.146.177:32127 --max-time 5
curl: (28) Operation timed out after 5006 milliseconds with 0 bytes received

Identifying The Traffic That Needs To Be Allowed

The test above failed because we have only allowed connectivity to the HostEndpoints over port 32127. We have not yet dealt with the application’s ingress policy for traffic that arrives through a NodePort. How does that traffic look inside the Kubernetes stack? Let’s check that out in this section.

Let’s start by ruling a few things out: first, whether the traffic arrives at the node’s interface when we try to reach the port, and then how that traffic is seen. We have identified that the pod of our application is hosted on worker node 10.128.147.102, so let’s sniff for traffic involving port 32127.

kubectl get pods -A -o wide
NAMESPACE              NAME                                         READY   STATUS      RESTARTS   AGE     IP               NODE             NOMINATED NODE   READINESS GATES
advanced-policy-demo   nginx-6799fc88d8-zp4zj                       1/1     Running     0          19h     10.20.112.90     10.128.147.102   <none>           <none>

Let’s ssh into worker node 10.128.147.102 and start sniffing eth0 for port 32127

tcpdump -i eth0 port 32127 -nn -vv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:13:36.276053 IP (tos 0x0, ttl 61, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.7.0.99.61237 > 10.128.147.102.32127: Flags [S], cksum 0xe4bd (correct), seq 2905843726, win 65535, options [mss 1240,nop,wscale 6,nop,nop,TS val 3518437147 ecr 0,sackOK,eol], length 0

Based on the traffic capture, we can identify that the source IP originating the traffic is 10.7.0.99.

Let’s update our previous NGINX ingress rule to accept ingress traffic from that source IP.

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-nginx-ingress
  namespace: advanced-policy-demo
spec:
  selector: app == 'nginx'
  types:
  - Ingress
  ingress:
  - action: Allow
    source:
      selector: run == 'access'
  - action: Allow
    source:
      nets:
      - 10.7.0.99/32
calicoctl apply -f allow-nginx-ingress.yaml

Let’s try again to reach our NGINX application over the NodePort.

curl  http://10.128.147.102:32127 --max-time 5
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

It is working! Now let’s try to reach it through the other worker node. This should work since our service has ExternalTrafficPolicy: Cluster.

curl  http://10.128.146.177:32127 --max-time 5
curl: (28) Operation timed out after 5001 milliseconds with 0 bytes received

What happened, why isn’t it working? Let’s troubleshoot that in our next section.

Troubleshooting Exposed Nginx Applications

In order to identify why the traffic is being dropped from the second worker, we need to enable log rules.

Let’s update our default-deny GlobalNetworkPolicy so we can identify how the traffic is arriving at worker 10.128.147.102.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-app-policy
spec:
  namespaceSelector: has(projectcalico.org/name) && projectcalico.org/name not in {"kyverno", "kube-system", "calico-system", "kube-node-lease", "kube-public", "kubernetes-dashboard", "metallb-system", "pf9-addons", "pf9-monitoring", "pf9-olm", "pf9-operators", "platform9-system"}
  types:
  - Ingress
  - Egress
  ingress:
  # log all ingress attempts that are not allowed
  - action: Log
    destination: {}
    protocol: TCP
    source: {}
  egress:
  # allow all namespaces to communicate to DNS pods
  - action: Allow
    protocol: UDP
    destination:
      selector: 'k8s-app == "kube-dns"'
      ports:
      - 53
calicoctl apply -f global-default-deny-log.yaml

Let’s ssh into worker 10.128.147.102 and watch /var/log/messages, then curl http://10.128.146.177:32127 so we can see what traffic is being dropped.

curl  http://10.128.146.177:32127 --max-time 5
curl: (28) Operation timed out after 5005 milliseconds with 0 bytes received

Inside /var/log/messages, from worker 10.128.147.102, we were able to see the following kernel messages with the traffic flow that is being dropped.

Feb 16 18:20:49 kyverno03 kernel: calico-packet: IN=eth0 OUT=calia58d9dbfdee MAC=fa:16:3e:c2:43:f3:fa:16:3e:01:2e:36:08:00 SRC=10.128.146.177 DST=10.20.112.90 LEN=64 TOS=0x00 PREC=0x00 TTL=59 ID=0 DF PROTO=TCP SPT=44342 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
Feb 16 18:20:50 kyverno03 kernel: calico-packet: IN=eth0 OUT=calia58d9dbfdee MAC=fa:16:3e:c2:43:f3:fa:16:3e:01:2e:36:08:00 SRC=10.128.146.177 DST=10.20.112.90 LEN=64 TOS=0x00 PREC=0x00 TTL=60 ID=0 DF PROTO=TCP SPT=44342 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
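
When there are many of these lines, it helps to pull out just the fields we care about. The following Python sketch splits a calico-packet line into its KEY=VALUE pairs. The helper name parse_calico_packet is made up for this example, and the sample line is copied from the log output above.

```python
import re


def parse_calico_packet(line: str) -> dict:
    """Return the KEY=VALUE fields of a 'calico-packet' kernel log line."""
    # Flags without a value (DF, SYN, ...) are skipped on purpose.
    return dict(re.findall(r"(\w+)=(\S*)", line))


sample = (
    "Feb 16 18:20:49 kyverno03 kernel: calico-packet: IN=eth0 OUT=calia58d9dbfdee "
    "MAC=fa:16:3e:c2:43:f3:fa:16:3e:01:2e:36:08:00 SRC=10.128.146.177 "
    "DST=10.20.112.90 LEN=64 TOS=0x00 PREC=0x00 TTL=59 ID=0 DF PROTO=TCP "
    "SPT=44342 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0"
)

fields = parse_calico_packet(sample)
print(fields["SRC"], "->", fields["DST"], "port", fields["DPT"], "via", fields["IN"])
# prints: 10.128.146.177 -> 10.20.112.90 port 80 via eth0
```

The SRC and DST values are the two addresses we need to identify before we can write a matching ingress rule.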

Let’s identify what is what.

kubectl get pods -n advanced-policy-demo -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE   READINESS GATES
pod/nginx-6799fc88d8-zp4zj   1/1     Running   0          21h   10.20.112.90   10.128.147.102   <none>           <none>
kubectl get nodes
NAME             STATUS                     ROLES    AGE     VERSION
10.128.146.177   Ready                      master   7d19h   v1.20.11
10.128.146.255   Ready,SchedulingDisabled   master   7d19h   v1.20.11
10.128.147.102   Ready                      master   7d19h   v1.20.11
10.128.147.68    Ready,SchedulingDisabled   worker   7d19h   v1.20.11

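The IP address 10.20.112.90 belongs to the NGINX pod created by the replica set, and the source IP initiating the traffic is node 10.128.146.177. With ExternalTrafficPolicy: Cluster, the node that receives the NodePort request forwards it to the pod and source-NATs it to its own address, so the pod sees the node’s IP instead of the client’s.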

We have gathered enough information to start updating our ingress policies!

Updating Nginx Ingress Policy

Now that we have identified the traffic that is being dropped, let’s update the NGINX ingress policy to allow this kind of traffic.

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-nginx-ingress
  namespace: advanced-policy-demo
spec:
  selector: app == 'nginx'
  types:
  - Ingress
  ingress:
  - action: Allow
    source:
      selector: run == 'access'
  - action: Allow
    source:
      nets:
      - 10.7.0.99/32
      - 10.128.146.0/23
    destination:
      selector: app == 'nginx'
calicoctl apply -f allow-nginx-ingress.yaml

Now with the policy in place the curl should be working for both endpoints.

curl  http://10.128.146.177:32127 --max-time 5
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

curl  http://10.128.147.102:32127 --max-time 1
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

WireGuard and Zero Trust Networking for NodePorts Exposed Applications

In this section we will enable WireGuard in our environment. Please take a look at the following link to review how to enable WireGuard in a PMK cluster.

Enabling WireGuard

https://platform9.com/blog/how-to-implement-pci-requirement-of-data-encryption-in-flight-in-kubernetes-clusters/

Now with WireGuard enabled in our environment, we can try to communicate with our application.

curl  http://10.128.146.177:32127 --max-time 1
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Based on the output, we can confirm that the application is reachable through the node IP that hosts a replica pod of our application. Let’s try the other node IP, which hosts no replica pods of our application.

curl  http://10.128.147.102:32127 --max-time 1
curl: (28) Operation timed out after 1000 milliseconds with 0 bytes received

Our other node IP is not responding now that we have WireGuard enabled. Let’s see why this is happening.

Troubleshooting and Inspecting Traffic

Next we will ssh into node 10.128.146.177 and check /var/log/messages to see if we can find any clues about what traffic is being dropped.

Feb 22 21:34:34 kyverno02 kernel: calico-packet: IN=wireguard.cali OUT=califd5b7afc8b0 MAC= SRC=10.20.112.110 DST=10.20.103.150 LEN=64 TOS=0x00 PREC=0x00 TTL=59 ID=0 DF PROTO=TCP SPT=16038 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
Feb 22 21:34:35 kyverno02 kernel: calico-packet: IN=wireguard.cali OUT=califd5b7afc8b0 MAC= SRC=10.20.112.110 DST=10.20.103.150 LEN=64 TOS=0x00 PREC=0x00 TTL=60 ID=0 DF PROTO=TCP SPT=16038 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0

The dropped traffic comes from source IP 10.20.112.110, so let’s find out who owns that IP. Let’s connect to 10.128.147.102 and list the interfaces and their addresses with ip a.

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
.....
.....
34: wireguard.cali: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.20.112.110/32 scope global wireguard.cali
      valid_lft forever preferred_lft forever
    inet6 fe80::1ae9:946d:f71:83fa/64 scope link flags 800
      valid_lft forever preferred_lft forever

We can now see that the traffic is coming from the WireGuard interface. Since the east-west traffic is now encrypted, traffic forwarded by this node arrives with the WireGuard interface IP as its source.
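
Since every node has its own WireGuard interface address, this lookup has to be repeated on each node. As a rough aid, the Python sketch below extracts the IPv4 addresses per interface from ip a output. The function name ipv4_by_interface is an assumption for this example, and the sample text is a shortened copy of the output above.

```python
import re


def ipv4_by_interface(ip_a_output: str) -> dict:
    """Map each interface name to the IPv4 addresses listed under it."""
    result = {}
    current = None
    for line in ip_a_output.splitlines():
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)  # e.g. "34: wireguard.cali: <...>"
        if header:
            current = header.group(1)
            result[current] = []
            continue
        inet = re.match(r"^\s+inet\s+(\S+)", line)      # "inet6" lines do not match
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
34: wireguard.cali: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 10.20.112.110/32 scope global wireguard.cali
"""

print(ipv4_by_interface(SAMPLE)["wireguard.cali"])
# prints: ['10.20.112.110/32']
```

The addresses collected this way, one per node, are the ones that need to be allowed as sources in the application’s ingress policy.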

Updating Nginx Ingress Policy

Let’s update the ingress policy for our NGINX application to accept ingress traffic coming from the specific WireGuard interface IPs.

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-nginx-ingress
  namespace: advanced-policy-demo
spec:
  selector: app == 'nginx'
  types:
  - Ingress
  ingress:
  - action: Allow
    source:
      selector: run == 'access'
  - action: Allow
    source:
      nets:
      - 10.7.0.99/32
      - 10.128.146.0/23
      # WireGuard-Ips
      - 10.20.112.110/32
      - 10.20.40.187/32
    destination:
      selector: app == 'nginx'

Let’s validate that we can reach the application through node IP 10.128.147.102.

curl  http://10.128.147.102:32127 --max-time 1
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Exposing Applications Over Load Balancer with MetalLB in Zero Trust Networking Environment

In this section we will describe how to enable access to the IP pool defined for MetalLB in our Zero Trust Networking environment.

Patching NodePort service type to LoadBalancer

Let’s execute the next patch command to patch our NGINX service from NodePort to LoadBalancer type.

kubectl patch svc nginx -n advanced-policy-demo --type='json' -p '[{"op":"replace","path":"/spec/type","value":"LoadBalancer"}]'
kubectl get svc -n advanced-policy-demo
NAME    TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)        AGE
nginx   LoadBalancer   10.21.250.63   10.128.146.112   80:32127/TCP   7d23h

Reaching The Exposed Application Over External IP

We will try to curl our application via IP 10.128.146.112 and see if we have success reaching it.

curl  http://10.128.146.112 --max-time 1
curl: (28) Operation timed out after 1002 milliseconds with 0 bytes received

It looks like something is blocking this traffic. We will start troubleshooting in the next section.

Understanding how MetalLB In Layer2 Works

MetalLB works in two modes: Layer 2 (L2) mode and BGP mode. In this example we focus on Layer 2. In L2 mode, MetalLB uses ARP to advertise which node, the one running the elected speaker pod, handles the traffic for the LoadBalancer service we are using in this example.

https://metallb.universe.tf/concepts/layer2/

How Traffic is seen in the Network when reaching a MetalLB IP

Since we are operating in a locked-down PMK cluster, we need to understand at what level the traffic is going to be seen and, very likely, dropped by our security posture.

At this point we are trying to ingress into the cluster from an external IP to an IP that is not managed natively by Calico but is managed by MetalLB. Within the Kubernetes stack, the IP we are trying to reach comes into contact with kube-proxy, MetalLB, IPVS and all the HostEndpoint objects. Based on the Calico documentation, we should focus our efforts on policies that have preDNAT set to true. https://projectcalico.docs.tigera.io/reference/host-endpoints/pre-dnat

Let’s enable Log actions in our preDNAT policies by updating our policy and applying it.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-cluster-internal-ingress-only
spec:
  order: 20
  preDNAT: true
  applyOnForward: true
  ingress:
    - action: Allow
      source:
        nets: [10.128.146.0/23, 10.20.0.0/16]
    - action: Log
      destination: {}
      protocol: TCP
      source: {}
    - action: Deny
  selector: has(nodeport-external-ingress)

Now, with our policy updated, we need to find out which of the nodes in our cluster is advertising, via ARP, the ownership of IP 10.128.146.112. Unfortunately, there is no straightforward way to see which node was selected to advertise IP 10.128.146.112 and at which MAC address it is reachable.

Since we have our log actions in our policy in place now, we can check /var/log/messages on each of the workers and catch dropped traffic.

Feb 23 16:49:38 kyverno02 kernel: calico-packet: IN=eth0 OUT= MAC=fa:16:3e:01:2e:36:00:1c:73:1e:03:a0:08:00 SRC=10.7.0.99 DST=10.128.146.112 LEN=64 TOS=0x00 PREC=0x00 TTL=60 ID=0 DF PROTO=TCP SPT=61863 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
Feb 23 16:49:39 kyverno02 kernel: calico-packet: IN=eth0 OUT= MAC=fa:16:3e:01:2e:36:00:1c:73:1e:03:a0:08:00 SRC=10.7.0.99 DST=10.128.146.112 LEN=64 TOS=0x00 PREC=0x00 TTL=61 ID=0 DF PROTO=TCP SPT=61863 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
Feb 23 16:49:40 kyverno02 kernel: calico-packet: IN=eth0 OUT= MAC=fa:16:3e:01:2e:36:00:1c:73:1e:03:a0:08:00 SRC=10.7.0.99 DST=10.128.146.112 LEN=64 TOS=0x00 PREC=0x00 TTL=61 ID=0 DF PROTO=TCP SPT=61863 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0
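
The MAC field of these log lines carries the Ethernet header of the logged frame: six octets of destination MAC, six octets of source MAC, and two octets of EtherType. The Python sketch below splits the field so the destination MAC can be compared with the MAC of eth0 on each node. The helper name split_mac_field is an assumption for this example, and it only handles Ethernet frames like the ones shown above.

```python
def split_mac_field(mac_field: str):
    """Split an iptables LOG 'MAC=' value into (dst_mac, src_mac, ethertype)."""
    octets = mac_field.split(":")
    if len(octets) != 14:
        # Lines logged on non-Ethernet interfaces (e.g. wireguard.cali) have an empty MAC.
        raise ValueError("expected 14 octets: 6 destination, 6 source, 2 EtherType")
    dst = ":".join(octets[0:6])
    src = ":".join(octets[6:12])
    ethertype = "".join(octets[12:14])
    return dst, src, ethertype


# MAC value copied from the kyverno02 log lines above.
dst, src, ethertype = split_mac_field("fa:16:3e:01:2e:36:00:1c:73:1e:03:a0:08:00")
print("received on NIC:", dst)
print("sent by:", src)
print("EtherType:", ethertype)
```

Here the destination MAC is fa:16:3e:01:2e:36, which is the NIC that accepted the frame addressed to 10.128.146.112. The same MAC appears as the source MAC in the earlier log lines whose SRC is 10.128.146.177, which suggests that this node is the one answering ARP for the MetalLB IP.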

Updating preDNAT Network Policy

We identified that node 10.128.146.177 is the one selected to advertise the MetalLB IP 10.128.146.112. We also learned how the traffic is being seen, so we can now update our preDNAT ingress policy to accept the desired traffic.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-cluster-internal-ingress-only
spec:
  order: 20
  preDNAT: true
  applyOnForward: true
  ingress:
    - action: Allow
      source:
        nets: [10.128.146.0/23, 10.20.0.0/16]
    - action: Allow
      source:
        nets: [10.7.0.99/32]
      destination:
        nets: [10.128.146.112/32]
    - action: Log
      destination: {}
      protocol: TCP
      source: {}
    - action: Deny
  selector: has(nodeport-external-ingress)

Validating Connectivity to Our Exposed Application

Let’s try to reach the application via 10.128.146.112 and see if it is working.

curl  http://10.128.146.112 --max-time 5
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Exposing Services Over External BGP With Zero Trust Networking

In this section we will continue modifying our current GlobalNetworkPolicies and NetworkPolicies to accept Ingress traffic from a specific network that is being advertised over external BGP.

In the following scenario, our PMK cluster has established BGP peering with an external entity. To keep things simple, we peer with another PMK cluster that runs in a different Autonomous System.

How to advertise Services CIDRs and PODs CIDRs

To start advertising over external BGP we will need two Calico objects: a BGPConfiguration object and a BGPPeer object. Go ahead and create each of the needed objects. You can find more information about both objects here: https://projectcalico.docs.tigera.io/networking/advertise-service-ips

apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  creationTimestamp: null
  name: default
spec:
  asNumber: 65521
  logSeverityScreen: Debug
  nodeToNodeMeshEnabled: true
  serviceClusterIPs:
  - cidr: 10.21.0.0/16
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: bgppeerextcluster30
spec:
  peerIP: 10.128.147.141
  asNumber: 65531

If we take a look at the routing table of the external entity we are peering with (10.128.147.141), it shows entries learned over BGP for pod blocks inside 10.20.0.0/16 and for the service network 10.21.0.0/16.

ip route
default via 10.128.146.1 dev eth0
10.20.40.128/26 proto bird
    nexthop via 10.128.146.177 dev eth0 weight 1
    nexthop via 10.128.146.255 dev eth0 weight 1
10.20.103.128/26 proto bird
    nexthop via 10.128.146.177 dev eth0 weight 1
    nexthop via 10.128.146.255 dev eth0 weight 1
10.21.0.0/16 proto bird
    nexthop via 10.128.146.177 dev eth0 weight 1
    nexthop via 10.128.146.255 dev eth0 weight 1
blackhole 10.30.192.192/26 proto bird
10.30.192.193 dev cali3c4cf92133b scope link
10.30.192.194 dev cali38e7407e30b scope link
10.30.192.195 dev calib8cb2f4938d scope link
10.30.192.196 dev cali931d4de0f1b scope link
10.30.192.197 dev calibc79e3370e8 scope link
10.30.192.199 dev cali4e9ef2c9989 scope link
10.30.192.200 dev calibd3aa04a819 scope link
10.128.146.0/23 dev eth0 proto kernel scope link src 10.128.147.141
169.254.0.0/16 dev eth0 scope link metric 1002
169.254.169.254 via 10.128.146.4 dev eth0 proto static
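
To check which of these entries the peer would use for a given destination, we can reproduce the longest-prefix-match rule that route lookups follow. The Python sketch below does this for the BGP-learned and on-link prefixes from the table above. The function name best_route is an assumption for this example, and the sketch ignores metrics, multipath next hops and the peer’s own 10.30.192.x routes.

```python
import ipaddress

# Prefixes copied from the routing table above (BGP-learned plus the on-link /23).
ROUTES = ["10.20.40.128/26", "10.20.103.128/26", "10.21.0.0/16", "10.128.146.0/23"]


def best_route(destination: str, routes=ROUTES) -> str:
    """Return the most specific prefix containing destination, or 'default'."""
    addr = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(r) for r in routes if addr in ipaddress.ip_network(r)]
    if not matches:
        return "default"
    return str(max(matches, key=lambda net: net.prefixlen))


print(best_route("10.21.172.202"))  # a service ClusterIP
print(best_route("10.20.103.150"))  # a pod IP
```

A ClusterIP such as 10.21.172.202 resolves to the advertised 10.21.0.0/16 route, and a pod IP such as 10.20.103.150 resolves to its /26 block, so traffic from the peer to both kinds of address is sent to our cluster nodes.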

Let’s move on and start updating the network policies.

Updating Network Policies From Specific External Sources

As in the previous sections, where we had to allow access at different points in the cluster, we need to do the same for this new flow of traffic.

First, we need to allow traffic at the perimeter level specifying the source and destination in detail.

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-cluster-internal-ingress-only
spec:
  order: 20
  preDNAT: true
  applyOnForward: true
  ingress:
    - action: Allow
      source:
        nets: [10.128.146.0/23, 10.20.0.0/16]
    - action: Allow
      source:
        nets: [10.7.0.99/32]
      destination:
        nets: [10.128.146.112/32]
    - action: Log
      destination: {}
      protocol: TCP
      source: {}
    - action: Allow
      source:
        nets: [10.30.0.0/16]
      destination:
        nets: [10.21.172.202/32]
    - action: Deny
  selector: has(nodeport-external-ingress)

Updating Application Network Policies To Receive Traffic

While testing this use case, a valuable lesson was learned. When you peer with an external router, use route reflectors, and advertise the Service IP (in other words, the ClusterIP) over BGP, the source IP that generates the traffic is lost: it is source-NAT’d regardless of whether the service uses ExternalTrafficPolicy: Local, and kube-proxy is responsible for that. In our scenario, the source IP seen in the logs of the pods was the IP of one of the route reflector nodes.

So if you have followed along up to this point, you can guess that the application is already reachable over the BGP fabric, because the application policy allows the source network 10.128.146.0/23 (which happens to be the network that all the nodes forming the cluster use). No changes are needed to the application policy.
