Kubernetes Audit Logs — Detection Engineering Before It Is Too Late

The Uncomfortable Truth

An attacker execs into a production pod, reads the mounted service account token, tests permissions, lists secrets, creates a ClusterRoleBinding, deploys a cryptominer, and disappears inside normal cluster noise. The attack takes minutes. The incident is discovered weeks later when a bill spikes or a customer reports impact.

Audit logs were not missing. Signal was missing. The API server recorded the behavior, but the policy captured the wrong level, the backend stored too much noise, and nobody had detections mapped to attacker actions.

Kubernetes audit logging is the best native evidence source in the platform. Every API request crosses the API server: pods/exec, secrets, rolebindings, clusterrolebindings, serviceaccounts/token, workload creation, and network policy deletion. If you tune it correctly, you see the attack before the attacker gets durable control.

Why Default Audit Logging Fails

Most clusters fail in four predictable ways.

No audit logging

The API server is not configured with an audit policy. After compromise, your forensic capability is guesswork.

Metadata for everything

The cluster produces massive event volume, but the logs lack request bodies for the operations that matter.

Secrets logged incorrectly

Logging secrets at RequestResponse turns your SIEM into a credential vault for attackers.

No noise suppression

Kubelets, probes, leases, and events drown out human and attacker behavior.

A good audit policy is not a compliance checkbox. It is a threat model expressed as YAML: drop what cannot help, preserve what attackers touch, and keep sensitive values out of the log stream.

Audit Levels That Matter

Level	Captured data	Use it for	Risk
`None`	No event	Health checks, leases, API discovery, noisy system chatter	Blind if overused
`Metadata`	User, verb, resource, namespace, status, source IP	Secrets, service account token requests, broad read operations, auth probes	Limited forensic depth
`Request`	Metadata plus request body	Selected write operations	Can expose sensitive inputs
`RequestResponse`	Metadata, request body, response body	RBAC mutations, workload creation, pod exec	High volume; dangerous on secrets and token requests

Never log secrets at Request or RequestResponse. You need to know who touched a secret, not copy the secret value into your log backend.

Security-Focused Audit Policy

This policy is opinionated. Rules are evaluated in order and the first match wins, so it drops system noise first, then captures high-value attacker behavior at the right level. It also omits the RequestReceived stage, which otherwise roughly doubles event volume without helping detection.

/etc/kubernetes/audit-policy.yaml

apiVersion: audit.k8s.io/v1
kind: Policy

omitStages:
  - "RequestReceived"

rules:
  # Drop high-volume system noise.
  - level: None
    users: ["system:kube-probe"]

  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get", "list", "watch"]

  - level: None
    resources:
      - group: ""
        resources: ["events"]
      - group: "coordination.k8s.io"
        resources: ["leases"]

  - level: None
    nonResourceURLs: ["/healthz*", "/livez*", "/readyz*", "/version"]

  # Secrets: metadata only. Never log secret values.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # RBAC mutations: full body for forensics.
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]

  # Interactive access.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach", "pods/portforward"]

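  # Service account changes: full body for forensics.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["serviceaccounts"]

  # Token requests: metadata only. The TokenRequest response body
  # contains the issued token, so treat it like a secret.
  - level: Metadata
    resources:
      - group: ""
        resources: ["serviceaccounts/token"]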

  # Workload mutations.
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["pods", "configmaps", "namespaces"]
      - group: "apps"
        resources: ["deployments", "daemonsets", "statefulsets", "replicasets"]
      - group: "batch"
        resources: ["jobs", "cronjobs"]

  # Network policy changes can enable lateral movement.
  - level: RequestResponse
    resources:
      - group: "networking.k8s.io"
        resources: ["networkpolicies"]

  # Auth probing.
  - level: Metadata
    resources:
      - group: "authentication.k8s.io"
        resources: ["tokenreviews"]
      - group: "authorization.k8s.io"
        resources: ["subjectaccessreviews"]

  # Catch-all.
  - level: Metadata

Detection Rules Mapped to the Kill Chain

Detection rules must map to what attackers actually do. Do not alert on everything. Alert on behavior that changes attacker capability.

1. Probe: anonymous requests, 403 spikes, SubjectAccessReview abuse

2. Execute: pods/exec, pods/attach, port-forward

3. Harvest: secret access and service account token creation

4. Escalate: RoleBinding or ClusterRoleBinding mutation

5. Persist: CronJob, DaemonSet, unknown image registry

1. Anonymous or unauthenticated API access

Any successful request from system:anonymous or system:unauthenticated should page the on-call. Bursts of denied requests are reconnaissance.

Detection logic

user.username = "system:anonymous"
OR user.groups contains "system:unauthenticated"

Alert:
- any 2xx response
- more than 10 denied requests in 5 minutes from one source IP
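
As a rough illustration, the rule above could be evaluated over one 5-minute batch of parsed audit events like this. It is a sketch under stated assumptions: the function names and alert labels are hypothetical, and treating 401 and 403 as "denied" is a choice made here, not something the rule specifies.

```python
from collections import Counter


def is_anonymous(event):
    user = event.get("user", {})
    return (user.get("username") == "system:anonymous"
            or "system:unauthenticated" in user.get("groups", []))


def anonymous_alerts(events, denied_threshold=10):
    """events: parsed audit events already restricted to one 5-minute window."""
    alerts = []
    denied_by_ip = Counter()
    for ev in events:
        if not is_anonymous(ev):
            continue
        code = ev.get("responseStatus", {}).get("code", 0)
        ip = (ev.get("sourceIPs") or ["unknown"])[0]
        if 200 <= code < 300:
            # Any successful anonymous request is alert-worthy on its own.
            alerts.append(("anonymous-success", ip))
        elif code in (401, 403):
            denied_by_ip[ip] += 1
    alerts += [("anonymous-denied-burst", ip)
               for ip, n in denied_by_ip.items() if n > denied_threshold]
    return alerts
```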

2. 403 spikes by identity

Attackers probe RBAC. Legitimate operators usually know what they are allowed to do. A service account generating denied requests across namespaces is a strong signal.

Detection logic

annotations.authorization.k8s.io/decision = "forbid"
GROUP BY user.username
WHERE count > 10 in 5 minutes

High priority:
objectRef.subresource = "exec"
AND decision = "forbid"
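
One way to implement the "more than 10 in 5 minutes" grouping is a per-identity sliding window. The sketch below assumes events arrive sorted by `requestReceivedTimestamp` and that timestamps use the six-digit-microsecond or whole-second RFC 3339 form; `forbidden_spikes` and `parse_ts` are hypothetical names.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

DECISION_KEY = "authorization.k8s.io/decision"


def parse_ts(value):
    # Accept a trailing "Z" as UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def forbidden_spikes(events, threshold=10, window=timedelta(minutes=5)):
    """Return usernames with more than `threshold` forbidden requests
    inside any sliding window. `events` must be sorted by time."""
    recent = defaultdict(deque)
    flagged = set()
    for ev in events:
        if ev.get("annotations", {}).get(DECISION_KEY) != "forbid":
            continue
        user = ev["user"]["username"]
        ts = parse_ts(ev["requestReceivedTimestamp"])
        q = recent[user]
        q.append(ts)
        # Drop timestamps that have fallen out of the window.
        while ts - q[0] > window:
            q.popleft()
        if len(q) > threshold:
            flagged.add(user)
    return flagged
```

A sliding window avoids the blind spot of fixed 5-minute buckets, where a burst that straddles a bucket boundary is split in two and stays under the threshold.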

3. Cross-namespace secret access

Normal workloads rarely enumerate secrets across namespaces. Attackers do. Secrets must stay at Metadata level so the detection sees access without leaking values.

Detection logic

objectRef.resource = "secrets"
AND verb IN ("get", "list")
GROUP BY user.username
WHERE distinct(objectRef.namespace) > 3 in 10 minutes

Also alert:
verb = "list" with no objectRef.namespace (cluster-wide list of secrets)
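
A minimal sketch of the distinct-namespace count, assuming the input is one 10-minute batch of parsed events. It also flags a `list` with no namespace in `objectRef`, which is how a cluster-wide list shows up; `secret_sweepers` is a hypothetical name.

```python
from collections import defaultdict


def secret_sweepers(events, max_namespaces=3):
    """events: parsed audit events from one 10-minute window."""
    namespaces = defaultdict(set)
    cluster_wide = set()
    for ev in events:
        ref = ev.get("objectRef", {})
        if ref.get("resource") != "secrets" or ev.get("verb") not in ("get", "list"):
            continue
        user = ev["user"]["username"]
        ns = ref.get("namespace")
        if ns:
            namespaces[user].add(ns)
        elif ev["verb"] == "list":
            # No namespace on a list request: secrets listed across all namespaces.
            cluster_wide.add(user)
    flagged = {u for u, seen in namespaces.items() if len(seen) > max_namespaces}
    return flagged | cluster_wide
```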

4. Pod exec and attach

pods/exec is command execution inside a container. In production, every exec should be explained by an incident ticket, an SRE action, or a break-glass workflow.

Detection logic

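objectRef.resource = "pods"
AND objectRef.subresource IN ("exec", "attach")
AND verb IN ("create", "get")

Note: exec over WebSocket arrives as an HTTP GET and is recorded with verb "get", so matching only "create" misses it.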

Critical:
objectRef.namespace = "kube-system"
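
The same rule as a small classifier over a parsed audit event. This is a sketch: the severity labels and the `classify_exec` name are hypothetical, and accepting both `create` and `get` is an assumption made so that exec sessions opened over WebSocket are not missed.

```python
def classify_exec(event, critical_namespaces=("kube-system",)):
    """Return "critical", "high", or None for a parsed audit event."""
    ref = event.get("objectRef", {})
    if ref.get("resource") != "pods":
        return None
    if ref.get("subresource") not in ("exec", "attach"):
        return None
    if event.get("verb") not in ("create", "get"):
        return None
    if ref.get("namespace") in critical_namespaces:
        return "critical"
    return "high"
```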

5. Cluster-admin binding creation

This is not suspicious. This is escalation. A new ClusterRoleBinding pointing to cluster-admin gives the subject full control of the cluster.

Detection logic

objectRef.resource = "clusterrolebindings"
AND verb IN ("create", "patch", "update")
AND requestObject.roleRef.name = "cluster-admin"

Also alert:
RBAC mutation by any service account outside platform namespaces
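
Both conditions can be checked from a single event when RBAC resources are logged with the request body. In the sketch below, `PLATFORM_NAMESPACES` is a hypothetical allowlist you would replace with your own, and `rbac_alerts` is an illustrative name.

```python
PLATFORM_NAMESPACES = {"kube-system"}  # hypothetical allowlist
RBAC_GROUP = "rbac.authorization.k8s.io"
SA_PREFIX = "system:serviceaccount:"


def rbac_alerts(event):
    """Return alert labels for one parsed audit event."""
    alerts = []
    if event.get("verb") not in ("create", "patch", "update"):
        return alerts
    ref = event.get("objectRef", {})
    role_ref = (event.get("requestObject") or {}).get("roleRef", {})
    if ref.get("resource") == "clusterrolebindings" and role_ref.get("name") == "cluster-admin":
        alerts.append("cluster-admin-binding")
    username = event.get("user", {}).get("username", "")
    if ref.get("apiGroup") == RBAC_GROUP and username.startswith(SA_PREFIX):
        # Service account usernames look like system:serviceaccount:<namespace>:<name>.
        sa_namespace = username.split(":")[2]
        if sa_namespace not in PLATFORM_NAMESPACES:
            alerts.append("rbac-mutation-by-workload-sa")
    return alerts
```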

6. Privileged pod or hostPath creation

A privileged container, hostPID, hostNetwork, or sensitive hostPath mount is container escape by configuration.

Sensitive fields

requestObject.spec.hostPID = true
OR requestObject.spec.hostNetwork = true
OR requestObject.spec.containers[*].securityContext.privileged = true
OR requestObject.spec.volumes[*].hostPath.path IN ("/", "/etc", "/proc", "/dev", "/sys", "/var/run/docker.sock")
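
The fields above are written for a bare Pod. For Deployments, DaemonSets, Jobs, and CronJobs the same settings sit inside the pod template, so a detection needs to unwrap it first. A sketch, with `pod_spec_of` and `risky_settings` as hypothetical helper names:

```python
SENSITIVE_PATHS = {"/", "/etc", "/proc", "/dev", "/sys", "/var/run/docker.sock"}


def pod_spec_of(obj):
    """Return the pod spec from a Pod or from a workload's pod template."""
    spec = obj.get("spec", {})
    if "jobTemplate" in spec:          # CronJob wraps a Job template
        spec = spec["jobTemplate"].get("spec", {})
    if "template" in spec:             # Deployment, DaemonSet, StatefulSet, Job
        return spec["template"].get("spec", {})
    return spec


def risky_settings(request_object):
    """List the escape-prone settings found in a requestObject."""
    spec = pod_spec_of(request_object)
    findings = []
    if spec.get("hostPID"):
        findings.append("hostPID")
    if spec.get("hostNetwork"):
        findings.append("hostNetwork")
    for c in spec.get("containers", []) + spec.get("initContainers", []):
        if (c.get("securityContext") or {}).get("privileged"):
            findings.append(f"privileged:{c.get('name')}")
    for v in spec.get("volumes", []):
        path = (v.get("hostPath") or {}).get("path")
        if path is None:
            continue
        normalized = path.rstrip("/") or "/"   # "/etc/" and "/etc" are the same mount
        if normalized in SENSITIVE_PATHS:
            findings.append(f"hostPath:{normalized}")
    return findings
```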

7. Persistence workload in system namespaces

Attackers hide in places operators mentally skip. A CronJob or DaemonSet in kube-system or a platform namespace deserves immediate review.

Detection logic

verb = "create"
AND objectRef.resource IN ("cronjobs", "daemonsets", "deployments")
AND objectRef.namespace IN ("kube-system", "monitoring", "logging", "ingress-nginx")

8. NetworkPolicy deletion

Deleting or weakening NetworkPolicies is a lateral movement enabler. If an identity removes egress restrictions, assume the next move is east-west scanning.

Detection logic

objectRef.resource = "networkpolicies"
AND verb IN ("delete", "patch", "update")

Audit Event Anatomy

Raw audit events are noisy until you normalize the fields that carry security meaning. These are the fields I would keep in every detection pipeline before enrichment.

Field	Why it matters	Detection use
`user.username`	Identity behind the request.	Separate human operators, controllers, and service accounts.
`sourceIPs`	Network origin of the API call.	Detect stolen service account tokens used outside expected pod ranges.
`verb`	Action requested.	Prioritize `create`, `patch`, `update`, `delete`, and suspicious `list`.
`objectRef.resource`, `objectRef.subresource`	Target resource and subresource.	Catch access to `secrets`, `clusterrolebindings`, and `networkpolicies`; `pods/exec` appears as resource `pods` with subresource `exec`.
`objectRef.namespace`	Blast-radius boundary.	Detect cross-namespace sweeps and access into `kube-system`.
`responseStatus.code`	Allowed or denied result.	Differentiate successful compromise from permission probing.
`requestObject`	Payload of high-risk writes.	Find privileged pods, hostPath mounts, cluster-admin bindings, unknown images.
`annotations.authorization.k8s.io/decision`	RBAC decision context.	Baseline denied requests and flag enumeration spikes.
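
The `sourceIPs` check for stolen service account tokens can be sketched with the standard `ipaddress` module. `POD_CIDRS` is a hypothetical pod range, and the sketch assumes `sourceIPs` contains only client addresses; behind proxies or load balancers you would need to allowlist those hops as well.

```python
import ipaddress

POD_CIDRS = [ipaddress.ip_network("10.42.0.0/16")]  # hypothetical pod range


def token_used_off_cluster(event, pod_cidrs=POD_CIDRS):
    """True if a service account identity called the API from outside the pod ranges."""
    username = event.get("user", {}).get("username", "")
    if not username.startswith("system:serviceaccount:"):
        return False
    for raw in event.get("sourceIPs", []):
        ip = ipaddress.ip_address(raw)
        if not any(ip in net for net in pod_cidrs):
            return True
    return False
```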

high-signal audit event shape

{
  "verb": "create",
  "user": { "username": "system:serviceaccount:prod:web" },
  "sourceIPs": ["10.42.8.19"],
  "objectRef": {
    "namespace": "payments",
    "resource": "pods",
    "subresource": "exec"
  },
  "responseStatus": { "code": 101 },
  "annotations": {
    "authorization.k8s.io/decision": "allow"
  }
}
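
Normalizing events into a flat record before they reach detection rules keeps every query working against the same field names. A minimal sketch, where the output keys and the `normalize` name are choices made for illustration:

```python
def normalize(event):
    """Flatten one parsed audit event into the fields detections rely on."""
    ref = event.get("objectRef") or {}
    resource = ref.get("resource", "")
    if ref.get("subresource"):
        # Present pods + exec as the familiar "pods/exec".
        resource = f"{resource}/{ref['subresource']}"
    return {
        "audit_id": event.get("auditID"),
        "user": (event.get("user") or {}).get("username"),
        "source_ips": event.get("sourceIPs", []),
        "verb": event.get("verb"),
        "resource": resource,
        "namespace": ref.get("namespace"),
        "status": (event.get("responseStatus") or {}).get("code"),
        "decision": (event.get("annotations") or {}).get("authorization.k8s.io/decision"),
    }
```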

SIEM Query Pack

Detection logic only pays off once it runs in your SIEM. Below are starter queries you can adapt to CloudWatch Logs Insights, Microsoft Sentinel / Log Analytics, and Splunk. Tune namespaces and identities to your environment.

CloudWatch Logs Insights — EKS exec and secret access

cloudwatch logs insights

fields @timestamp, user.username, verb, objectRef.namespace, objectRef.resource, objectRef.subresource, sourceIPs.0, responseStatus.code
| filter objectRef.subresource in ["exec", "attach"] or objectRef.resource = "secrets"
| filter responseStatus.code = 101 or (responseStatus.code >= 200 and responseStatus.code < 300)
| sort @timestamp desc
| limit 100

Microsoft Sentinel / AKS — cluster-admin binding

kql

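// Assumes the AzureDiagnostics table, where the audit event arrives as JSON in log_s.
AzureDiagnostics
| where Category in ("kube-audit", "kube-audit-admin")
| extend event = parse_json(log_s)
| where tostring(event.objectRef.resource) == "clusterrolebindings"
| where tostring(event.verb) in ("create", "patch", "update")
| where tostring(event.requestObject) has "cluster-admin"
| project TimeGenerated, user = tostring(event.user.username), sourceIPs = tostring(event.sourceIPs), verb = tostring(event.verb), binding = tostring(event.objectRef.name), requestObject = tostring(event.requestObject)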

Splunk — forbidden request spike

splunk

index=kubernetes_audit annotations.authorization.k8s.io/decision=forbid
| bin _time span=5m
| stats count dc(objectRef.resource) as resources values(objectRef.namespace) as namespaces by _time user.username sourceIPs{}
| where count > 10 OR resources > 4

Response Playbook

Audit detections should trigger action. The faster you move from event to containment, the less value the attacker extracts from the cluster.

Revoke identity

Delete or rotate the compromised service account token. For bound tokens, that means deleting the pod or service account the token is bound to. Remove suspicious RoleBindings and ClusterRoleBindings immediately.

Isolate workload

Apply a deny-all egress NetworkPolicy to the namespace or quarantine the pod with labels your CNI enforces.

Reconstruct timeline

Pivot on user.username, sourceIPs, pod name, namespace, and auditID across the previous 24 hours.

Prevent replay

Disable automounted service account tokens where not needed and tighten RBAC to namespace-scoped, verb-scoped roles.
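
The "Reconstruct timeline" step can be sketched as a filter over exported audit events: keep everything from the suspect identity or source IP in the lookback window, collapse multiple stages of the same request by auditID, and sort by time. `build_timeline` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def parse_ts(value):
    # Accept a trailing "Z" as UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_timeline(events, end, username=None, source_ip=None,
                   lookback=timedelta(hours=24)):
    """Return (time, auditID, verb, resource, namespace) rows, oldest first."""
    start = end - lookback
    seen = set()
    rows = []
    for ev in events:
        ts = parse_ts(ev["requestReceivedTimestamp"])
        if not (start <= ts <= end):
            continue
        same_user = username is not None and ev.get("user", {}).get("username") == username
        same_ip = source_ip is not None and source_ip in ev.get("sourceIPs", [])
        if not (same_user or same_ip):
            continue
        if ev.get("auditID") in seen:   # one row per request, not per stage
            continue
        seen.add(ev.get("auditID"))
        ref = ev.get("objectRef") or {}
        rows.append((ts, ev.get("auditID"), ev.get("verb"),
                     ref.get("resource"), ref.get("namespace")))
    return sorted(rows, key=lambda row: row[0])
```

Matching on either the username or the source IP is deliberate: an attacker who escalates often switches identity while still calling from the same address.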

Build the Detection Pipeline

A detection rule in a document is not a control. Ship audit logs outside the cluster, normalize them, alert on high-confidence behavior, and keep enough retention for incident reconstruction.

Reference pipeline

API Server audit log
  ├─ file backend or webhook backend
  ├─ Fluent Bit / managed control-plane logging
  ├─ external storage: S3, CloudWatch, Log Analytics, Elasticsearch, Splunk
  ├─ detection processor: SIEM rules, Sigma mappings, custom correlation
  └─ alerting: Slack, Teams, PagerDuty, incident queue

The audit trail must survive cluster compromise. If the attacker can delete your logs from inside the cluster, your logging architecture is part of the blast radius.
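
As a rough illustration of the detection-processor stage, the sketch below reads audit events as JSON lines (the format the file backend writes), skips anything unparseable, and runs each event through a list of detector functions. The two detectors mirror rules 7 and 8 above; all function names and the `SYSTEM_NAMESPACES` set are hypothetical, and only the `ResponseComplete` stage is evaluated to avoid duplicate alerts.

```python
import json

SYSTEM_NAMESPACES = {"kube-system", "monitoring", "logging", "ingress-nginx"}


def detect_persistence(event):
    ref = event.get("objectRef") or {}
    if (event.get("verb") == "create"
            and ref.get("resource") in ("cronjobs", "daemonsets", "deployments")
            and ref.get("namespace") in SYSTEM_NAMESPACES):
        return f"persistence: {ref['resource']} created in {ref['namespace']}"
    return None


def detect_netpol_change(event):
    ref = event.get("objectRef") or {}
    if (ref.get("resource") == "networkpolicies"
            and event.get("verb") in ("delete", "patch", "update")):
        return f"networkpolicy {event['verb']} in {ref.get('namespace')}"
    return None


DETECTORS = [detect_persistence, detect_netpol_change]


def run_pipeline(lines, detectors=DETECTORS):
    """Yield alert strings for an iterable of JSON-lines audit events."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt lines rather than stop the pipeline
        if event.get("stage") != "ResponseComplete":
            continue
        for detector in detectors:
            alert = detector(event)
            if alert:
                yield alert
```

In practice the `lines` iterable would be whatever your shipper delivers, for example an open file handle on the exported log; routing the yielded alerts to Slack or PagerDuty is left out here.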

Managed Kubernetes Reality

Platform	What to enable	Operational note
EKS	Control plane audit logging to CloudWatch	Disabled unless explicitly enabled. Query with CloudWatch Logs Insights and forward to SIEM.
GKE	Admin Activity plus Data Access logs	Admin Activity is default. Data Access gives richer read visibility and must be deliberately enabled.
AKS	Diagnostic settings for `kube-audit` or `kube-audit-admin`	Send to Log Analytics. Use analytics rules for exec, secret access, RBAC mutation, and privileged pod creation.

What Audit Logs Cannot See

Audit logs capture API server activity. They do not capture commands after a successful exec, raw pod-to-pod traffic, direct kubelet API abuse, direct etcd access, or process-level behavior inside the container.

Container commands

After pods/exec, the API server does not see every shell command. Use runtime telemetry.

East-west traffic

API audit does not show a compromised pod talking to a database. Use CNI/network telemetry.

Direct kubelet access

Traffic to kubelet can bypass the API server. Harden and monitor kubelet endpoints.

Direct etcd access

etcd access bypasses Kubernetes audit logs entirely. Isolate etcd and enforce authentication.

Audit Log Maturity Model

Level	State	Detection capability
0	No audit logging	Post-incident guessing.
1	Default logging, noisy policy	Some forensics, weak detection.
2	Tuned policy and external storage	Fast investigation and reliable event history.
3	Automated detections	Real-time alerts for common attack paths.
4	Layered detection and response	Audit logs, runtime signals, network telemetry, and automated containment.

Get to Level 2 in one week. Get to Level 3 in one month. Level 4 is where mature Kubernetes security operations live.

Final Word

Every serious Kubernetes attack leaves API-server fingerprints. The attacker who steals a service account token uses it against the API server. The attacker who creates a privileged pod submits it through the API server. The attacker who binds themselves to cluster-admin modifies RBAC through the API server.

My position is simple: if Kubernetes audit logs are not tuned, shipped externally, and connected to detection logic, the cluster is not monitored. It is only producing evidence for someone to read after the damage is done.

— Riad DAHMANI, k8sec.io

Turn audit logs into attack-path intelligence.

K8SEC correlates audit events with RBAC, workload posture, and network exposure so teams can see attack paths before they become incidents.
