External Data Sources
The Variables section discusses how variables can help create smarter, reusable policy definitions and introduces the concept of a rule context that stores all variables.
This section provides details on using ConfigMaps, API calls, and image registries to reference external data as variables in policies.
Note
For improved security and performance, Kyverno is designed to not allow connections to systems other than the cluster's Kubernetes API server and image registries. Use a separate controller to fetch data from any other source and store it in a ConfigMap that can be used efficiently in a policy. This design enables separation of concerns and enforcement of security boundaries.
Variables from ConfigMaps
A ConfigMap resource in Kubernetes is commonly used as a source of configuration details which can be consumed by applications. This data can be written in multiple formats, stored in a Namespace, and accessed easily. Kyverno supports using a ConfigMap as a data source for variables. When a policy referencing a ConfigMap resource is evaluated, the ConfigMap data is read at that time, ensuring that references to the ConfigMap are always dynamic. Should the ConfigMap be updated, subsequent policy lookups will pick up the latest data at that point.
In order to consume data from a ConfigMap in a rule, a context is required. Each rule that consumes data from a ConfigMap must define its own context. The context data can then be referenced in the policy rule using JMESPath notation.
Looking up ConfigMap values
A ConfigMap that is defined in a rule’s context can be referred to using its unique name within the context. ConfigMap values can be referenced using a JMESPath style expression.
{{ <context-name>.data.<key-name> }}
Consider a simple ConfigMap definition like so.
apiVersion: v1
kind: ConfigMap
metadata:
  name: some-config-map
  namespace: some-namespace
data:
  env: production
To refer to values from a ConfigMap inside a rule, define a context inside the rule with one or more ConfigMap declarations. Using the sample ConfigMap above, the rule below defines a context which references this specific ConfigMap by name.
rules:
  - name: example-lookup
    # Define a context for the rule
    context:
    # A unique name for the ConfigMap
    - name: dictionary
      configMap:
        # Name of the ConfigMap which will be looked up
        name: some-config-map
        # Namespace in which this ConfigMap is stored
        namespace: some-namespace
Based on the example above, we can now refer to a ConfigMap value using {{dictionary.data.env}}. The variable will be substituted with the value production during policy execution.
Put into the context of a full ClusterPolicy, referencing a ConfigMap as a variable looks like the following.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cm-variable-example
  annotations:
    pod-policies.kyverno.io/autogen-controllers: DaemonSet,Deployment,StatefulSet
spec:
  rules:
  - name: example-configmap-lookup
    context:
    - name: dictionary
      configMap:
        name: some-config-map
        namespace: some-namespace
    match:
      any:
      - resources:
          kinds:
          - Pod
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            my-environment-name: "{{dictionary.data.env}}"
In the above ClusterPolicy, a mutate rule matches all incoming Pod resources and adds a label named my-environment-name to them. Because we have defined a context which points to our earlier ConfigMap named some-config-map, we can reference the value with the expression {{dictionary.data.env}}. A new Pod will then receive the label my-environment-name=production.
Note
ConfigMap names and keys can contain characters that are not supported by JMESPath, such as “-” (minus or dash) or “/” (slash). To evaluate these characters as literals, add double quotes to that part of the JMESPath expression as follows:
{{ "<name>".data."<key>" }}
See the JMESPath page for more information on formatting concerns.
Handling ConfigMap Array Values
In addition to simple string values, Kyverno has the ability to consume array values from a ConfigMap.
Note
Storing array values in a YAML block scalar was removed as of Kyverno 1.7.0. Please use a JSON-encoded array of strings instead.
For example, let’s say you wanted to define a list of allowed roles in a ConfigMap. A Kyverno policy can refer to this list to deny a request where the role, defined as an annotation, does not match one of the values in the list.
Consider a ConfigMap with the following content, where the list is written as a JSON-encoded array of strings.
apiVersion: v1
kind: ConfigMap
metadata:
  name: roles-dictionary
  namespace: default
data:
  allowed-roles: "[\"cluster-admin\", \"cluster-operator\", \"tenant-admin\"]"
Note
As mentioned previously, certain characters must be escaped for JMESPath processing. In this case, the backslash ("\") character is used to escape the double quotes which allow the ConfigMap data to be stored as a JSON array.
Now that the array data is saved in the allowed-roles key, here is a sample ClusterPolicy containing a single rule that blocks a Deployment if the value of the annotation named role is not in the allowed list:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cm-array-example
spec:
  validationFailureAction: enforce
  background: false
  rules:
  - name: validate-role-annotation
    context:
      - name: roles-dictionary
        configMap:
          name: roles-dictionary
          namespace: default
    match:
      any:
      - resources:
          kinds:
          - Deployment
    validate:
      message: "The role {{ request.object.metadata.annotations.role }} is not in the allowed list of roles: {{ \"roles-dictionary\".data.\"allowed-roles\" }}."
      deny:
        conditions:
          any:
          - key: "{{ request.object.metadata.annotations.role }}"
            operator: NotIn
            value: "{{ \"roles-dictionary\".data.\"allowed-roles\" }}"
This rule denies the request for a new Deployment if the annotation role is not found in the array we defined in the earlier ConfigMap named roles-dictionary.
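The membership test behind this rule can be sketched outside Kyverno. The following is a minimal Python illustration, not Kyverno's implementation: the dictionary configmap_data and the function role_is_denied are hypothetical names standing in for the ConfigMap's data section and the deny condition.

```python
import json

# Hypothetical stand-in for the data section of the roles-dictionary ConfigMap.
configmap_data = {
    "allowed-roles": '["cluster-admin", "cluster-operator", "tenant-admin"]',
}


def role_is_denied(role: str, data: dict) -> bool:
    """Return True when the role is absent from the decoded allow-list."""
    allowed = json.loads(data["allowed-roles"])  # JSON string -> list of strings
    return role not in allowed


print(role_is_denied("super-user", configmap_data))    # True
print(role_is_denied("tenant-admin", configmap_data))  # False
```

The point of the sketch is that the ConfigMap value is a single string, and it only behaves as a list once it has been decoded as JSON.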
Note
You may also notice that this sample uses variables from both AdmissionReview and ConfigMap sources in a single rule. This combination can prove to be very powerful and flexible in crafting useful policies.
After creating this sample ClusterPolicy, attempt to create a new Deployment where the annotation role=super-user and test the result.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  annotations:
    role: super-user
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
Submit the manifest and see how Kyverno reacts.
kubectl create -f deploy.yaml
Error from server: error when creating "deploy.yaml": admission webhook "validate.kyverno.svc" denied the request:

resource Deployment/default/busybox was blocked due to the following policies

cm-array-example:
  validate-role-annotation: 'The role super-user is not in the allowed list of roles: ["cluster-admin", "cluster-operator", "tenant-admin"].'
Changing the role annotation to one of the values present in the ConfigMap, for example tenant-admin, allows the Deployment resource to be created.
Variables from Kubernetes API Server Calls
Kubernetes is powered by a declarative API that allows querying and manipulating resources. Kyverno policies can use the Kubernetes API to fetch a resource, or even collections of resource types, for use in a policy. Additionally, Kyverno allows applying JMESPath, a query language for JSON, to the resource data to extract and transform values into a format that is easy to use within a policy.
A Kyverno Kubernetes API call works just as with kubectl and other API clients, and can be tested using existing tools.
For example, here is a command line that uses kubectl to fetch the list of Pods in a Namespace and then pipes the output to kyverno jp which counts the number of Pods:
kubectl get --raw /api/v1/namespaces/kyverno/pods | kyverno jp "items | length(@)"
The corresponding API call in Kyverno is defined as below. It uses the variable {{request.namespace}} to select the Namespace of the object being operated on, and then applies the same JMESPath expression to store the count of Pods in that Namespace in the context as the variable podCount. Variables may be used in both fields. The resulting variable podCount can then be used in the policy rule.
rules:
- name: example-api-call
  context:
  - name: podCount
    apiCall:
      urlPath: "/api/v1/namespaces/{{request.namespace}}/pods"
      jmesPath: "items | length(@)"
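The two steps of this context entry, resolving the URL path and then reducing the response to a count, can be sketched in plain Python. This is a simplified illustration under stated assumptions: the substitute helper does a flat name lookup rather than full JMESPath evaluation, and the variables and response values are hypothetical sample data.

```python
import re

# Hypothetical stand-ins for the admission request and the API server response.
variables = {"request.namespace": "kyverno"}
response = {"items": [{"metadata": {"name": "pod-a"}},
                      {"metadata": {"name": "pod-b"}}]}


def substitute(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with its value (flat lookup only)."""
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}",
                  lambda match: str(values[match.group(1)]),
                  template)


resolved_path = substitute("/api/v1/namespaces/{{request.namespace}}/pods", variables)
pod_count = len(response["items"])  # same result as the JMESPath "items | length(@)"

print(resolved_path)  # /api/v1/namespaces/kyverno/pods
print(pod_count)      # 2
```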
URL Paths
The Kubernetes API organizes resources under groups and versions. For example, the resource type Deployment is available in the API Group apps with a version v1.
The HTTP URL paths of the API calls are based on the group, version, and resource type as follows:
/apis/{GROUP}/{VERSION}/{RESOURCETYPE}: get a collection of resources
/apis/{GROUP}/{VERSION}/{RESOURCETYPE}/{NAME}: get a resource
For namespaced resources, to get a specific resource by name or to get all resources in a Namespace, the Namespace name must also be provided as follows:
/apis/{GROUP}/{VERSION}/namespaces/{NAMESPACE}/{RESOURCETYPE}: get a collection of resources in the namespace
/apis/{GROUP}/{VERSION}/namespaces/{NAMESPACE}/{RESOURCETYPE}/{NAME}: get a resource in a namespace
For historic reasons, the Kubernetes Core API is available under /api/v1. For example, to query all Namespace resources the path /api/v1/namespaces is used.
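The path rules above can be collected into one small helper. This is an illustrative sketch only: build_url_path is a hypothetical function, not part of Kyverno or kubectl, and it assumes the core API group is passed as an empty string.

```python
from typing import Optional


def build_url_path(group: str, version: str, resource: str,
                   namespace: Optional[str] = None,
                   name: Optional[str] = None) -> str:
    """Assemble a Kubernetes API URL path from its parts."""
    # The core group has no group segment and lives under /api instead of /apis.
    parts = ["api", version] if group == "" else ["apis", group, version]
    if namespace is not None:
        parts += ["namespaces", namespace]
    parts.append(resource)
    if name is not None:
        parts.append(name)
    return "/" + "/".join(parts)


print(build_url_path("apps", "v1", "deployments"))
# /apis/apps/v1/deployments
print(build_url_path("apps", "v1", "deployments", namespace="default", name="busybox"))
# /apis/apps/v1/namespaces/default/deployments/busybox
print(build_url_path("", "v1", "namespaces"))
# /api/v1/namespaces
```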
The Kubernetes API groups are defined in the API reference documentation for v1.22 and can also be retrieved via the kubectl api-resources command shown below:
$ kubectl api-resources
NAME                     SHORTNAMES   APIGROUP   NAMESPACED   KIND
bindings                                         true         Binding
componentstatuses        cs                      false        ComponentStatus
configmaps               cm                      true         ConfigMap
endpoints                ep                      true         Endpoints
events                   ev                      true         Event
limitranges              limits                  true         LimitRange
namespaces               ns                      false        Namespace
nodes                    no                      false        Node
persistentvolumeclaims   pvc                     true         PersistentVolumeClaim

...
The kubectl api-versions command prints out the available versions for each API group. Here is a sample:
$ kubectl api-versions
admissionregistration.k8s.io/v1
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
...
You can use these commands together to find the URL path for resources, as shown below:
Tip
To find the API group and version for a resource, use kubectl api-resources to find the group and then kubectl api-versions to find the available versions.
This example finds the group of Deployment resources and then queries the version:
kubectl api-resources | grep deploy
The API group is shown in the third column of the output. You can then use the group name to find the version:
kubectl api-versions | grep apps
The output of this will be apps/v1. Older versions of Kubernetes (prior to 1.18) will also show apps/v1beta2.
Handling collections
The API server response for an HTTP GET on a URL path that requests collections of resources will be an object with a list of items (resources).
Here is an example that fetches all Namespace resources:
kubectl get --raw /api/v1/namespaces | jq
Tip
Use jq to format output for readability.
This will return a NamespaceList object with a property items that contains the list of Namespaces:
{
  "kind": "NamespaceList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/namespaces",
    "resourceVersion": "2009258"
  },
  "items": [
    {
      "metadata": {
        "name": "default",
        "selfLink": "/api/v1/namespaces/default",
        "uid": "5011b5d5-abb7-4fef-93f9-8b5fa4b2eba9",
        "resourceVersion": "155",
        "creationTimestamp": "2021-01-19T20:20:37Z",
        "managedFields": [
          {
            "manager": "kube-apiserver",
            "operation": "Update",
            "apiVersion": "v1",
            "time": "2021-01-19T20:20:37Z",
            "fieldsType": "FieldsV1",
            "fieldsV1": {
              "f:status": {
                "f:phase": {}
              }
            }
          }
        ]
      },
      "spec": {
        "finalizers": [
          "kubernetes"
        ]
      },
      "status": {
        "phase": "Active"
      }
    },
    ...
To process this data in JMESPath, reference the items. Here is an example which extracts a few metadata fields across all Namespace resources:
kubectl get --raw /api/v1/namespaces | kyverno jp "items[*].{name: metadata.name, creationTime: metadata.creationTimestamp}"
This produces a new JSON list of objects with properties name and creationTime.
[
  {
    "creationTime": "2021-01-19T20:20:37Z",
    "name": "default"
  },
  {
    "creationTime": "2021-01-19T20:20:36Z",
    "name": "kube-node-lease"
  },
  ...
To find an item in the list you can use JMESPath filters. For example, this command will match a Namespace by its name:
kubectl get --raw /api/v1/namespaces | kyverno jp "items[?metadata.name == 'default'].{uid: metadata.uid, creationTimestamp: metadata.creationTimestamp}"
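For readers less familiar with JMESPath, the projection and the filter used in this section can be read as ordinary list comprehensions. The Python below is only an analogy for what the two expressions compute; namespace_list is a trimmed, partly hypothetical sample of the response.

```python
# Trimmed sample of a NamespaceList response; most fields are omitted.
namespace_list = {
    "items": [
        {"metadata": {"name": "default",
                      "uid": "5011b5d5-abb7-4fef-93f9-8b5fa4b2eba9",
                      "creationTimestamp": "2021-01-19T20:20:37Z"}},
        {"metadata": {"name": "kube-node-lease",
                      "creationTimestamp": "2021-01-19T20:20:36Z"}},
    ]
}

# items[*].{name: metadata.name, creationTime: metadata.creationTimestamp}
projected = [
    {"name": item["metadata"]["name"],
     "creationTime": item["metadata"]["creationTimestamp"]}
    for item in namespace_list["items"]
]

# items[?metadata.name == 'default'].{uid: metadata.uid, creationTimestamp: metadata.creationTimestamp}
matched = [
    {"uid": item["metadata"].get("uid"),
     "creationTimestamp": item["metadata"]["creationTimestamp"]}
    for item in namespace_list["items"]
    if item["metadata"]["name"] == "default"
]

print(projected)
print(matched)
```

The keys of each resulting object are the names chosen on the left of each colon in the expression, not the names of the source fields.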
In addition to wildcards and filters, JMESPath has many additional powerful features including several useful functions. Be sure to go through the JMESPath tutorial and try the interactive examples in addition to the Kyverno JMESPath page.
Sample Policy: Limit Services of type LoadBalancer in a Namespace
Here is a complete sample policy that limits each namespace to a single service of type LoadBalancer.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: limits
spec:
  validationFailureAction: enforce
  rules:
  - name: limit-lb-svc
    match:
      any:
      - resources:
          kinds:
          - Service
    context:
    - name: serviceCount
      apiCall:
        urlPath: "/api/v1/namespaces/{{ request.namespace }}/services"
        jmesPath: "items[?spec.type == 'LoadBalancer'] | length(@)"
    preconditions:
      any:
      - key: "{{ request.operation }}"
        operator: Equals
        value: CREATE
    validate:
      message: "Only one LoadBalancer service is allowed per namespace"
      deny:
        conditions:
          any:
          - key: "{{ serviceCount }}"
            operator: GreaterThan
            value: 1
This sample policy retrieves the list of Services in the Namespace and stores the count of type LoadBalancer in a variable called serviceCount. A deny rule is used to ensure that the count cannot exceed one.
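As a rough illustration of the value that ends up in serviceCount, the same filter-and-count step can be written in Python. The services data is hypothetical, and the comparison merely restates the deny condition as written in the policy; it does not model how Kyverno evaluates operators.

```python
# Hypothetical Services in one Namespace, reduced to the fields the policy reads.
services = {
    "items": [
        {"metadata": {"name": "web"}, "spec": {"type": "LoadBalancer"}},
        {"metadata": {"name": "db"}, "spec": {"type": "ClusterIP"}},
        {"metadata": {"name": "api"}, "spec": {"type": "LoadBalancer"}},
    ]
}


def load_balancer_count(service_list: dict) -> int:
    """Mirror items[?spec.type == 'LoadBalancer'] | length(@)."""
    return sum(1 for svc in service_list["items"]
               if svc.get("spec", {}).get("type") == "LoadBalancer")


service_count = load_balancer_count(services)
denied = service_count > 1  # the deny condition: serviceCount GreaterThan 1

print(service_count, denied)  # 2 True
```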
Variables from Image Registries
A context can also be used to store metadata on an OCI image by using the imageRegistry context type. By using this external data source, a Kyverno policy can make decisions based on details of the container image that occurs as part of an incoming resource.
For example, if you are using an imageRegistry context like the one shown below:
context:
- name: imageData
  imageRegistry:
    reference: "ghcr.io/kyverno/kyverno"
the output imageData variable will have a structure which looks like the following:
{
  "image": "ghcr.io/kyverno/kyverno",
  "resolvedImage": "ghcr.io/kyverno/kyverno@sha256:17bfcdf276ce2cec0236e069f0ad6b3536c653c73dbeba59405334c0d3b51ecb",
  "registry": "ghcr.io",
  "repository": "kyverno/kyverno",
  "identifier": "latest",
  "manifest": manifest,
  "configData": config
}
Note
The imageData variable represents a “normalized” view of an image after any redirects by the registry are performed and internal modifications by Kyverno (Kyverno by default sets an empty registry to docker.io and an empty tag to latest). Most notably, this impacts official images hosted on Docker Hub. Official images on Docker Hub are differentiated from other images in that their repository is prefixed by library/ even if the image being pulled does not contain it. For example, pulling the python official image with python:slim results in the following fields of imageData being set:
{
  "image": "docker.io/python:slim",
  "resolvedImage": "index.docker.io/library/python@sha256:43705a7d3a22c5b954ed4bd8db073698522128cf2aaec07690a34aab59c65066",
  "registry": "index.docker.io",
  "repository": "library/python",
  "identifier": "slim"
}
The manifest and configData keys contain the output from crane manifest <image> and crane config <image> respectively.
For example, one could inspect the labels, entrypoint, volumes, history, layers, etc. of a given image. Using the crane tool, show the config of the ghcr.io/kyverno/kyverno:latest image:
$ crane config ghcr.io/kyverno/kyverno:latest | jq
{
  "architecture": "amd64",
  "config": {
    "User": "10001",
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Entrypoint": [
      "./kyverno"
    ],
    "WorkingDir": "/",
    "Labels": {
      "maintainer": "Kyverno"
    },
    "OnBuild": null
  },
  "created": "2022-02-04T08:57:38.818583756Z",
  "history": [
    {
      "created": "2022-02-04T08:57:38.454742161Z",
      "created_by": "LABEL maintainer=Kyverno",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    },
    {
      "created": "2022-02-04T08:57:38.454742161Z",
      "created_by": "COPY /output/kyverno / # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2022-02-04T08:57:38.802069102Z",
      "created_by": "COPY /etc/passwd /etc/passwd # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2022-02-04T08:57:38.818583756Z",
      "created_by": "COPY /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2022-02-04T08:57:38.818583756Z",
      "created_by": "USER 10001",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    },
    {
      "created": "2022-02-04T08:57:38.818583756Z",
      "created_by": "ENTRYPOINT [\"./kyverno\"]",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    }
  ],
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:180b308b8730567d2d06a342148e1e9d274c8db84113077cfd0104a7e68db646",
      "sha256:99187eab8264c714d0c260ae8b727c4d2bda3a9962635aaea67d04d0f8b0f466",
      "sha256:26d825f3d198779c4990007ae907ba21e7c7b6213a7eb78d908122e435ec9958"
    ]
  }
}
In the output above, we can see under config.User that the user set by the USER Dockerfile statement for this container is 10001. A Kyverno policy can be written to harness this information and perform, for example, a validation that the USER of an image is non-root.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: imageref-demo
spec:
  validationFailureAction: enforce
  rules:
  - name: no-root-images
    match:
      any:
      - resources:
          kinds:
          - Pod
    preconditions:
      all:
      - key: "{{request.operation}}"
        operator: NotEquals
        value: DELETE
    validate:
      message: "Images run as root are not allowed."
      foreach:
      - list: "request.object.spec.containers"
        context:
        - name: imageData
          imageRegistry:
            reference: "{{ element.image }}"
        deny:
          conditions:
            any:
            - key: "{{ imageData.configData.config.User || ''}}"
              operator: Equals
              value: ""
In the above sample policy, a new context has been written named imageData which uses the imageRegistry type. The reference key is used to instruct Kyverno where the image metadata is stored. In this case, the location is the same as the image itself, hence element.image, where element is each container inside of a Pod. The value can then be referenced in an expression, for example in deny.conditions via the key {{ imageData.configData.config.User || ''}}.
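The effect of the || '' fallback in that key can be sketched in Python. This is an analogy under stated assumptions rather than Kyverno's evaluation logic: user_is_unset is a hypothetical helper, and both sample dictionaries are cut down to the single field the condition reads.

```python
def user_is_unset(image_data: dict) -> bool:
    """Mirror the condition: (imageData.configData.config.User || '') Equals ''."""
    config = (image_data.get("configData") or {}).get("config") or {}
    # Like the JMESPath || operator, fall back to '' when User is missing or empty.
    user = config.get("User") or ""
    return user == ""


# Reduced samples: one image that sets USER and one that does not (hypothetical).
kyverno_image = {"configData": {"config": {"User": "10001"}}}
no_user_image = {"configData": {"config": {"Env": ["PATH=/usr/bin"]}}}

print(user_is_unset(kyverno_image))  # False, so the deny condition does not match
print(user_is_unset(no_user_image))  # True, so the Pod would be denied
```

Without the fallback, an image whose config has no User field would yield a null value rather than an empty string, which is why the expression supplies a default before comparing.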
Testing with a sample “bad” resource which violates this policy, such as the one below, shows that the Pod is blocked.
apiVersion: v1
kind: Pod
metadata:
  name: badpod
spec:
  containers:
  - name: ubuntu
    image: ubuntu:latest
$ kubectl apply -f bad.yaml
Error from server: error when creating "bad.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:

resource Pod/default/badpod was blocked due to the following policies

imageref-demo:
  no-root-images: 'validation failure: Images run as root are not allowed.'
By contrast, when using a “good” Pod, such as one running the Kyverno container image referenced above, the resource is allowed.
apiVersion: v1
kind: Pod
metadata:
  name: goodpod
spec:
  containers:
  - name: kyverno
    image: ghcr.io/kyverno/kyverno:latest
$ kubectl apply -f good.yaml
pod/goodpod created
The imageRegistry context type also has an optional property called jmesPath which can be used to apply a JMESPath expression to contents returned by imageRegistry prior to storing as the context value. For example, the below snippet stores the total size of an image in a context named imageSize by summing up all the constituent layers of the image as reported by its manifest (visible with, for example, crane by using the crane manifest command). The value of the context variable can then be evaluated in a later expression.
context:
  - name: imageSize
    imageRegistry:
      reference: "{{ element.image }}"
      # Note that we need to use `to_string` here to allow kyverno to treat it like a resource quantity of type memory
      # the total size of an image as calculated by docker is the total sum of its layer sizes
      jmesPath: "to_string(sum(manifest.layers[*].size))"
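To make the arithmetic concrete, here is a small Python sketch of what the expression computes. The manifest and its layer sizes are made-up sample values, and image_size is a hypothetical helper name.

```python
# Hypothetical manifest with made-up layer sizes in bytes.
manifest = {"layers": [{"size": 2000000}, {"size": 500000}, {"size": 1234}]}


def image_size(image_manifest: dict) -> str:
    """Mirror to_string(sum(manifest.layers[*].size))."""
    total = sum(layer["size"] for layer in image_manifest["layers"])
    return str(total)  # returned as a string, as to_string does in the expression


print(image_size(manifest))  # 2501234
```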
To access images stored on private registries, see using private registries.
For more examples of using an imageRegistry context, see the samples page.