Monitoring

Monitoring Instances

For each PostgreSQL instance, the operator provides a Prometheus metrics exporter, served via HTTP on port 9187 (the port is named metrics). The operator comes with a predefined set of metrics and a configurable mechanism for defining additional queries via one or more ConfigMap or Secret resources.

Metrics can be accessed as follows:

curl http://<pod_ip>:9187/metrics

All monitoring queries are:

  • transactionally atomic (one transaction per query)
  • executed with the pg_monitor role
  • executed with application_name set to cnp_metrics_exporter
  • executed as user postgres

Please refer to the "Default roles" section of the PostgreSQL documentation for details on the pg_monitor role.

Currently, metric queries can be run against a single database only. The database is chosen according to the bootstrap method specified in the Cluster resource:

  • using initdb: queries are run against the database specified in initdb.database, or against app if none is specified.
  • not using initdb: queries are run against the postgres database.
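
As an illustration of the first case, the following is a minimal sketch of a Cluster that bootstraps with initdb and names its database explicitly. The database name inventory is hypothetical, and the sketch assumes the field sits at .spec.bootstrap.initdb.database. Under the logic above, metric queries for this cluster would run against inventory rather than app:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3

  storage:
    size: 1Gi

  bootstrap:
    initdb:
      # Hypothetical database name; when this key is omitted,
      # the database defaults to app
      database: inventory
```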

Note

This behaviour will be improved starting from the next version of Cloud Native PostgreSQL.

Prometheus Operator example

A specific PostgreSQL cluster can be monitored using the Prometheus Operator by defining the following PodMonitor resource:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cluster-example
spec:
  selector:
    matchLabels:
      postgresql: cluster-example
  podMetricsEndpoints:
  - port: metrics

Important

Make sure you adapt the example above: give the PodMonitor a unique name, and set the namespace and labels that match your cluster (the example uses cluster-example).

User defined metrics

This feature is currently in beta state and the format is inspired by the queries.yaml file of the PostgreSQL Prometheus Exporter.

Users can define custom metrics by creating a ConfigMap or Secret and referencing it in the Cluster definition under .spec.monitoring.customQueriesConfigMap or .spec.monitoring.customQueriesSecret, as in the following example:

apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example
  namespace: test
spec:
  instances: 3

  storage:
    size: 1Gi

  monitoring:
    customQueriesConfigMap:
      - name: example-monitoring
        key: custom-queries

The customQueriesConfigMap and customQueriesSecret sections each contain a list of ConfigMap or Secret references, each specifying the key under which the custom queries are defined. The referenced resources must be created in the same namespace as the Cluster resource.
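
The same pattern should apply when the queries are stored in a Secret. The following is a minimal sketch, assuming that customQueriesSecret entries take the same name and key fields as the ConfigMap reference shown above. The Secret name example-monitoring-secret and the pg_connections query are hypothetical and serve only as an illustration:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-monitoring-secret   # hypothetical name
  namespace: test                   # same namespace as the Cluster
stringData:
  custom-queries: |
    pg_connections:
      query: "SELECT count(*) AS total FROM pg_stat_activity"
      metrics:
        - total:
            usage: "GAUGE"
            description: "Number of rows in pg_stat_activity"
---
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example
  namespace: test
spec:
  instances: 3

  storage:
    size: 1Gi

  monitoring:
    customQueriesSecret:
      # Assumed to mirror the customQueriesConfigMap reference format
      - name: example-monitoring-secret
        key: custom-queries
```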

Example of user defined metric

Here you can see an example of a ConfigMap containing a single custom query, referenced by the Cluster example above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: example-monitoring
  namespace: test
data:
  custom-queries: |
    pg_replication:
      query: "SELECT CASE WHEN NOT pg_is_in_recovery()
              THEN 0
              ELSE GREATEST (0,
                EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
              END AS lag"
      metrics:
        - lag:
            usage: "GAUGE"
            description: "Replication lag behind primary in seconds"

A list of basic monitoring queries can be found in the cnp-basic-monitoring.yaml file.

Structure of a user defined metric

Every custom query has the following basic structure:

<MetricName>:
  query: "<SQLQuery>"
  metrics:
    - <ColumnName>:
        usage: "<MetricType>"
        description: "<MetricDescription>"

Here is a short description of all the available fields:

  • <MetricName>: the name of the Prometheus metric
    • query: the SQL query to run on the target database to generate the metrics
    • primary: whether to run the query only on the primary instance
    • master: same as primary (for compatibility with the Prometheus PostgreSQL exporter's syntax - deprecated)
    • metrics: section containing a list of all exported columns, defined as follows:
      • <ColumnName>: the name of the column returned by the query
        • usage: one of the values described below
        • description: the metric's description
        • metrics_mapping: the optional column mapping when usage is set to MAPPEDMETRIC

The possible values for usage are:

Column Usage Label   Description
DISCARD              this column should be ignored
LABEL                use this column as a label
COUNTER              use this column as a counter
GAUGE                use this column as a gauge
MAPPEDMETRIC         use this column with the supplied mapping of text values
DURATION             use this column as a text duration (in milliseconds)
HISTOGRAM            use this column as a histogram

Please visit the "Metric Types" page from the Prometheus documentation for more information.
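
To show how a LABEL column and the primary field fit together, here is a minimal sketch of a custom query that reports the size of each database. The metric name pg_database and the column alias size_bytes are illustrative choices, not part of the predefined metrics, and the sketch assumes that primary accepts a boolean value:

```yaml
pg_database:
  query: "SELECT datname, pg_database_size(datname) AS size_bytes FROM pg_database"
  # Assumption: a boolean; run this query on the primary instance only
  primary: true
  metrics:
    - datname:
        usage: "LABEL"
        description: "Name of the database"
    - size_bytes:
        usage: "GAUGE"
        description: "Disk space used by the database in bytes"
```

Given the cnp_<MetricName>_<ColumnName> naming used by the exporter, each row returned by this query would be expected to appear as a sample named cnp_pg_database_size_bytes carrying a datname label.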

Output of a user defined metric

Custom metrics are returned by the Prometheus exporter endpoint (:9187/metrics) in the following format:

cnp_<MetricName>_<ColumnName>{<LabelColumnName>=<LabelColumnValue> ... } <ColumnValue>

Note

<LabelColumnName> is the name of a column whose usage is set to LABEL, and <LabelColumnValue> is the value of that column.

Considering the pg_replication example above, the exporter's endpoint would return the following output when invoked:

# HELP cnp_pg_replication_lag Replication lag behind primary in seconds
# TYPE cnp_pg_replication_lag gauge
cnp_pg_replication_lag 0

Differences with the Prometheus Postgres exporter

Cloud Native PostgreSQL's exporter is inspired by the PostgreSQL Prometheus Exporter, but differs from it in some respects. In particular, the following metric fields defined in the PostgreSQL Prometheus Exporter are not implemented in Cloud Native PostgreSQL's exporter:

  • cache_seconds: number of seconds to cache the result of the query
  • runonserver: a semantic version range to limit the versions of PostgreSQL the query should run on (e.g. ">=10.0.0")

Similarly, the pg_version field of a column definition is not implemented.

Monitoring the operator

The operator internally exposes Prometheus metrics via HTTP on port 8080 (the port is named metrics).

Metrics can be accessed as follows:

curl http://<pod_ip>:8080/metrics

Currently, the operator exposes the default kubebuilder metrics; see the kubebuilder documentation for more details.

Prometheus Operator example

The operator deployment can be monitored using the Prometheus Operator by defining the following PodMonitor resource:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: postgresql-operator-controller-manager
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cloud-native-postgresql
  podMetricsEndpoints:
    - port: metrics