In the earlier, if not the earliest, days of Kubernetes, the general wisdom was to run only ephemeral workloads on Kubernetes clusters. This was with good reason, as support for persistent workloads was not yet mature.
Today, with the work the community has put into CloudNativePG (CNPG), it has become much more feasible to run production PostgreSQL databases in Kubernetes. I will go over the setup, backup and restore, migration, and overall maintenance of CNPG clusters in Kubernetes.
One of the main benefits of running your entire stack in Kubernetes on open source systems is avoiding vendor lock-in. When your whole stack is deployed on Kubernetes with the right practices, it becomes very easy to switch cloud providers or even move to your own on-premise or self-hosted servers. The trade-off is really one of convenience vs. flexibility. Cloud-managed RDBMSs provide convenient setup and are often a very good choice if you are just starting out and need a simple way to have a database. A self-managed database on your own cluster offers flexibility and full control over your stack, with the added burden of managing the actual database.
Setting Up
There are two components required to run CNPG clusters in a Kubernetes cluster: the CNPG Operator and the CNPG Cluster.
The Operator installs the CRDs and other related resources that allow a CNPG Cluster to be deployed. The Cluster is where the actual database is deployed.
Before beginning, it is always a good idea to have at least a quick read through the documentation.
CNPG Operator
The simplest way to deploy the CNPG Operator is to use the helm chart.
helm repo add cnpg https://cloudnative-pg.github.io/charts
helm upgrade --install cnpg \
--namespace cnpg-system \
--create-namespace \
cnpg/cloudnative-pg
CNPG Cluster
There is also a helm chart for the cluster. It is a great option so long as your setup follows what the chart expects. For my use-case I decided to set up my Clusters manually, since I wanted a more minimal setup and wanted to learn how each component of the cluster works.
A sample cluster configuration:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-cluster-name
  namespace: my-namespace
spec:
  instances: 3
  # PostgreSQL configuration
  postgresql:
    parameters:
      max_connections: "100"
      shared_buffers: "128MB"
      effective_cache_size: "512MB"
      log_statement: "all"
      log_min_duration_statement: "1000"
  # Service account configuration for permissions
  # Most cloud providers use annotations and labels
  # for setting permissions
  serviceAccountTemplate:
    metadata:
      annotations: {}
  # Pod template configuration
  inheritedMetadata:
    labels: {}
  # Setting resources for the PG instances
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  # Node affinity and tolerations
  affinity:
    nodeSelector: {}
    # Assumes you have nodes set up with the database taint.
    # This allows CNPG pods to be scheduled on those nodes.
    tolerations:
      - key: workload
        operator: Equal
        value: database
        effect: NoSchedule
  # Topology spread constraints for AZ distribution
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          cnpg.io/cluster: my-cluster-name
  # Storage configuration
  storage:
    size: 10Gi
    storageClass: my-database-storage-class
  # Plugin configuration for backups
  plugins:
    - name: barman-cloud.cloudnative-pg.io
      isWALArchiver: true
      parameters:
        barmanObjectName: my-backup-store
  # Monitoring
  monitoring:
    # Exports database metrics to Prometheus
    enablePodMonitor: true
    customQueriesConfigMap:
      - name: cnpg-default-monitoring
        key: queries
  # Bootstrap configuration
  # Only run on a brand new Cluster
  # Once there is existing data in the PVC,
  # this config is ignored.
  bootstrap:
    initdb:
      database: my_database_name
      owner: app_user
      secret:
        name: secret-with-database-password
      postInitApplicationSQL:
        # Note: initdb above already creates my_database_name owned
        # by app_user, so there is no need to CREATE DATABASE here.
        - "ALTER USER app_user CREATEDB;"
Since we are using the Barman plugin for backups and WAL archiving, we also need to install the Barman Cloud CNPG-I plugin and then set up the Barman ObjectStore.
apiVersion: barmancloud.cnpg.io/v1
kind: ObjectStore
metadata:
  name: my-backup-store
  namespace: my-namespace
spec:
  configuration:
    destinationPath: ""
    # Insert the credentials configs based on your cloud provider
    # or your self-hosted setup
    wal:
      maxParallel: 8
    data:
      jobs: 2
  retentionPolicy: "30d"
With this object store in place, WAL files are archived continuously, and the retention policy automatically prunes backups and WAL files older than 30 days.
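Base backups themselves still have to be requested explicitly; the usual approach is a ScheduledBackup resource pointing at the cluster. A minimal sketch, assuming the plugin-based backup method (the resource name and the six-field cron schedule here are my own choices):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: my-cluster-daily-backup
  namespace: my-namespace
spec:
  # Six-field cron expression (with seconds): daily at midnight
  schedule: "0 0 0 * * *"
  cluster:
    name: my-cluster-name
  # Delegate the backup to the Barman Cloud plugin configured above
  method: plugin
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
```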
With this setup, you get a highly available three-instance cluster spread across AZs. CNPG automatically manages one primary and two replica instances, and handles failover if the primary goes down.
The database is accessible for read-write connections at my-cluster-name-rw.my-namespace, and for read-only connections at my-cluster-name-ro.my-namespace.
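For applications, CNPG also generates a connection secret for the initdb user, named after the cluster with an -app suffix. A sketch of consuming it from a container spec in a Deployment (the env var name is my own choice):

```yaml
# Container env snippet; my-cluster-name-app is the secret CNPG
# generates for the app_user created during the initdb bootstrap.
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: my-cluster-name-app
        key: uri  # full PostgreSQL connection URI
```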
Multi-Cloud Setups
I had the opportunity to move my database between cloud providers, from AWS to Azure. Since I was already running CNPG on both Kubernetes clusters, it was a simple process of standing up the new cluster in Azure in recovery mode, pointing it at the backup object store of the existing cluster.
bootstrap:
  recovery:
    source: s3-backup
    database: database_name
    owner: app_user
replica:
  enabled: true
  source: s3-backup
externalClusters:
  - name: s3-backup
    barmanObjectStore:
      serverName: aws-db
      destinationPath: s3://backup-bucket/database_name
      endpointURL: https://s3.us-east-1.amazonaws.com
      s3Credentials:
        accessKeyId:
          name: credential-secret-name
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: credential-secret-name
          key: ACCESS_SECRET_KEY
      wal:
        maxParallel: 8
This starts the database in replica mode, streaming changes from the S3 bucket. It continues doing so for as long as replica.enabled is true. To promote the cluster, just set replica.enabled to false and CNPG will automatically handle the promotion.
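Concretely, promotion is just flipping that one flag in the replica cluster's spec:

```yaml
replica:
  enabled: false  # was true; flipping this promotes the cluster to primary
  source: s3-backup
```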
Multi-Region/Cloud CNPG Database
It is now a relatively simple process to run a CNPG database across multiple regions or cloud providers. In case the primary region or the primary cloud goes down, switching to the replica is a matter of changing its configuration to replica.enabled=false.
Disaster recovery now hinges on how quickly you can detect the primary failing and switch over. The usual concerns about data loss still apply. This does not somehow make your database more resilient than other deployment options; it just reduces your vendor lock-in.