Production Hardening Guide

The table below lists checks that must be performed before the actual rollout of a deployment. The list is not exhaustive; system integrators (SIs) are expected to use it as a reference to augment their own hardening procedures.

Topic / Tasks
  • Change log level to INFO in application properties.

  • Disable registration processor External Stage if not required.

  • Review the reprocessor cronjob frequency and other settings.

  • Set all cronjob timings according to the country (check the property files).

  • Disable '111111' default OTP.

  • Review idschema attribute names against the names in the Datashare policy and Auth policy for all partners (including IDA).

  • Review the attributes specified in ida-zero-knowledge-unencrypted-credential-attributes.

  • Review id-authentication-mapping.json in config against the attribute names in idschema.

  • Kafka: disable the option to delete a topic by setting deleteTopicEnable: false (this is set while installing Kafka).
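
Some of the property checks above can be partly automated. The following is a minimal sketch, not a definitive tool: it assumes Spring-style `logging.level.*` keys, and the key `example.default.otp` in the sample is hypothetical. Adjust the pattern to the key names actually used in your property files.

```python
import re

# Assumption: log levels are set through Spring-style "logging.level.*" keys.
LOG_LEVEL_KEY = re.compile(r"^\s*(logging\.level\.[\w.\-]+)\s*[=:]\s*(\w+)\s*$")
DEFAULT_OTP = "111111"


def audit_properties(text):
    """Return (line_number, message) findings for the text of one property file."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith(("#", "!")):
            continue
        match = LOG_LEVEL_KEY.match(line)
        if match and match.group(2).upper() in {"DEBUG", "TRACE"}:
            findings.append((number, f"{match.group(1)} is more verbose than INFO"))
        if DEFAULT_OTP in stripped:
            findings.append((number, "line mentions the default OTP 111111"))
    return findings


# Hypothetical sample content, for illustration only.
sample = """\
# sample only
logging.level.root=DEBUG
logging.level.org.example=INFO
example.default.otp=111111
"""

for number, message in audit_properties(sample):
    print(f"line {number}: {message}")
```

A finding is only a prompt for manual review; the literal 111111 can also appear in unrelated values.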

Backup

  • Set up backup for Longhorn.

  • Set up backup of the Postgres database.

  • Review the replication factor in MinIO.

Cluster hardening

  • On-prem K8s cluster production configuration as given here.

Archival

  • Archival of logs: since log data grows at a rapid pace, it needs to be archived frequently. Set up an archival process.

Keycloak

  • Keycloak realm connection timeout settings: review all.

  • Valid redirect URIs in Keycloak: set specific URLs.
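
The redirect URL review can be sketched as a small check over client definitions. This is an illustration under stated assumptions: the input is taken to be a list of Keycloak client representations with `clientId` and `redirectUris` fields, and the client names and URLs in the sample are hypothetical.

```python
import json


def wildcard_redirect_uris(clients):
    """Map each clientId to its redirect URIs that still contain a wildcard."""
    report = {}
    for client in clients:
        wildcards = [uri for uri in client.get("redirectUris", []) if "*" in uri]
        if wildcards:
            report[client.get("clientId", "<unknown>")] = wildcards
    return report


# Hypothetical sample data; real input would be the client list of a realm.
sample_clients = json.loads("""
[
  {"clientId": "example-portal", "redirectUris": ["*"]},
  {"clientId": "example-admin", "redirectUris": ["https://admin.example.org/callback"]}
]
""")

print(wildcard_redirect_uris(sample_clients))
```

Any client that appears in the report still accepts a non-specific redirect URL and should be narrowed down.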

Postgres

Access control

  • Multi-factor authentication for Rancher and Keycloak.

  • Review all WireGuard keys. Are all keys accounted for? Do the machines holding WireGuard keys have sufficient protection, such as firewalls and password or biometric login?

  • Are correct cluster roles assigned to users in Rancher? Is RBAC set properly?

  • Do the users of Rancher have strong passwords only known to them?

  • Are Rancher and Keycloak accessible only over WireGuard and not on the public internet?

  • Who holds the Keycloak Admin credentials? Are the credentials secure?

  • Are there any stray passwords lying on the disks?
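
A search for stray passwords can start from a simple pattern scan. The sketch below is an assumption-laden starting point rather than a complete secret scanner: the keyword list, the file suffixes, and the function names are all illustrative choices.

```python
import re
from pathlib import Path

# Assumption: stray credentials look like "<something>password = value" on one line.
SECRET_LINE = re.compile(r"(?i)(password|passwd|secret)\w*\s*[=:]\s*\S+")


def find_secret_lines(text):
    """Return 1-based numbers of lines that look like a credential assignment."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_LINE.search(line)
    ]


def scan_tree(root, suffixes=(".txt", ".properties", ".yaml", ".yml", ".sh")):
    """Scan text-like files under root; return {path: [line numbers]}."""
    hits = {}
    for path in Path(root).rglob("*"):
        if not (path.is_file() and path.suffix in suffixes):
            continue
        try:
            lines = find_secret_lines(path.read_text(errors="ignore"))
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if lines:
            hits[str(path)] = lines
    return hits
```

Calling `scan_tree` on a home or working directory gives a list of files to inspect by hand; expect false positives, and note that files with other suffixes are not covered.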

Cluster setup

  • Increase the number of nodes in the cluster according to expected load.

  • Set rate control (throttling) parameters for PreReg.

  • Set up scripts to clean up processed packets in the landing zone.

  • Review pod replication factors for all modules, e.g. ClamAV.

Persistence

  • Enable persistence in all modules. On cloud, change the reclaim policy of the storage class from 'Delete' to 'Retain'. If you already have a PV with the 'Delete' policy, you can edit the PV config and change it to 'Retain' (without having to change the storage class).

  • Make sure the storage class allows expansion of storage:

allowVolumeExpansion: true

  • Review the size of persistent volumes and update as needed.

  • Increase MinIO persistent volume size based on your estimations.

  • Review production settings of Longhorn.
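
The storage class checks above can be sketched as a function over a StorageClass manifest loaded into a dictionary. This is an illustration, not a definitive audit: the sample manifest, its name, and its provisioner are hypothetical, while `reclaimPolicy`, `allowVolumeExpansion`, and `persistentVolumeReclaimPolicy` are the standard Kubernetes field names.

```python
def storage_class_findings(storage_class):
    """Return findings for one StorageClass manifest given as a dict."""
    name = storage_class.get("metadata", {}).get("name", "<unnamed>")
    findings = []
    # Kubernetes defaults reclaimPolicy to Delete when the field is omitted.
    if storage_class.get("reclaimPolicy", "Delete") != "Retain":
        findings.append(f"{name}: reclaimPolicy is not Retain")
    if storage_class.get("allowVolumeExpansion") is not True:
        findings.append(f"{name}: allowVolumeExpansion is not true")
    return findings


# Patch body that switches an existing PV to Retain without changing its storage class.
PV_RETAIN_PATCH = {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}

# Hypothetical manifest, for illustration only.
sample = {
    "metadata": {"name": "example-sc"},
    "provisioner": "example.com/hypothetical",
    "reclaimPolicy": "Delete",
    "allowVolumeExpansion": True,
}

print(storage_class_findings(sample))
```

An empty result means both settings match the recommendations in this section; `PV_RETAIN_PATCH` shows the shape of the change for a PV that was already created with the 'Delete' policy.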
