Privacy and security are the highest priorities in MOSIP. This document details the measures that have been implemented in the platform so far. As an open-source project, we aim to continuously improve the security features and incorporate new developments through collaborations and community contributions. We use a variety of security tools to assess security, discover vulnerabilities, and address them.
MOSIP's approach to privacy and security is determined by the framework principles under which it operates.
Direct access to data stored in the database is not permitted - data is accessed via APIs only.
The Zero-Knowledge Administration principle is applied so that administrators can manage data without seeing the actual data.
The integrity of each database row is protected to prevent malicious tampering, such as swapping identities.
Revocable Virtual IDs and Tokens are used to thwart attempts to profile users.
Access controls are implemented on all APIs to ensure data privacy (who can see what).
All APIs support rate-limiting and are digitally signed.
All network channels are assumed to be 'dirty'.
Every artefact (including JSON data sent over API) is digitally signed.
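The text above does not say how row integrity is enforced. As a minimal sketch of one possible approach, assume a keyed hash (HMAC-SHA256) computed over all columns of a row, including its identifier, with the key held outside the database. The function names, column names, and key below are hypothetical and are not MOSIP APIs.

```python
import hashlib
import hmac
import json

def row_mac(key: bytes, row: dict) -> str:
    """Compute an HMAC-SHA256 tag over a canonical encoding of the row."""
    # sort_keys gives a stable byte encoding regardless of dict ordering.
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()

def verify_row(key: bytes, row: dict, tag: str) -> bool:
    """Return True only if the stored tag matches the row contents."""
    return hmac.compare_digest(row_mac(key, row), tag)

# Hypothetical data: the tag binds the identity data to its row id.
key = b"demo-key-held-outside-the-database"
alice = {"id": 1, "name": "Alice"}
bob = {"id": 2, "name": "Bob"}
alice_tag = row_mac(key, alice)

# Swapping Bob's data into Alice's row no longer matches Alice's tag.
swapped = {"id": 1, "name": "Bob"}
```

Because the row identifier is part of the tagged content, copying one resident's values into another resident's row is detected at verification time.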
MOSIP uses the following algorithms:
RSA OAEP 2048 bit minimum for all PKI-based encryption.
AES GCM 256-bit minimum for all Symmetric key encryption.
SHA256 is the standard hashing algorithm.
X509 V3 as the certificate standard.
FIPS 140-2 Level 3 is the minimum Hardware Security Module (HSM) standard.
PKCS11 is used for HSM communication.
As a principle, MOSIP does not use any encryption mechanism built into a database. All sensitive data to be stored in a DB is encrypted and decrypted outside the DB, at the application layer.
All sensitive data (which data counts as sensitive is configurable) is encrypted using a symmetric key algorithm. MOSIP uses the AES-256 algorithm by default.
Each cell is encrypted using its own symmetric key and the keys are selected randomly.
By default, we generate 10,000 symmetric keys for database encryption. This is a soft limit that can be increased.
The symmetric keys are encrypted using a master key in HSM.
Every key has an expiry and the application follows the expiry to update the data with new keys.
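The random selection of a key for each cell might look like the sketch below. The pool layout, the `KeyRecord` type, and the `pick_key` function are assumptions made for illustration. In a real deployment the key bytes would be wrapped by the HSM master key rather than held in plain form.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class KeyRecord:
    index: int            # position of the key in the pool
    wrapped_key: bytes    # stands in for a key wrapped by the HSM master key
    expires_at: datetime

def build_pool(size: int, now: datetime, lifetime: timedelta) -> list:
    """Generate `size` random 256-bit keys that all expire after `lifetime`."""
    return [KeyRecord(i, secrets.token_bytes(32), now + lifetime) for i in range(size)]

def pick_key(pool: list, now: datetime) -> KeyRecord:
    """Choose a random unexpired key for encrypting one cell."""
    active = [k for k in pool if k.expires_at > now]
    if not active:
        raise LookupError("no active keys in the pool")
    return secrets.choice(active)

now = datetime(2024, 1, 1)
pool = build_pool(10_000, now, timedelta(days=182))  # roughly 6 months
chosen = pick_key(pool, now)
```

The pool size of 10,000 mirrors the default stated above; the 182-day lifetime is only an approximation of the 6-month default.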
The Registration Client is used to collect all the personal and biometric information for MOSIP. The client is designed to work on TPM-compatible machines and follows these principles:
All machines are registered using the TPM identity keys. The public part of these keys is pre-whitelisted in the MOSIP database.
An SK is created from the SRK (the TPM Storage Root Key) to encrypt all the other keys used by MOSIP.
All local configurations are encrypted using the same mechanism.
A set of RSA keys (10,000 by default) is created for registration. These keys are generated in an HSM, and the public part of each key is embedded in the MOSIP registration client. These keys are used to encrypt the resident's data.
The registration data in its unencrypted form is held only in volatile memory and is never written to permanent storage or a filesystem.
MOSIP uses AES and RSA keys. Key expiry and key rotation are configurable parameters. The default values are as follows:
AES 256-bit keys - 6 months from the date of creation.
RSA 2048-bit encryption keys - 1 year from the date of creation.
RSA 2048-bit signature keys - 2 years from the date of creation.
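As an illustration of these defaults, the sketch below computes an expiry date from a creation date. The dictionary keys and function names are hypothetical, and clamping to the last day of a shorter month is an assumption of this sketch, not documented MOSIP behaviour.

```python
import calendar
from datetime import date

# Default lifetimes in months, as listed above; the key-type names are illustrative.
DEFAULT_LIFETIME_MONTHS = {
    "aes-256": 6,
    "rsa-2048-encryption": 12,
    "rsa-2048-signature": 24,
}

def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def expiry_date(key_type: str, created: date) -> date:
    """Return the default expiry date for a key created on `created`."""
    return add_months(created, DEFAULT_LIFETIME_MONTHS[key_type])

created = date(2024, 8, 31)
print(expiry_date("aes-256", created))              # 2025-02-28
print(expiry_date("rsa-2048-encryption", created))  # 2025-08-31
print(expiry_date("rsa-2048-signature", created))   # 2026-08-31
```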
MOSIP uses symmetric keys to protect its database.
Every key is assigned an expiry date upon creation. The period is defined by configuration; the default is 6 months.
There are two modes of operation for key management:
Inline
In this mode, the system checks key expiry against the configuration as data is accessed:
When data is written back to the database, a new active key is used.
When data is read whose encryption key has expired, the system notifies key management that an expired key is in use and that the data has to be re-encrypted with an active key.
Batch
In this mode, the system searches all tables for data encrypted with expired keys.
It re-encrypts that data with the new active keys.
This mode is scheduled to run as needed or bi-monthly. Because the inline mode will already have re-encrypted most of the data, this avoids a huge data crunch.
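The inline mode can be sketched as two hooks, one on the read path and one on the write path. The `KeyManager` class, its methods, and the rule of choosing the active key with the latest expiry are all assumptions made for this illustration.

```python
from datetime import datetime

class KeyManager:
    """Hypothetical key-management stand-in that records re-encryption requests."""

    def __init__(self, expiry_by_key_id: dict):
        self.expiry_by_key_id = expiry_by_key_id   # key id -> expiry datetime
        self.pending_reencryption = []             # (row id, expired key id)

    def is_expired(self, key_id: int, now: datetime) -> bool:
        return self.expiry_by_key_id[key_id] <= now

    def active_key_id(self, now: datetime) -> int:
        """Return the unexpired key with the latest expiry (an assumed policy)."""
        active = {k: e for k, e in self.expiry_by_key_id.items() if e > now}
        if not active:
            raise LookupError("no active key available")
        return max(active, key=active.get)

    def notify_expired_use(self, row_id: str, key_id: int) -> None:
        self.pending_reencryption.append((row_id, key_id))

def on_read(km: KeyManager, row_id: str, key_id: int, now: datetime) -> None:
    """Inline read path: flag rows still protected by an expired key."""
    if km.is_expired(key_id, now):
        km.notify_expired_use(row_id, key_id)

def on_write(km: KeyManager, now: datetime) -> int:
    """Inline write path: always encrypt with a currently active key."""
    return km.active_key_id(now)

km = KeyManager({1: datetime(2024, 1, 1), 2: datetime(2024, 12, 31)})
now = datetime(2024, 6, 1)
on_read(km, "row-7", 1, now)   # key 1 has expired, so the row is flagged
on_read(km, "row-8", 2, now)   # key 2 is still active, nothing to do
```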
RSA 2048-bit keys are used to encrypt resident data sent from the registration client. The default expiry of these keys is 1 year.
These keys are intended to be rotated using the API. Currently, key rotation happens manually with client upgrades.
The digital signature keys are domain-specific and are used to sign the artefacts generated by MOSIP for external consumption. Countries are expected to follow their own legal standards on digital signatures. The default expiry is 2 years.
In MOSIP, authentication largely falls into the following categories:
Authentication via web channel (for Pre-Registration web app, Admin web app and Resident services portal)
Authentication via local system i.e., offline authentication (for Registration client)
In MOSIP, authorization falls into the following categories:
Authorization of APIs accessed via web channel (MOSIP is currently migrating to a Keycloak server; the documentation will be published soon)
Authorization to access specific data (will be implemented in v3)
A country will have its own hierarchy of system users, especially registration staff and system administration staff. So, instead of defining a fixed hierarchy, MOSIP by default depends on an LDAP implementation to manage users, the organizational hierarchy, and the roles of users in that hierarchy. MOSIP uses an open-source LDAP server as the LDAP implementation. Administrators can create the hierarchy and users using Apache Directory Studio.
Data-level encryption is handled in the application's DTO layer.
The key considerations for encryption in the DTO layer are as follows:
Data is classified as:
Sensitive
Non-Sensitive
The Sensitive data is encrypted in the DTO layer.
AES-256 algorithm is used for the encryption.
The classification is kept in a configuration file. The application layer reads this configuration and injects the sensitivity property into the DTO layer.
Hibernate interceptors are used to intercept the fields in the DTO layer when they are written.
When these fields are read, Hibernate interceptors are called again to decrypt the data.
Key expiration is built into the key store.
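The configuration-driven handling of sensitive fields can be sketched in a language-neutral way as follows. In MOSIP this role is played by Hibernate interceptors; the Python functions, field names, and the `SENSITIVE_FIELDS` mapping here are hypothetical. Base64 is used only as a reversible placeholder so the sketch runs with the standard library, and it is not encryption.

```python
import base64

# Hypothetical classification, as it might be loaded from a configuration file.
SENSITIVE_FIELDS = {"resident": {"name", "date_of_birth"}}

def placeholder_encrypt(value: str) -> str:
    # NOT encryption: Base64 stands in for AES-256 in this sketch.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")

def placeholder_decrypt(value: str) -> str:
    return base64.b64decode(value.encode("ascii")).decode("utf-8")

def on_save(entity: str, dto: dict) -> dict:
    """Transform only the fields classified as sensitive before persisting."""
    sensitive = SENSITIVE_FIELDS.get(entity, set())
    return {k: placeholder_encrypt(v) if k in sensitive else v for k, v in dto.items()}

def on_load(entity: str, row: dict) -> dict:
    """Reverse the transformation for sensitive fields when reading."""
    sensitive = SENSITIVE_FIELDS.get(entity, set())
    return {k: placeholder_decrypt(v) if k in sensitive else v for k, v in row.items()}

dto = {"name": "Asha", "date_of_birth": "1990-01-01", "postal_code": "560001"}
stored = on_save("resident", dto)
restored = on_load("resident", stored)
```

Keeping the classification in configuration means that a field can be made sensitive without changing the DTO classes themselves.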
The system has the following components:
Keys are stored in the "Key Store". This is a database table in which the keys are maintained along with the index.
Indexes are persisted in a separate store. When an encryption request arrives, the current index is retrieved and used to look up the encryption key. The index is stored with the encrypted data as a prefix, separated by a colon. For example, 4:sdf*)(8S@#YFLSJ&*hfdlkj23h
A scheduler runs a job at a scheduled time when re-encryption becomes necessary.
HSM devices are used to store the Master keys. These master keys are used to encrypt the keys in the Key store.
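The index prefix described above can be parsed by splitting on the first colon only, so that a colon inside the encrypted content does not break the lookup. The function name is hypothetical, and treating the index as a non-negative integer is an assumption based on the example value.

```python
def split_indexed(stored: str) -> tuple:
    """Split '<index>:<ciphertext>' on the first colon only."""
    index_text, sep, ciphertext = stored.partition(":")
    if not sep or not index_text.isdigit():
        raise ValueError("missing or invalid key index prefix")
    return int(index_text), ciphertext

index, ciphertext = split_indexed("4:sdf*)(8S@#YFLSJ&*hfdlkj23h")
print(index)       # 4
print(ciphertext)  # sdf*)(8S@#YFLSJ&*hfdlkj23h
```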
Encryption:
The properties in the entities which are supposed to be encrypted are configured in the config server.
During encryption, a listener installed in the DAO layer intercepts the incoming entity objects. Whether each property should be encrypted is determined from the config server.
The data is encrypted, prefixed with the index of the key used for the encryption, and stored in the data store.
The key itself is encrypted with the master key from HSM and stored in a separate data store.
The index is incremented if the key at the old index has expired.
Decryption:
When a request is received, the DTO fields are checked for sensitivity against the config server.
If the DTO field is sensitive, the decryption() method is called.
During decryption, the index is extracted by splitting the stored value on the delimiter. This index is used to find the key that was used for the encryption.
The key itself has to be decrypted with the master key from the HSM before the content can be decrypted.
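The encryption and decryption steps above can be sketched end to end. This is an illustration under stated assumptions only: `toy_cipher` is an insecure stand-in for AES-256 so that the example runs with the standard library, `MASTER_KEY` lives in process memory here whereas in MOSIP it stays inside the HSM, and all names are hypothetical.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES-256: XOR with a SHA-256-derived keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))

MASTER_KEY = secrets.token_bytes(32)  # assumption: held in the HSM in practice
key_store = {}                        # index -> data key wrapped by the master key

def create_key(index: int) -> None:
    """Create a data key and store it wrapped by the master key."""
    data_key = secrets.token_bytes(32)
    key_store[index] = toy_cipher(MASTER_KEY, data_key)

def encrypt(index: int, plaintext: str) -> str:
    """Unwrap the data key, encrypt, and prefix the result with the key index."""
    data_key = toy_cipher(MASTER_KEY, key_store[index])
    body = toy_cipher(data_key, plaintext.encode("utf-8")).hex()
    return f"{index}:{body}"

def decrypt(stored: str) -> str:
    """Read the index prefix, unwrap that key, then decrypt the content."""
    index_text, _, body = stored.partition(":")
    data_key = toy_cipher(MASTER_KEY, key_store[int(index_text)])
    return toy_cipher(data_key, bytes.fromhex(body)).decode("utf-8")

create_key(4)
stored = encrypt(4, "1990-01-01")
recovered = decrypt(stored)
```

The data key never appears in the key store in plain form, which is the property the HSM master key is meant to provide.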
Key rotation
On-Demand:
The keys are stored with the expiry date.
When a request comes to the system, the key is checked for expiry.
If the old key has expired, a new index is generated and persisted in the Indexes store. If no key exists in the Key Store, a new key is created for the encryption. The new key is then used for subsequent encryptions.
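A minimal sketch of this on-demand check is shown below. The `IndexStore` class is hypothetical, and generating the new index by incrementing the old one is an assumption; key material is omitted so that only the index and expiry logic is visible.

```python
from datetime import datetime, timedelta

class IndexStore:
    """Hypothetical store tracking the current key index and each key's expiry."""

    def __init__(self, lifetime: timedelta):
        self.lifetime = lifetime
        self.current = 0
        self.expiry = {}   # index -> expiry datetime of the key at that index

    def index_for_encryption(self, now: datetime) -> int:
        """Return an index whose key is unexpired, rotating if needed."""
        expires = self.expiry.get(self.current)
        if expires is not None and expires <= now:
            self.current += 1   # old key expired: move to a new index
        if self.current not in self.expiry:
            # No key exists for this index yet: create one (key bytes omitted).
            self.expiry[self.current] = now + self.lifetime
        return self.current

store = IndexStore(timedelta(days=180))
t0 = datetime(2024, 1, 1)
first = store.index_for_encryption(t0)                         # creates key 0
same = store.index_for_encryption(t0 + timedelta(days=100))    # key 0 still valid
rotated = store.index_for_encryption(t0 + timedelta(days=181)) # key 0 expired
```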
Bulk:
At times, all of the encrypted data has to be re-encrypted. A scheduler is maintained to oversee this. At the scheduled time, the encrypted data is read, re-encrypted, and saved. The newly encrypted data carries the new index in front of the encrypted content, separated by a delimiter.
Bulk mode is used to retire expired keys: the data is re-encrypted with the new key so that the expired keys can be removed.
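The bulk pass can be sketched as a loop over stored values that rewrites only those whose index prefix refers to an expired key. The function signature and the in-memory `rows` mapping are hypothetical, and the `encrypt` and `decrypt` callables in the usage lines are trivial placeholders rather than real ciphers. Failure handling and load control are not addressed in this sketch.

```python
def bulk_reencrypt(rows: dict, expired: set, new_index: int, decrypt, encrypt) -> int:
    """Re-encrypt every value whose key index is expired; return how many changed."""
    changed = 0
    for row_id, stored in list(rows.items()):
        index_text, _, body = stored.partition(":")
        index = int(index_text)
        if index not in expired:
            continue   # already protected by an active key
        plaintext = decrypt(index, body)
        rows[row_id] = f"{new_index}:{encrypt(new_index, plaintext)}"
        changed += 1
    return changed

# Placeholder ciphers that only tag the text with the key index used.
def fake_encrypt(index: int, text: str) -> str:
    return f"k{index}({text})"

def fake_decrypt(index: int, body: str) -> str:
    prefix = f"k{index}("
    if not body.startswith(prefix) or not body.endswith(")"):
        raise ValueError("content was not produced with this key index")
    return body[len(prefix):-1]

rows = {"r1": "3:k3(alice)", "r2": "5:k5(bob)"}
count = bulk_reencrypt(rows, expired={3}, new_index=5,
                       decrypt=fake_decrypt, encrypt=fake_encrypt)
```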
TODO: How do we handle failures during the bulk re-encryption?
TODO: How to handle the load, if it is extremely high?