Data purging
WSO2 Open Banking databases store data for different open banking requirements. As the volume of the data stored grows over time, It is highly recommended to perform data purging on servers to mitigate performance issues.
Based on the use cases the data in the database might grow fast sometimes and cleaning them up from the product itself becomes expensive. This page explains how to perform data purging by clearing consent and event notification related data using stored procedures. These offload the expensive data cleanups to the database server.
Tip
It is recommended to run these steps when the server traffic is low. Especially, if you are running this in a production environment for the first time, the data volume to be purged may be higher. However, consider this as a housekeeping task that needs to be run at regular intervals.
- Back up all running databases.
-
Set up the database dump in a test environment and test it for any issues.
Note
We recommend that you test the database dump before the cleanup task as the cleanup can take some time.
-
According to your database type, execute the store procedures in the
<OB_IS_ACCELERATOR_HOME>/carbon-home/dbscripts/stored-procedures
directory. -
Once the cleanup is over, start WSO2 Servers by pointing them to the cleaned-up database dump and test thoroughly for any issues.
Consent Cleanup¶
Consent cleanup is performed via the consent-cleanup.sql
script. For configurable attributes, see
the Common Configurable Attributes section.
Info
The consent-cleanup.sql
script contains a batch-wise delete. This procedure includes the cleanup of consents
from the following tables:
OB_CONSENT
OB_CONSENT_AUTH_RESOURCE
OB_CONSENT_MAPPING
OB_CONSENT_FILE
OB_CONSENT_ATTRIBUTE
OB_CONSENT_STATUS_AUDIT
-
According to your database type, compile the stored procedure in the
<OB_IS_ACCELERATOR_HOME>/carbon-home/dbscripts/stored-procedures/consent-purging-procedures/<DB-Type>/<DB-Type>-consent-cleanup-script.sql
file. -
Execute the compiled store procedure.
-
Configure according to the following:
Click here to see the Logical condition and configurations used for Consent data purging...
Consent Data Purging Parameters
Parameter Description consentTypes consent_types that are eligible for purging. For example, accounts
,payments
, to skip leave it emptyclientIds client_ids that are eligible for purging. To skip this, leave it empty. consentStatuses consent_statuses that are eligible for purging. For example, expired
,revoked
purgeConsentsOlderThanXNumberOfDays The time period to delete consents older than n days. lastUpdatedTime last_updated_time
for purging. If the consent updated time is older than this value then it's eligible for purging
To purge consents with the lastupdated_time
older than 31 days, configure the following value in dayspurgeConsentsOlderThanXNumberOfDays = 31
.
To configure exact timestamp of thelastUpdatedTime
rather than a time period, ignore configuringpurgeConsentsOlderThanXNumberOfDays = NULL
and only configurelastUpdatedTime
with the exact unix timestamp. For example,lastupdatedtime = 1660737878;
Other Parameters
Parameter Description backupTables Enable the backup table to restore data at a later stage.
- This backup tables is overwritten every time you run the cleanup script.
- This does not capture the consents that were created between the backup task and the cleanup iteration. Therefore, if consents are created after the backup task, you cannot restore them once they are deleted from the cleanup iteration.
enableAudit Set this parameter to true
to log each deleted consents in theauditlog_ob_consent_cleanup
table in order to track them in future. -
Once the stored procedure is compiled, execute the procedure with input parameters as follows:
WSO2_OB_CONSENT_CLEANUP_SP( consentTypes, clientIds, consentStatuses, purgeConsentsOlderThanXNumberOfDays, lastUpdatedTime, backupTables, enableAudit, analyzeTables ); mysql> call WSO2_OB_CONSENT_CLEANUP_SP('accounts,payments', 'clientId1,clientId2', 'expired,revoked', 31, NULL, TRUE, TRUE, TRUE);
-
For more information, see the
/stored-procedures/consent-purging-procedures/<DB-Type>/readme.MD
file for each database type.
Event Notification Cleanup¶
Event Notification cleanup is performed via the event-notification-cleanup.sql
script. For configurable attributes,
see the Common Configurable Attributes section.
Info
The event-notification-cleanup.sql
script contains a batch-wise delete. This procedure includes the cleanup of event
notification data from the following tables:
OB_NOTIFICATION
OB_NOTIFICATION_EVENT
OB_NOTIFICATION_ERROR
-
According to your database type, compile the stored procedure in the
<OB_IS_ACCELERATOR_HOME>/carbon-home/dbscripts/stored-procedures/event-notification-purging-procedures/<DB-Type>/<DB-Type>-event-notification-cleanup-script.sql
file. -
Execute the compiled store procedure.
-
Configure according to the following:
Click here to see the Logical condition and configurations used for Event Notification data purging...
Event Notification Data Purging Parameters
Parameter Description eventStatuses The event statuses that are eligible for purging. For example, ACK
,ERR
clientIds client_ids that are eligible for purging. To skip this, leave it empty. purgeEventsOlderThanXNumberOfDays The time period to delete event-notifications older than n days. lastUpdatedTime last_updated_time
for purging. If the event-notification updated time is older than this value then it's eligible for purgingpurgeNonExistingResourceIds The flag to enable purging of untraceable event notifications. If the resource_id
of theob_notification
does not exist in theob_consent table
-consent_id
, then it is untraceable and eligible for purging
Set this totrue
, to purge the untraceable event notifications irrespective of other purging parameters.
To purge event-notifications with the lastupdated_time
older than 31 days, configure the following value in dayspurgeEventsOlderThanXNumberOfDays = 31
.
To configure exact timestamp of thelastUpdatedTime
rather than a time period, ignore configuringpurgeEventsOlderThanXNumberOfDays = NULL
and only configurelastUpdatedTime
with the exact unix timestamp. For example,lastupdatedtime = 1660737878;
Other Parameters
Parameter Description backupTables Enable the backup table to restore data at a later stage.
- This backup tables is overwritten every time you run the cleanup script.
- This does not capture the event notifications that were created between the backup task and the cleanup iteration. Therefore, if event notifications are created after the backup task, you cannot restore them once they are deleted from the cleanup iteration.
enableAudit Set this parameter to true
to log each deleted consents in theauditlog_ob_event_notification_cleanup
table in order to track them in future. -
Once the stored procedure is compiled, execute the procedure with input parameters as follows:
WSO2_OB_EVENT_NOTIFICATION_CLEANUP_SP( eventStatuses, clientIds, purgeEventsOlderThanXNumberOfDays, lastUpdatedTime, purgeNonExistingResourceIds, backupTables, enableAudit, analyzeTables ); mysql> call WSO2_OB_EVENT_NOTIFICATION_CLEANUP_SP('ACK,ERR', 'clientId1,clientId2', 10, NULL, TRUE, TRUE, TRUE, TRUE);
-
For more information, see the
/stored-procedures/event-notification-purging-procedures/<DB-Type>/readme.MD
file for each database type.
Common Configurable Attributes¶
You can configure the following important attributes in the script.
Variable | Description |
---|---|
batchSize | Defines how many records will be deleted per batch for one iteration. |
chunkSize | If you have millions of data records, allows you to handle data chunk wise. A chunk handles a larger set of data than a batch and a batch processes an entire chunk. For example, if you have 20 million data records in a particular table, the chunk will initially take half a million of data and provide it into a batch where 10000 records per batch are deleted. Once that chunk is complete, the other half million records and proceeded. |
checkCount | If consents are expiring/revoking or eligible for purging when the cleanup script is running, the script will run in an endless loop. Therefore, this variable defines a safe margin for the cleanup script to complete its job if the consents eligible for deletion are less than checkCount . |
sleepTime | Define the wait time for each iteration of the batch deletes to avoid table locks. |
enableLog | To enable or disable logs. |
logLevel | To set the log levels. |
Function | Description |
---|---|
BACKUP CONSENT TABLE | This section backs up all required tables in case of restoration is to be performed. |
CREATING AUDITLOG TABLES FOR DELETING CONSENTS | This section creates the initial audit logs table for persisting the deleted consents. |
CALCULATING CONSENTS IN OB_CONSENT TABLE | This section prints the breakdown of the consents to be deleted and retained. |
BATCH DELETE TABLE | This section performs chunk and batch-wise deletion for consent data. |
REBUILDING INDEXES | As an extra step to optimize the database, this can perform an index rebuilding task for improving the performance. It is not recommended to perform this on a live system unless you have downtime as this could lock down the whole table. |
ANALYSING TABLES | As an extra step, analyze the tables to gather statistics related to the delete operation. This also improves the performance of the database. It is not recommended to perform this on a live system unless you have downtime. |
Restore Script for Consent and Event-Notification data¶
Warning
Before making any changes to the production environment, it is strongly advised to take a complete backup in case a restoration is required.
- The
<DB>-consent-restore-script.sql
and<DB>-event-notification-restore-script.sql
contain the stored procedures that are used to restore the purged consents and event notifications from the consent tables respectively. - The restoration can only be performed if the
backupTables
property is set totrue
in the data purging procedures. - This is only an immediate restoration script for the data purging procedures, therefore each execution of the
WSO2_OB_CONSENT_CLEANUP_SP
andWSO2_OB_EVENT_NOTIFICATION_CLEANUP_SP
procedures will replace the backup tables. -
You can schedule a cleanup task to automatically run after a given period of time, for example:
USE 'WSO2_OB_CONSENT_DB'; DROP EVENT IF EXISTS consent_cleanup; CREATE EVENT consent_cleanup ON SCHEDULE EVERY 1 WEEK STARTS '2015-01-01 00:00.00' DO CALL `WSO2_OB_CONSENT_DB`.WSO2_OB_CONSENT_CLEANUP_SP('accounts,payments', 'clientId1,clientId2', 'expired,revoked', 31, NULL, TRUE, TRUE, TRUE); -- 'Turn on the event_scheduler' SET GLOBAL event_scheduler = ON;
Data Retention¶
By enabling data retention, you can archive the data that is purged using the consent purging scripts. This will keep the data records such as consent and its audit log in a separate retention database for more than seven years even after purging. The data retention period can be configured according to your requirements.
Info
This is only available as a WSO2 Update from WSO2 Open Banking Identity Server Accelerator Level 3.0.0.63 onwards. For more information on updating, see Getting WSO2 Updates.
Follow the below instructions to enable data retention:
-
Create a separate database to store the retention data.
-
Execute the default consent database creation scripts in the
<OB_IS_ACCELERATOR_HOME>/carbon-home/dbscripts/open-banking/consent
folder to create tables. -
Open the
<IS_HOME>/repository/conf/deployment.toml
file. -
To enable data retention support from the WSO2 Open Banking Identity Server, add the below configurations to introduce the new datasource. Given below is an example:
By enabling data retention, you can use the[[open_banking.jdbc_retention_data_persistence_manager]] data_source.name = "WSO2OB_RET_DB" connection_verification_timeout=1 [open_banking.consent.data_retention] enabled=true [[datasource]] id="WSO2OB_RET_DB" url = ".. jdbc url for retention database.." username = "user" password = "password" driver = "com.mysql.jdbc.Driver" jmx_enable=false pool_options.maxActive = "150" pool_options.maxWait = "60000" pool_options.minIdle = "5" pool_options.testOnBorrow = true pool_options.validationQuery="SELECT 1" #Use below for oracle #validationQuery="SELECT 1 FROM DUAL" pool_options.validationInterval="30000" pool_options.defaultAutoCommit=false
fetchFromRetentionDatabase
query parameter to fetch the data from the retention database in consent admin APIs. -
Configure the consent data purging script with types of data to be archived as shown below:
SET enableDataRetentionForAuthResourceAndMapping = TRUE; SET enableDataRetentionForObConsentFile = FALSE; SET enableDataRetentionForObConsentAttribute = FALSE;6. Enable data retention as shown below: SET enableDataRetentionForObConsentStatusAudit = TRUE;
-
Enable data retention as shown below:
WSO2_OB_CONSENT_CLEANUP_SP( consentTypes, clientIds, consentStatuses, purgeConsentsOlderThanXNumberOfDays, lastUpdatedTime, backupTables, enableAudit, analyzeTables, enableDataRetention );
-
Execute the purging script.
This will create temporary retention consent tables in consent database with the
RET_
prefix and store the purged data temporarily in these tables.
These temporary retention data need to be moved to the retention database which is created in step 1. This can be done by syncing retention data in the consent database with the retention database. All the moved consents will be deleted from the temporary retention tables.
Click here to see how it is done...
This task can be accomplished in two ways:
-
By invoking an API
curl --location --request GET 'https://localhost:9446/api/openbanking/consent/admin/sync-temporary-retention-data' \ --header 'Authorization: Basic YWRtaW5Ad3NvMi5jb206d3NvMjEyMw=='
-
By configuring as a periodical job
- Open the
<IS_HOME>/repository/conf/deployment.toml
file. - Add the below configurations:
[open_banking.consent.data_retention] enabled=true enable_db_sync_periodical_job=true db_sync_cron_value="0 0/2 * ? * * *"
- Open the
Use these purging scripts in the retention database to purge consents in the retention database by setting the last update time to more than seven years. For example:
WSO2_OB_CONSENT_CLEANUP_SP('', '', '', 31*12*7,NULL,FALSE,FALSE,TRUE,FALSE);
The archived data can be extracted using the consent search API, consent status audit search API, and consent file search API by sending the additional query parameter fetchFromRetentionDatabase
set to true
.
Click here to see some examples...
Given below are some examples:
Consent search API:
curl --location --request GET 'https://localhost:9446/api/openbanking/consent/admin/search/?fromTime=1638993429&toTime=1638993429&consentStatuses=valid&[email protected]@carbon.super&consentIDs=234ba17f-c3ac-4493-9049-d71f99c36dc2&clientIDs=V92mG4mQwiMqSE8oWS5_VvaVgb4a&limit=25&offset=0&fetchFromRetentionDatabase=true&consentTypes=accounts' \
--header 'Authorization: Basic YWRtaW5Ad3NvMi5jb206d3NvMjEyMw==' \
--data-raw ''
Consent status audit search API:
curl --location --request GET 'https://localhost:9446/api/openbanking/consent/admin/search/consent-status-audit?limit=2&offset=10&consentIDs=d239d1be-4e81-46d6-8a2e-76345370dbf3,f18b6ce1-15f7-44f7-9633-4e66fa6ceb88&fetchFromRetentionDatabase=true' \
--header 'Authorization: Basic YWRtaW5Ad3NvMi5jb206d3NvMjEyMw==' \
--data-raw ''
Consent file search API:
curl --location --request GET 'https://localhost:9446/api/openbanking/consent/admin/search/consent-file?consentId=05ac14f9-b980-45c1-913c-614cb1469a90' \
--header 'Authorization: Basic YWRtaW5Ad3NvMi5jb206d3NvMjEyMw==' \
--data-raw ''