Solution Overview
Delphix Continuous Compliance, part of the Delphix DevOps Data Platform, provides a comprehensive approach to data masking that meets enterprise-class performance, scalability, and security requirements. Delphix enables businesses to protect sensitive data through these key steps:
Discover: Sensitive Data
Identify sensitive information such as names, email addresses, and payment information to provide an enterprise-wide view of risk and to pinpoint targets for masking.
Mask: Continuous Compliance
Apply masking to transform sensitive data values into fictitious yet realistic equivalents, while still preserving the business value and referential integrity of the data for use cases such as development and testing. Unlike approaches that leverage encryption, masking not only ensures that transformed data is still usable in non-production environments, but also entails an irreversible process. It prevents original data from being restored through decryption keys or other means (conforming with GDPR, CCPA, HIPAA and many other standards).
Provision: Scaling and Integration
Extend the solution to meet enterprise security requirements and integrate into critical workflows.
Taken together, these capabilities allow you to define, manage, and apply security policies from a single point of control across large, complex data estates. Delphix can enable global operations with support for international addresses and character sets. Moreover, Delphix masking can be configured and deployed quickly through GUI-driven workflows, without specialized programming expertise or lengthy services engagements.
I personally rate this platform as a very effective solution for anonymizing large data volumes and for keeping the masking setup automated.
Discover Sensitive Data
After connecting to a supported data source, Delphix identifies what data should be secured. Sensitive data discovery is performed using two different methods, column level discovery and data level discovery:
- Column Level Discovery
Column level discovery uses regular expressions (regex) to scan the metadata (column names) of the selected data sources. There are several dozen pre-configured profile expressions designed to identify common sensitive data types (Social Security numbers, names, addresses, etc.). Users can also write their own profile regular expressions.
Example:
First Name Expression <(?>(fi?rst)?(na?me?)|f?name)(?!\w*ID)>
- Data Level Discovery
Data level discovery also uses regex, but to scan the actual data instead of the metadata. Similar to column level profiling, there are several dozen pre-configured expressions and users can add their own.
Example:
US Phone No. Expression < (((?\b[0-9]{3})?[-. ]?[0-9]{3}[-. ]?[0-9]{4}\b)(?
…
Delphix comes prepackaged with over 90 profile expressions. They were developed after validation with dozens of very large customers around the world and help businesses discover more than 25 types of sensitive data (account numbers, addresses, names, etc.) using both column and data level discovery.
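To make the difference between the two discovery methods concrete, here is a minimal Python sketch. It is not the Delphix profiler: the patterns are simplified stand-ins written for this example (not the shipped profile expressions), and the function names, the domain labels, and the 80% match threshold are assumptions.

```python
import re

# Hypothetical, simplified profile expressions for illustration only;
# they are NOT the expressions shipped with Delphix.
COLUMN_PROFILES = {
    "FIRST_NAME": re.compile(r"^(fi?rst_?)?na?me?$|^f_?name$", re.IGNORECASE),
    "EMAIL": re.compile(r"e?_?mail", re.IGNORECASE),
}
DATA_PROFILES = {
    "US_PHONE": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def profile_columns(column_names):
    """Column level discovery: match patterns against the metadata only."""
    hits = {}
    for name in column_names:
        for domain, pattern in COLUMN_PROFILES.items():
            if pattern.search(name):
                hits[name] = domain
                break
    return hits


def profile_data(rows, threshold=0.8):
    """Data level discovery: match patterns against sampled values.

    A column is flagged when at least `threshold` of its non-null
    values match one of the data patterns (threshold is an assumption).
    """
    hits = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [str(r[col]) for r in rows if r[col] is not None]
        if not values:
            continue
        for domain, pattern in DATA_PROFILES.items():
            matched = sum(1 for v in values if pattern.match(v))
            if matched / len(values) >= threshold:
                hits[col] = domain
                break
    return hits


sample_rows = [
    {"contact": "555-123-4567", "note": "hello"},
    {"contact": "(555) 123 4567", "note": "call back later"},
]
print(profile_columns(["first_name", "fname", "contact", "order_id"]))
print(profile_data(sample_rows))
```

In this sketch the column named "contact" is missed by the column level scan but caught by the data level scan, which is why the two methods are typically used together.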
Securing Sensitive Data
Delphix’s primary method for securing data is masking. Masking algorithms create a structurally similar but fictitious version of the data that can be used for purposes such as application development and testing. Masking protects the actual sensitive information while generating a functional substitute for occasions when the real data is not required.
- Delphix Masking – Is Irreversible
Masked data cannot be “reverse engineered” and restored to its original unmasked state.
- Creates Results Representative of the Source Data
The output of Delphix masking resembles production data for non-production purposes. This could include geographic distributions, credit card distributions (e.g. leaving the first 4 digits unchanged, but scrambling the rest), or maintaining human readability of (fake) names and addresses.
- Preserves Referential Integrity
Delphix has the ability to mask data consistently to maintain referential integrity. If an account number is a primary key and scrambled as part of masking, then all instances of that account number linked through key pairs will be masked identically. Additionally, the Delphix platform scales horizontally so that masking algorithms will preserve referential integrity across multiple, heterogeneous data sources.
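The idea of consistent masking can be sketched in a few lines of Python. This is an illustration under assumptions, not the Delphix algorithm: the replacement list, the key, and the function name `mask_name` are hypothetical. A keyed hash picks the replacement, so the same input always yields the same output wherever it appears.

```python
import hashlib
import hmac

# Hypothetical replacement list; a real lookup list would be much larger.
FAKE_FIRST_NAMES = ["Bob", "Alice", "Carol", "Dave", "Erin", "Frank"]


def mask_name(value: str, key: bytes) -> str:
    """Deterministically map a name to a fictitious one using a keyed hash."""
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_FIRST_NAMES)
    return FAKE_FIRST_NAMES[index]


key = b"example-key-not-for-production"  # assumption: one key shared by all jobs

# Two tables (possibly in different databases) that reference the same person.
customers = [{"id": 1, "first_name": "George"}]
orders = [{"order_id": 10, "customer_first_name": "George"}]

masked_customers = [
    {**row, "first_name": mask_name(row["first_name"], key)} for row in customers
]
masked_orders = [
    {**row, "customer_first_name": mask_name(row["customer_first_name"], key)}
    for row in orders
]

# The masked values still line up across both tables.
print(masked_customers[0]["first_name"] == masked_orders[0]["customer_first_name"])
```

A lookup of this kind is many-to-one, so different inputs can receive the same replacement. That is acceptable for names but not for key columns, which need a collision-free mapping.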
Mask or Tokenize
Transform sensitive data to comply with privacy regulations in two ways:
- Irreversibly mask data for non-production environments – or –
- Tokenize data so that authorized teams can reverse the transformation
Key benefits
- Single solution for both masking and tokenization
- Masking completely and irreversibly neutralizes compliance risks in non-production environments
- Tokenization enables use cases requiring secure collaboration with third parties
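The contrast with irreversible masking can be illustrated with a toy token vault in Python. This is a generic vault-style sketch, not a description of how Delphix implements tokenization; the class name `TokenVault` and the "tok_" prefix are assumptions made for the example.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault.

    A real deployment would persist this mapping and protect it
    strictly, because whoever holds it can reverse every token.
    """

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value is tokenized consistently.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._value_by_token:  # extremely unlikely collision
            token = "tok_" + secrets.token_hex(8)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only possible with access to the vault; masked data has no such path.
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("4111 1111 1111 1111")
print(token != "4111 1111 1111 1111")
print(vault.detokenize(token) == "4111 1111 1111 1111")
```

The token can be shared with a third party, while the ability to recover the original value stays with whoever controls the vault.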
2 Examples of Predefined Algorithms
Date Shift Framework
This algorithm masks date values to different dates based on a specified range around the input value. Masked values are calculated algorithmically using the algorithm's key, so rekeying the algorithm will cause different outputs to be generated for each input. All valid input values will be masked to a new value, and the new value will never match the input.
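A keyed date shift with these properties could look roughly like the following Python sketch. It is an illustration only, not the Delphix implementation: the function name `shift_date`, the use of HMAC-SHA256, and the default range of 30 days are assumptions.

```python
import hashlib
import hmac
from datetime import date, timedelta


def shift_date(value: date, key: bytes, max_days: int = 30) -> date:
    """Shift a date by a keyed, deterministic, non-zero offset.

    The offset lies in [-max_days, -1] or [1, max_days], so the result
    never equals the input. max_days must be at least 1.
    """
    digest = hmac.new(key, value.isoformat().encode("ascii"), hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big") % (2 * max_days)  # 0 .. 2*max_days - 1
    offset = n - max_days                                    # -max_days .. max_days - 1
    if offset >= 0:
        offset += 1                                          # skip zero: 1 .. max_days
    return value + timedelta(days=offset)


original = date(2024, 3, 15)
masked = shift_date(original, b"key-one")
print(masked != original)                          # always a different date
print(abs((masked - original).days) <= 30)         # stays inside the range
print(shift_date(original, b"key-one") == masked)  # same key, same output
```

Because the offset is derived from the key, changing the key changes the offsets. This sketch does not handle dates so close to the minimum or maximum supported date that the shift would overflow.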
Segment Mapping Algorithm Framework
Segment mapping algorithms produce no overlaps or repetitions in the masked data. They let users create unique masked values by dividing a target value into a maximum of 36 segments and masking each segment individually. Businesses might use this method for information involving unique values, such as Social Security numbers, primary key columns, or foreign key columns. Segment mapping handles strings of a known format and preserves referential integrity.
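The reason segment-wise masking avoids collisions is that each segment is mapped through a one-to-one function. The Python sketch below illustrates that idea for a Social Security number style value with segments of 3, 2, and 4 digits. It is a simplified illustration, not the Delphix algorithm: the function names, the key-seeded shuffle, and the fixed "-" separator are assumptions.

```python
import hashlib
import random


def _segment_permutation(key: bytes, index: int, width: int) -> list:
    """Build a keyed permutation of all values a segment of `width` digits can take."""
    seed = int.from_bytes(hashlib.sha256(key + index.to_bytes(2, "big")).digest(), "big")
    values = list(range(10 ** width))
    random.Random(seed).shuffle(values)  # deterministic for a given key and index
    return values


def build_segment_mapper(key: bytes, widths):
    """Return a function that masks each segment via its own permutation."""
    tables = [_segment_permutation(key, i, w) for i, w in enumerate(widths)]

    def mask(value: str) -> str:
        digits = value.replace("-", "")
        if len(digits) != sum(widths) or not digits.isdigit():
            raise ValueError("value does not match the expected format")
        out, pos = [], 0
        for width, table in zip(widths, tables):
            segment = int(digits[pos:pos + width])
            out.append(str(table[segment]).zfill(width))
            pos += width
        return "-".join(out)

    return mask


mask_ssn = build_segment_mapper(b"example-key", [3, 2, 4])
inputs = ["123-45-6789", "123-45-6790", "987-65-4321"]
outputs = [mask_ssn(v) for v in inputs]
print(len(set(outputs)) == len(inputs))  # distinct inputs stay distinct
```

Since every segment table is a permutation, two different inputs can never produce the same output, which is what makes this approach suitable for primary and foreign key columns. A plain shuffle may map an individual segment to itself, so a production algorithm would need extra rules if every value must change.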
Executing Masking Jobs
Masking jobs are created via a GUI-driven workflow. The user selects a target database, the algorithms to use based on the profiling results, and the resources allocated to the job. Optionally, SQL statements can be run before and/or after execution of the job.
Delphix can process and output masked data values in two different ways:
In-Place Masking (same Database)
An instance of Delphix Continuous Compliance will read data from a source, secure the data within the engine and then update the data source with the secure data. In-place masking only transforms the columns flagged as containing sensitive information, leaving the other columns alone. Since this method potentially requires copying production data into a non-production zone while the masking takes place, sensitive data might exist in the non-production zone until the masking is complete.
On-the-Fly Masking (2 Databases)
Delphix reads data from the data source, secures the data in the engine, and then places the secure data in a target (different from the location of the original data source) in an Extract Transform Load (ETL) process. Delphix extracts the data from a source environment, such as a production copy, gold copy, or disaster recovery copy (it reads only from a database, not from an archived file). It masks the data in the memory of the application server on which it resides and then loads the masked data to the target environment. Delphix does not modify the original source data; only the target data changes.
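The two execution modes can be compared with a small Python sketch that uses in-memory SQLite databases. It only illustrates the data flow and is not how the Delphix engine is implemented: the table layout, the function names, and the placeholder `mask_email` rule are assumptions.

```python
import hashlib
import sqlite3


def mask_email(value: str) -> str:
    """Placeholder rule: an unkeyed hash is for illustration, not a secure masking algorithm."""
    return "user_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:8] + "@example.com"


def make_source() -> sqlite3.Connection:
    """Create a small example database with one sensitive column (email)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, plan TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "gold", "george@corp.test"), (2, "basic", "anna@corp.test")],
    )
    conn.commit()
    return conn


def mask_in_place(conn: sqlite3.Connection) -> None:
    """In-place: read the flagged column, mask it, update the same database."""
    rows = conn.execute("SELECT id, email FROM customers").fetchall()
    conn.executemany(
        "UPDATE customers SET email = ? WHERE id = ?",
        [(mask_email(email), row_id) for row_id, email in rows],
    )
    conn.commit()


def mask_on_the_fly(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """On-the-fly: extract from source, mask in memory, load into a separate target."""
    target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, plan TEXT, email TEXT)")
    rows = source.execute("SELECT id, plan, email FROM customers").fetchall()
    target.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(row_id, plan, mask_email(email)) for row_id, plan, email in rows],
    )
    target.commit()


source = make_source()
target = sqlite3.connect(":memory:")
mask_on_the_fly(source, target)
print(source.execute("SELECT email FROM customers WHERE id = 1").fetchone()[0])  # unchanged
print(target.execute("SELECT email FROM customers WHERE id = 1").fetchone()[0])  # masked
```

With on-the-fly masking the unmasked values never land in the target, whereas in-place masking starts from a copy that still contains them until the job has finished.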
Performance of Delphix Continuous Compliance
Key variables that influence masking performance are the number of tables to be masked, rows per table, columns per table, masking algorithm per column, data type, average size per column, as well as indexes, constraints, and triggers on the masked table. The connected platforms (on-premises or cloud) also matter, because the resources assigned to them limit the data throughput of the whole masking process.
Steps to Reach Compliance with Data Masking
- Execute Profiling Jobs
Automatically identify most sensitive data fields by analysing database structure names and pattern matching the field content.
- Adapt Masking
While the predefined out-of-the-box algorithms are useful in many cases, some of your database columns will have special masking needs, so their algorithm assignments have to be adapted after the automated recognition.
- Execute Masking Jobs
Transform the original data into masked data: the masking process always replaces the same name with the same fictitious string across all data sources (in this example, “George” becomes “Bob” everywhere).
- Test the Masked Data with Your Application
Make sure that your application can still access the database without errors although the relevant fields have been masked.
Summary of Delphix Continuous Compliance
Fast and Automated Masking
Delphix Continuous Compliance identifies sensitive information and automates data masking wherever data resides — from mainframes to modern cloud platforms. Unlike traditional solutions which take months to implement, dbi services implements Continuous Compliance for their customers within days. 👍
Combine Speed and Compliance
Delphix provides an API-driven data platform to configure all data masking needs. Continuous Compliance includes data profiling to identify PII or sensitive data and templates to automatically mask or tokenize data. Delphix masking replaces data at risk with fictitious data while preserving referential integrity.
About Delphix
Delphix is the industry leader for DevOps test data management. Businesses need to transform application delivery but struggle to balance speed with data security and compliance. The Delphix DevOps Data Platform automates data security, while rapidly deploying test data to accelerate application releases. With Delphix, customers modernize applications, adopt multi-cloud, achieve CI/CD, and recover from downtime events such as ransomware much faster.
Leading companies around the world use Delphix to accelerate digital transformation and enable zero trust data management.