Sensitive data (names, emails, phone numbers, personal identifiers…) should not be freely exposed outside production. When you refresh a production database to a test or staging environment, or when analysts need access to real-looking data, anonymization becomes critical.
The PostgreSQL Anonymizer extension (often called anon) is an open‑source extension that lets you mask, fake, shuffle, or generalize data directly inside PostgreSQL, using simple SQL rules. This article explains what it is, how it works, how to install it on PostgreSQL 18.1 running on Red Hat Enterprise Linux 10.1, and how to use it with clear, beginner‑friendly examples.
The target audience is everyone, but especially beginner DBAs who want a practical, command‑line–focused introduction.
I. What is PostgreSQL Anonymizer?
PostgreSQL Anonymizer is an extension that helps protect sensitive data by replacing it with fake or obfuscated values.
Instead of exporting data and anonymizing it with scripts, the rules live inside the database itself. You declare how a column should be anonymized, and PostgreSQL applies that rule automatically.
Typical use cases:
- Refreshing production data into test / staging environments
- Giving developers or analysts access to realistic but non‑sensitive data
- Producing anonymized database dumps for external sharing
- Helping comply with GDPR and other privacy regulations
The extension supports three main approaches:
- Static masking – permanently replaces data in tables
- Dynamic masking – masks data on the fly for specific users
- Anonymous dumps – exports an already anonymized dump
II. How does it work ?
PostgreSQL Anonymizer uses PostgreSQL’s security labels mechanism. You attach a label to a column that says: “When this data is anonymized, use this function.”
Example:
SECURITY LABEL FOR anon ON customer.email IS 'MASKED WITH FUNCTION anon.fake_email()';
Once declared:
- Static masking rewrites the table using those rules
- Dynamic masking rewrites query results for masked users
- Dumps automatically apply the same rules
The rules stay attached to the schema, not to scripts or applications.
III. Installing PostgreSQL Anonymizer on RHEL 10.1 (PostgreSQL 18.1)
In this guide, PostgreSQL 18.1 is already installed following dbi services standard. PostgreSQL binaries are located in:
- /u01/app/postgres/product/18/db_1/bin
The PostgreSQL data directory (PGDATA) is:
- /u02/pgdata/18/demo-cluster
We will only focus on installing and enabling the PostgreSQL Anonymizer extension.
1. Install the Anonymizer extension
Since we are installing from source, we’ll start by installing the necessary prerequisites. We’ll use cargo to handle the PGRX system requirements. You can find the official documentation here, but keep in mind that I’ve updated the commands to reflect a more recent version.
#Cargo install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
rustc --version
cargo --version
# postgresql_anonymizer install
cargo install cargo-pgrx --version 0.16.1 --locked
cargo pgrx init --pg18 /u01/app/postgres/product/18/db_1/bin/pg_config
git clone https://gitlab.com/dalibo/postgresql_anonymizer.git
cd postgresql_anonymizer/
make extension PG_CONFIG=/u01/app/postgres/product/18/db_1/bin/pg_config PGVER=pg18
sudo make install PG_CONFIG=/u01/app/postgres/product/18/db_1/bin/pg_config PGVER=pg18
2. Enable Anonymizer in postgresql.conf
Because PostgreSQL Anonymizer hooks into query execution, it must be loaded at session start. Edit the configuration file:
vi /u02/pgdata/18/demo-cluster/postgresql.conf
Add or update:
session_preload_libraries = 'anon'
Restart PostgreSQL:
pg_ctl -D /u02/pgdata/18/demo-cluster restart
3. Install and enable the Anonymizer extension
Connect as the PostgreSQL superuser:
psql -U postgres
Create a database:
CREATE DATABASE anonymizer_demo;
Create and initialize the extension:
postgres=# \c anonymizer_demo
You are now connected to database "anonymizer_demo" as user "postgres".
anonymizer_demo=# CREATE EXTENSION anon CASCADE;
CREATE EXTENSION
anonymizer_demo=# \dx
List of installed extensions
Name | Version | Default version | Schema | Description
---------+---------+-----------------+------------+---------------------------------------------
anon | 3.0.0 | 3.0.0 | public | Anonymization & Data Masking for PostgreSQL
plpgsql | 1.0 | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)
anonymizer_demo=# SELECT anon.init();
init
------
t
(1 row)
anon.init() loads fake data dictionaries (names, companies, cities, etc.) used by anonymization functions.
IV. Demo: anonymizing a simple table
For this demo, I will keep it simple and create everything inside the postgres database and default schema, but I recommend you to follow the best practices and use a dedicated database, user and schema.
Create sample data:
CREATE TABLE customer (
id SERIAL PRIMARY KEY,
first_name TEXT,
last_name TEXT,
birthdate DATE,
email TEXT,
company TEXT
);
INSERT INTO customer (first_name, last_name, birthdate, email, company) VALUES
('Alice', 'Martin', '1987-02-14', '[email protected]', 'Acme Corp'),
('Bob', 'Dupont', '1979-11-03', '[email protected]', 'Globex');
anonymizer_demo=# select * from customer;
id | full_name | birthdate | email | company
----+--------------+------------+--------------------------+-----------
1 | Alice Martin | 1987-02-14 | [email protected] | Acme Corp
2 | Bob Dupont | 1979-11-03 | [email protected] | Globex
(2 rows)
1. Static masking (permanent anonymization)
Static masking is a “fire and forget” approach where the original sensitive data is physically overwritten on the disk with faked or scrambled values. This process is destructive. Once the data is masked, the original values are gone forever. This should only be performed on non-production environments (like Staging or Dev) or on a backup copy of your database.
Before applying any rules, you must explicitly enable the static masking engine at both the database and role levels. This acts as a safety switch to prevent accidental data loss.
-- Enable the extension for the current database
anonymizer_demo=# ALTER DATABASE anonymizer_demo SET anon.static_masking = TRUE;
ALTER DATABASE
-- Grant the postgres user permission to execute static masking operations
anonymizer_demo=# ALTER ROLE postgres SET anon.static_masking = TRUE;
ALTER ROLE
Next, you define how the data should be transformed. We use SECURITY LABEL to attach masking logic to specific columns. This doesn’t change the data yet; it simply tells the anon extension which functions to use during the anonymization process.
-- Replace names with realistic dummy values
anonymizer_demo=# SECURITY LABEL FOR anon ON COLUMN customer.first_name IS 'MASKED WITH FUNCTION anon.dummy_first_name()';
SECURITY LABEL
anonymizer_demo=# SECURITY LABEL FOR anon ON COLUMN customer.last_name IS 'MASKED WITH FUNCTION anon.dummy_last_name()';
SECURITY LABEL
-- Generate a random date within a specific age range (1950-2000)
anonymizer_demo=# SECURITY LABEL FOR anon ON column customer.birthdate IS 'MASKED WITH FUNCTION anon.random_date_between(''1950-01-01'', ''2000-12-31'')';
SECURITY LABEL
-- Generate syntactically correct but fake emails and company names
anonymizer_demo=# SECURITY LABEL FOR anon ON column customer.email IS 'MASKED WITH FUNCTION anon.fake_email()';
SECURITY LABEL
anonymizer_demo=# SECURITY LABEL FOR anon ON column customer.company IS 'MASKED WITH FUNCTION anon.fake_company()';
SECURITY LABEL
This is the final execution step. Running anonymize_database() triggers the engine to scan your rules and overwrite the table data globally. Depending on the size of your database, this may take some time as it performs UPDATE operations on the disk.
anonymizer_demo=# SELECT anon.anonymize_database();
anonymize_database
--------------------
t
(1 row)
The data rules have now been applied and the data have been anonymized. You can check the result with the following query:
anonymizer_demo=# SELECT * FROM customer;
id | birthdate | email | company | first_name | last_name
----+------------+----------------------+----------------+------------+------------
1 | 1960-11-30 | [email protected] | Brown and Sons | Chyna | Mertz
2 | 1997-09-08 | [email protected] | Rice PLC | Damien | Williamson
(2 rows)
You can now turn off static masking to continue with the next demo, dynamic masking:
anonymizer_demo=# ALTER SYSTEM SET anon.static_masking TO off;
ALTER SYSTEM
anonymizer_demo=# ALTER ROLE postgres SET anon.static_masking TO off;
ALTER ROLE
2. Dynamic masking
Dynamic masking allows you to hide sensitive information from specific users (like developers or analysts) while preserving the original data for administrators or the application itself. The masking happens in memory at the moment the query is executed.
First, we tell the database to activate the transparent masking engine. This allows the anon extension to intercept queries from specific roles and apply masking rules before the results are returned.
anonymizer_demo=# ALTER DATABASE anonymizer_demo SET anon.transparent_dynamic_masking = TRUE;
ALTER DATABASE
To see show case masking in action, we will use two types of users: a masked user who sees fake data, and an unmasked user (like the superuser) who sees the actual data stored on disk.
In this step, we create demo_user and “tag” them with a security label that forces the masking engine to engage whenever they log in.
anonymizer_demo=# CREATE ROLE demo_user LOGIN;
CREATE ROLE
anonymizer_demo=# SECURITY LABEL FOR anon ON ROLE demo_user IS 'MASKED';
SECURITY LABEL
anonymizer_demo=# GRANT pg_read_all_data to demo_user;
GRANT ROLE
anonymizer_demo=# SECURITY LABEL FOR anon ON ROLE demo_user IS 'MASKED';
SECURITY LABEL
-- As PostgreSQL user:
-- Ensure the postgres user remains unmasked so we can see the 'real' data
anonymizer_demo=# SECURITY LABEL FOR anon ON ROLE postgres IS NULL;
SECURITY LABEL
We don’t need to redefine our masking rules (for names, emails, etc.) because the SECURITY LABEL definitions we created in the static masking section are still stored in the database schema.
Watch what happens when we query the table as demo_user:
anonymizer_demo=> select * from customer;
id | birthdate | email | company | first_name | last_name
----+------------+--------------------------+------------------+------------+-----------
1 | 1957-02-07 | [email protected] | Fernandez-Tucker | Emelie | Rohan
2 | 1950-05-15 | [email protected] | Leonard Group | Jane | Durgan
(2 rows)
If we run the exact same command again, the output changes:
anonymizer_demo=> select * from customer;
id | birthdate | email | company | first_name | last_name
----+------------+------------------------+---------------+------------+------------
1 | 1963-12-26 | [email protected] | Simpson Group | Tod | Balistreri
2 | 1971-09-14 | [email protected] | Davis-Hardin | Sonny | Wintheiser
(2 rows)
Because the data is being generated “on-the-fly” by the masking functions, the results are dynamic. Each request produces a fresh set of data.
Now, let’s switch back to the postgres user. Since we removed the MASKED label from this role, the engine steps aside and shows us the actual data residing on the disk (which, in this case, is the data we masked statically in the previous step).
anonymizer_demo=# SELECT * FROM customer;
id | birthdate | email | company | first_name | last_name
----+------------+----------------------+----------------+------------+------------
1 | 1960-11-30 | [email protected] | Brown and Sons | Chyna | Mertz
2 | 1997-09-08 | [email protected] | Rice PLC | Damien | Williamson
(2 rows)
3. Anonymized dump
An Anonymized Dump allows you to export your database into a .sql file where the sensitive data is already replaced by fake values. This is incredibly powerful because you can create a “safe” backup that can be shared with developers or consultants without ever giving them access to your live production server.
Rather than using a superuser, we create a specific role whose sole purpose is to retrieve the masked version of the data during the export process.
-- Create a specialized user for the dump process
anonymizer_demo=# CREATE ROLE demo_ano_dumper LOGIN PASSWORD 'secret';
CREATE ROLE
-- Force the masking engine to be active for this user session
anonymizer_demo=# ALTER ROLE demo_ano_dumper SET anon.transparent_dynamic_masking = TRUE;
ALTER ROLE
-- Apply the MASKED label to the role
anonymizer_demo=# SECURITY LABEL FOR anon ON ROLE demo_ano_dumper IS 'MASKED';
SECURITY LABEL
-- Grant permission to read all tables
anonymizer_demo=# GRANT pg_read_all_data TO demo_ano_dumper;
GRANT
Now, we use the standard pg_dump utility. Because we are logging in as demo_ano_dumper, the anon extension intercepts the data export on-the-fly. We use a few specific flags to ensure the resulting file is clean and doesn’t contain the masking logic itself:
/u01/app/postgres/product/18/db_1/bin/pg_dump anonymizer_demo --username=demo_ano_dumper --password --no-security-labels --exclude-extension="anon" --file=anonymized_dump.sql
--no-security-labels: Prevents the “MASKED” tags from being exported (the new database doesn’t need to know how the data was masked).
--exclude-extension="anon": Ensures the recipient doesn’t need the anon extension installed to restore the file.
If you open the generated anonymized_dump.sql file in a text editor, you will see that the COPY commands contain the fake data, not the original sensitive information.
--
-- Data for Name: customer; Type: TABLE DATA; Schema: public; Owner: postgres
--
COPY public.customer (id, birthdate, email, company, first_name, last_name) FROM stdin;
1 1980-12-17 [email protected] Carr-Doyle Katharina Shanahan
2 1998-03-27 [email protected] Pollard and Sons Ayla Spencer
\.
V. Conclusion
In modern development, the goal is to work with realistic data without the real-world risk. PostgreSQL Anonymizer bridges this gap by allowing you to transform sensitive production information into safe, functional datasets.
Now that we’ve explained how to use pg_anonymizer and covered all three methods, here is a quick guide on when to use each:
- Static Masking: Best for “cleaning” a staging database after a production refresh.
- Dynamic Masking: Best for internal users (DBAs, support staff) who need to work on the live database but shouldn’t see production data.
- Anonymized Dump: Best for sharing data with external partners or creating local development environments.
In the end, whether you choose Static, Dynamic, or Dump masking, the benefits remain the same:
- Utility: Because the data is masked with realistic functions (like
fake_emailordummy_first_name), your application logic—like email validation or UI layout—still works perfectly. - Compliance: Meet GDPR, HIPAA, and internal security requirements by default.
- Safety: Developers and analysts can work on real-world bugs and features without ever seeing a customer’s actual PII (Personally Identifiable Information).
I hope you enjoyed this guide and found these examples clear and easy to follow! My goal was to show that data privacy doesn’t have to be painful for your development workflow. Don’t forget to follow the extension latest news on the official website: https://postgresql-anonymizer.readthedocs.io/en/latest/
If you have any questions about these commands or how to implement them in your own environment, feel free to reach out 🙂