dbi Blog

A very, very long time ago I’ve written a blog about cstore_fdw, which brings columnar storage to PostgreSQL. This is now part of Citus and does not anymore come as a separate extension. While this still can be used there is another option if you’re looking for columnar storage in PostgreSQL, and this is pg_mooncake. The goal of this extension is to optimize PostgreSQL for analytics and stores the tables in Iceberg or Delta Lake format, either on local disk or on cloud storage such as s3.

Getting the extension onto the system is pretty simple and straight forward (as you can see below it uses DuckDB in the background), but it will take some time to compile and you need to have Cargo installed for this to succeed:

postgres@pgbox:/home/postgres/ [172] which pg_config
/u01/app/postgres/product/17/db_2/bin/pg_config
postgres@pgbox:/home/postgres/ [172] git clone https://github.com/Mooncake-Labs/pg_mooncake.git
Cloning into 'pg_mooncake'...
remote: Enumerating objects: 1085, done.
remote: Counting objects: 100% (533/533), done.
remote: Compressing objects: 100% (250/250), done.
remote: Total 1085 (delta 406), reused 284 (delta 283), pack-reused 552 (from 2)
Receiving objects: 100% (1085/1085), 728.27 KiB | 3.98 MiB/s, done.
Resolving deltas: 100% (631/631), done.
postgres@pgbox:/home/postgres/ [172] cd pg_mooncake/
postgres@pgbox:/home/postgres/pg_mooncake/ [172] git submodule update --init --recursive
Submodule 'third_party/duckdb' (https://github.com/duckdb/duckdb.git) registered for path 'third_party/duckdb'
Cloning into '/home/postgres/pg_mooncake/third_party/duckdb'...
Submodule path 'third_party/duckdb': checked out '19864453f7d0ed095256d848b46e7b8630989bac'
postgres@pgbox:/home/postgres/pg_mooncake/ [172] make release -j2
...
[ 23%] Building CXX object src/common/types/column/CMakeFiles/duckdb_common_types_column.dir/ub_duckdb_common_types_column.cpp.o
[ 23%] Built target duckdb_common_types_column
[ 23%] Building CXX object src/common/types/row/CMakeFiles/duckdb_common_types_row.dir/ub_duckdb_common_types_row.cpp.o
[ 23%] Built target duckdb_common_types_row
[ 23%] Building CXX object src/common/value_operations/CMakeFiles/duckdb_value_operations.dir/ub_duckdb_value_operations.cpp.o
[ 23%] Built target duckdb_common_types
...
postgres@pgbox:/home/postgres/pg_mooncake/ [172] make install
/usr/bin/mkdir -p '/u01/app/postgres/product/17/db_2/lib'
/usr/bin/mkdir -p '/u01/app/postgres/product/17/db_2/share/extension'
/usr/bin/mkdir -p '/u01/app/postgres/product/17/db_2/share/extension'
/usr/bin/mkdir -p '/u01/app/postgres/product/17/db_2/lib'
/usr/bin/install -c -m 755  pg_mooncake.so '/u01/app/postgres/product/17/db_2/lib/pg_mooncake.so'
/usr/bin/install -c -m 644 .//../../pg_mooncake.control '/u01/app/postgres/product/17/db_2/share/extension/'
/usr/bin/install -c -m 644 .//../../sql/pg_mooncake--0.1.0.sql .//../../sql/pg_mooncake--0.1.0--0.1.1.sql .//../../sql/pg_mooncake--0.1.1--0.1.2.sql  '/u01/app/postgres/product/17/db_2/share/extension/'
/usr/bin/install -c -m 755  libduckdb.so '/u01/app/postgres/product/17/db_2/lib/'

On this is compiled and installed simply add the extension to a database:

postgres=$ create extension pg_mooncake;
CREATE EXTENSION
postgres=$ \dx
                                            List of installed extensions
        Name        | Version |   Schema   |                              Description                               
--------------------+---------+------------+------------------------------------------------------------------------
 pg_mooncake        | 0.1.2   | public     | Columnstore Table in Postgres
 pg_stat_statements | 1.11    | public     | track planning and execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

To compare this table layout against the standard PostgreSQL layout, let’s create two tables, one of them using the column store format:

postgres=# create table t1 ( a int, b text );
CREATE TABLE
postgres=# create table t2 ( a int, b text ) using columnstore;
CREATE TABLE
postgres=# \d+ t1
                                            Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | integer |           |          |         | plain    |             |              | 
 b      | text    |           |          |         | extended |             |              | 
Access method: heap

postgres=# \d+ t2
                                            Table "public.t2"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | integer |           |          |         | plain    |             |              | 
 b      | text    |           |          |         | extended |             |              | 
Access method: columnstore

Adding a millions rows to both of the tables:

postgres=# insert into t1 select i, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# insert into t2 select i, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000

Looking at the disk we can see the first surprise:

postgres=# select pg_relation_filepath('t1');
 pg_relation_filepath 
----------------------
 base/5/24715
(1 row)

postgres=# select pg_relation_filepath('t2');
 pg_relation_filepath 
----------------------
 base/5/24718
(1 row)

postgres=# select pg_size_pretty ( pg_relation_size ( 't1' ));
 pg_size_pretty 
----------------
 65 MB
(1 row)

postgres=# select pg_size_pretty ( pg_relation_size ( 't2' ));
 pg_size_pretty 
----------------
 0 bytes
(1 row)

The table in columnar format reports a size of zero bytes, and indeed there is not even a file on disk which corresponds to 24718:

postgres=# \! ls -la $PGDATA/base/5/24715
-rw-------. 1 postgres postgres 68272128 Feb 24 14:19 /u02/pgdata/17/base/5/24715
postgres=# \! ls -la $PGDATA/base/5/24718
ls: cannot access '/u02/pgdata/17/base/5/24718': No such file or directory

Instead the table is stored here as Parquet files:

 postgres@pgbox:/u02/pgdata/17/ [172] ls -la $PGDATA/mooncake_local_tables/mooncake_postgres_t2_24708/
total 107224
drwx------. 3 postgres postgres      180 Feb 24 14:31 .
drwx------. 3 postgres postgres       40 Feb 24 14:15 ..
-rw-------. 1 postgres postgres 36596064 Feb 24 14:19 560c6efe-1226-4a76-985f-1301169bcc44.parquet
-rw-------. 1 postgres postgres 36596064 Feb 24 14:18 ca0550d6-bd84-4bf9-b8cf-6ce85a65346b.parquet
drwx------. 2 postgres postgres     4096 Feb 24 14:19 _delta_log
-rw-------. 1 postgres postgres 36596064 Feb 24 14:19 fba0eff4-3c57-4dbb-bd9b-469f6622ab92.parquet

Here is more detailed blog about the design decisions.

The difference is also visible when you look at the explain plans against both tables:

postgres=# explain (analyze) select * from t1 where a = 1;
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..14542.43 rows=1 width=37) (actual time=1.591..41.014 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   -&gt;  Parallel Seq Scan on t1  (cost=0.00..13542.33 rows=1 width=37) (actual time=0.022..11.067 rows=0 loops=3)
         Filter: (a = 1)
         Rows Removed by Filter: 333333
 Planning Time: 0.861 ms
 Execution Time: 41.086 ms
(8 rows)


postgres=# explain (analyze) select * from t2 where a = 1;
                                                  QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
 Custom Scan (MooncakeDuckDBScan)  (cost=0.00..0.00 rows=0 width=0) (actual time=7.797..7.816 rows=1 loops=1)
   DuckDB Execution Plan: 
 
 ┌─────────────────────────────────────┐
 │┌───────────────────────────────────┐│
 ││    Query Profiling Information    ││
 │└───────────────────────────────────┘│
 └─────────────────────────────────────┘
 EXPLAIN ANALYZE SELECT a, b FROM pgmooncake.public.t2 WHERE (a = 1)
 ┌────────────────────────────────────────────────┐
 │┌──────────────────────────────────────────────┐│
 ││              Total Time: 0.0043s             ││
 │└──────────────────────────────────────────────┘│
 └────────────────────────────────────────────────┘
 ┌───────────────────────────┐
 │           QUERY           │
 └─────────────┬─────────────┘
 ┌─────────────┴─────────────┐
 │      EXPLAIN_ANALYZE      │
 │    ────────────────────   │
 │           0 Rows          │
 │          (0.00s)          │
 └─────────────┬─────────────┘
 ┌─────────────┴─────────────┐
 │         TABLE_SCAN        │
 │    ────────────────────   │
 │         Function:         │
 │      COLUMNSTORE_SCAN     │
 │                           │
 │        Projections:       │
 │             a             │
 │             b             │
 │                           │
 │          Filters:         │
 │   a=1 AND a IS NOT NULL   │
 │                           │
 │           1 Rows          │
 │          (0.00s)          │
 └───────────────────────────┘
 
 
 Planning Time: 1.693 ms
 Execution Time: 8.584 ms
(43 rows)

As usual with columnar data storage, is is best when you have data which can be compressed well on a columnar basis, e.g.:

postgres=# truncate t1,t2;
TRUNCATE TABLE
postgres=# insert into t1 select 1, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 852.812 ms
postgres=# insert into t2 select 1, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 243.532 ms

The insert into the t2 is consistently faster than the insert into the standard tables (just repeat the inserts multiple times to get an idea). The same happens when you read columns which are compressed well (same here, just repeat the query multiple times to get an idea):

postgres=# select count(a) from t1;
  count  
---------
 2000000
(1 row)

Time: 60.463 ms
postgres=# select count(a) from t2;
  count  
---------
 2000000
(1 row)

Time: 10.272 ms

This might be an option if you have use cases for this.

Post Views: 1,193