dbi Blog

Make sure that you read the last post about TOASTing strategies in PostgreSQL before you continue with this one, at least if you are not familiar with the previous topic. The strategies we looked at are one important implementation detail you need to be aware of when it comes to TOASTing. What we didn’t look at until know, is how TOASTing actually works and where the data is going, when it is moved out of the main table.

Let’s start with a small table:

postgres=# create table t ( a int, b text );
CREATE TABLE
postgres=# select attname, atttypid::regtype,attcompression,
                  case attstorage when 'p' then 'plain'
                                  when 'e' then 'external'
                                  when 'm' then 'main'
                                  when 'x' then 'extended'
                  end AS strategy
             from pg_attribute
            where attrelid = 't'::regclass and attnum > 0;
 attname | atttypid | attcompression | strategy 
---------+----------+----------------+----------
 a       | integer  |                | plain
 b       | text     |                | extended
(2 rows)

We already know that column “a” can potentially be compressed and column “b” can potentially be compressed and moved out of line. PostgreSQL, because column “b” might contain data that must be moved out of the main table, already created a TOAST table in the background. If you don’t know this, you probably will never notice this, as it happens transparently. The questions is, where was that table created and how can you have a look at it? It definitely was not created in the same schema as the table, because we do not see it:

postgres=# \d
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | t    | table | postgres
(1 row)

TOAST tables go into a special called “pg_toast” and you’ll notice that already are plenty of other TOAST tables and indexes:

postgres=# \dnS
            List of schemas
        Name        |       Owner       
--------------------+-------------------
 information_schema | postgres
 pg_catalog         | postgres
 pg_toast           | postgres
 public             | pg_database_owner
(4 rows)
postgres=# \d pg_toast.pg_toast_
pg_toast.pg_toast_1213         pg_toast.pg_toast_1418_index   pg_toast.pg_toast_2619         pg_toast.pg_toast_3456_index
pg_toast.pg_toast_1213_index   pg_toast.pg_toast_16418        pg_toast.pg_toast_2619_index   pg_toast.pg_toast_3466
pg_toast.pg_toast_1247         pg_toast.pg_toast_16418_index  pg_toast.pg_toast_2620         pg_toast.pg_toast_3466_index
pg_toast.pg_toast_1247_index   pg_toast.pg_toast_2328         pg_toast.pg_toast_2620_index   pg_toast.pg_toast_3592
pg_toast.pg_toast_1255         pg_toast.pg_toast_2328_index   pg_toast.pg_toast_2964         pg_toast.pg_toast_3592_index
pg_toast.pg_toast_1255_index   pg_toast.pg_toast_2396         pg_toast.pg_toast_2964_index   pg_toast.pg_toast_3596
pg_toast.pg_toast_1260         pg_toast.pg_toast_2396_index   pg_toast.pg_toast_3079         pg_toast.pg_toast_3596_index
pg_toast.pg_toast_1260_index   pg_toast.pg_toast_2600         pg_toast.pg_toast_3079_index   pg_toast.pg_toast_3600
pg_toast.pg_toast_1262         pg_toast.pg_toast_2600_index   pg_toast.pg_toast_3118         pg_toast.pg_toast_3600_index
pg_toast.pg_toast_1262_index   pg_toast.pg_toast_2604         pg_toast.pg_toast_3118_index   pg_toast.pg_toast_6000
pg_toast.pg_toast_13378        pg_toast.pg_toast_2604_index   pg_toast.pg_toast_3256         pg_toast.pg_toast_6000_index
pg_toast.pg_toast_13378_index  pg_toast.pg_toast_2606         pg_toast.pg_toast_3256_index   pg_toast.pg_toast_6100
pg_toast.pg_toast_13383        pg_toast.pg_toast_2606_index   pg_toast.pg_toast_3350         pg_toast.pg_toast_6100_index
pg_toast.pg_toast_13383_index  pg_toast.pg_toast_2609         pg_toast.pg_toast_3350_index   pg_toast.pg_toast_6106
pg_toast.pg_toast_13388        pg_toast.pg_toast_2609_index   pg_toast.pg_toast_3381         pg_toast.pg_toast_6106_index
pg_toast.pg_toast_13388_index  pg_toast.pg_toast_2612         pg_toast.pg_toast_3381_index   pg_toast.pg_toast_6243
pg_toast.pg_toast_13393        pg_toast.pg_toast_2612_index   pg_toast.pg_toast_3394         pg_toast.pg_toast_6243_index
pg_toast.pg_toast_13393_index  pg_toast.pg_toast_2615         pg_toast.pg_toast_3394_index   pg_toast.pg_toast_826
pg_toast.pg_toast_1417         pg_toast.pg_toast_2615_index   pg_toast.pg_toast_3429         pg_toast.pg_toast_826_index

It seems all are named “pg_toast_” followed by the OID of the table the TOAST table if for:

postgres=# select oid from pg_class where relname = 't';
  oid  
-------
 16418
(1 row)

postgres=# \d pg_toast.pg_toast_16418
TOAST table "pg_toast.pg_toast_16418"
   Column   |  Type   
------------+---------
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea
Owning table: "public.t"
Indexes:
    "pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)

Be careful with this approach to identify a TOAST table, the OID of a table can change. The correct way for identifying a TOAST table is to ask pg_class:

postgres=# select relname,reltoastrelid::regclass 
             from pg_class
            where relname = 't'; \gset
 relname |      reltoastrelid      
---------+-------------------------
 t       | pg_toast.pg_toast_16418
(1 row)

Back to the structure of TOAST tables:

postgres=# \d :reltoastrelid
TOAST table "pg_toast.pg_toast_16418"
   Column   |  Type   
------------+---------
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea
Owning table: "public.t"
Indexes:
    "pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)

The primary key (chunk_id, chunk_seq) is always used by PostgreSQL to access the data. To be more precise the primary key is used to retrieve the “chunks” of data which are stored in binary format in a bytea column and the main table contains a pointer to the chunk_id. This absolutely makes sense, as data is either compressed before it is moved out of line or the data anyway is already in binary format. What PostgreSQL is doing with large data is, it slices the data in smaller “chunks”, and these chunks go into the TOAST table. This keeps the main table small, and as long as you do not ask for the data which is TOASTed, there is no need to go to the TOAST table. This should already tell you, that you should avoid “select *” against tables which contain TOASTed data, even better, avoid “select *” at all.

That’s it for today. In the next post we’ll go into more details how TOASTing works internally.

Post Views: 6,779