Make sure that you read the last post about TOASTing strategies in PostgreSQL before you continue with this one, at least if you are not familiar with the previous topic. The strategies we looked at are one important implementation detail you need to be aware of when it comes to TOASTing. What we didn’t look at until know, is how TOASTing actually works and where the data is going, when it is moved out of the main table.
Let’s start with a small table:
postgres=# create table t ( a int, b text ); CREATE TABLE postgres=# select attname, atttypid::regtype,attcompression, case attstorage when 'p' then 'plain' when 'e' then 'external' when 'm' then 'main' when 'x' then 'extended' end AS strategy from pg_attribute where attrelid = 't'::regclass and attnum > 0; attname | atttypid | attcompression | strategy ---------+----------+----------------+---------- a | integer | | plain b | text | | extended (2 rows)
We already know that column “a” can potentially be compressed and column “b” can potentially be compressed and moved out of line. PostgreSQL, because column “b” might contain data that must be moved out of the main table, already created a TOAST table in the background. If you don’t know this, you probably will never notice this, as it happens transparently. The questions is, where was that table created and how can you have a look at it? It definitely was not created in the same schema as the table, because we do not see it:
postgres=# \d List of relations Schema | Name | Type | Owner --------+------+-------+---------- public | t | table | postgres (1 row)
TOAST tables go into a special called “pg_toast” and you’ll notice that already are plenty of other TOAST tables and indexes:
postgres=# \dnS List of schemas Name | Owner --------------------+------------------- information_schema | postgres pg_catalog | postgres pg_toast | postgres public | pg_database_owner (4 rows) postgres=# \d pg_toast.pg_toast_ pg_toast.pg_toast_1213 pg_toast.pg_toast_1418_index pg_toast.pg_toast_2619 pg_toast.pg_toast_3456_index pg_toast.pg_toast_1213_index pg_toast.pg_toast_16418 pg_toast.pg_toast_2619_index pg_toast.pg_toast_3466 pg_toast.pg_toast_1247 pg_toast.pg_toast_16418_index pg_toast.pg_toast_2620 pg_toast.pg_toast_3466_index pg_toast.pg_toast_1247_index pg_toast.pg_toast_2328 pg_toast.pg_toast_2620_index pg_toast.pg_toast_3592 pg_toast.pg_toast_1255 pg_toast.pg_toast_2328_index pg_toast.pg_toast_2964 pg_toast.pg_toast_3592_index pg_toast.pg_toast_1255_index pg_toast.pg_toast_2396 pg_toast.pg_toast_2964_index pg_toast.pg_toast_3596 pg_toast.pg_toast_1260 pg_toast.pg_toast_2396_index pg_toast.pg_toast_3079 pg_toast.pg_toast_3596_index pg_toast.pg_toast_1260_index pg_toast.pg_toast_2600 pg_toast.pg_toast_3079_index pg_toast.pg_toast_3600 pg_toast.pg_toast_1262 pg_toast.pg_toast_2600_index pg_toast.pg_toast_3118 pg_toast.pg_toast_3600_index pg_toast.pg_toast_1262_index pg_toast.pg_toast_2604 pg_toast.pg_toast_3118_index pg_toast.pg_toast_6000 pg_toast.pg_toast_13378 pg_toast.pg_toast_2604_index pg_toast.pg_toast_3256 pg_toast.pg_toast_6000_index pg_toast.pg_toast_13378_index pg_toast.pg_toast_2606 pg_toast.pg_toast_3256_index pg_toast.pg_toast_6100 pg_toast.pg_toast_13383 pg_toast.pg_toast_2606_index pg_toast.pg_toast_3350 pg_toast.pg_toast_6100_index pg_toast.pg_toast_13383_index pg_toast.pg_toast_2609 pg_toast.pg_toast_3350_index pg_toast.pg_toast_6106 pg_toast.pg_toast_13388 pg_toast.pg_toast_2609_index pg_toast.pg_toast_3381 pg_toast.pg_toast_6106_index pg_toast.pg_toast_13388_index pg_toast.pg_toast_2612 pg_toast.pg_toast_3381_index pg_toast.pg_toast_6243 pg_toast.pg_toast_13393 pg_toast.pg_toast_2612_index pg_toast.pg_toast_3394 pg_toast.pg_toast_6243_index pg_toast.pg_toast_13393_index pg_toast.pg_toast_2615 pg_toast.pg_toast_3394_index pg_toast.pg_toast_826 pg_toast.pg_toast_1417 pg_toast.pg_toast_2615_index pg_toast.pg_toast_3429 pg_toast.pg_toast_826_index
It seems all are named “pg_toast_” followed by the OID of the table the TOAST table if for:
postgres=# select oid from pg_class where relname = 't'; oid ------- 16418 (1 row) postgres=# \d pg_toast.pg_toast_16418 TOAST table "pg_toast.pg_toast_16418" Column | Type ------------+--------- chunk_id | oid chunk_seq | integer chunk_data | bytea Owning table: "public.t" Indexes: "pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)
Be careful with this approach to identify a TOAST table, the OID of a table can change. The correct way for identifying a TOAST table is to ask pg_class:
postgres=# select relname,reltoastrelid::regclass from pg_class where relname = 't'; \gset relname | reltoastrelid ---------+------------------------- t | pg_toast.pg_toast_16418 (1 row)
Back to the structure of TOAST tables:
postgres=# \d :reltoastrelid TOAST table "pg_toast.pg_toast_16418" Column | Type ------------+--------- chunk_id | oid chunk_seq | integer chunk_data | bytea Owning table: "public.t" Indexes: "pg_toast_16418_index" PRIMARY KEY, btree (chunk_id, chunk_seq)
The primary key (chunk_id, chunk_seq) is always used by PostgreSQL to access the data. To be more precise the primary key is used to retrieve the “chunks” of data which are stored in binary format in a bytea column and the main table contains a pointer to the chunk_id. This absolutely makes sense, as data is either compressed before it is moved out of line or the data anyway is already in binary format. What PostgreSQL is doing with large data is, it slices the data in smaller “chunks”, and these chunks go into the TOAST table. This keeps the main table small, and as long as you do not ask for the data which is TOASTed, there is no need to go to the TOAST table. This should already tell you, that you should avoid “select *” against tables which contain TOASTed data, even better, avoid “select *” at all.
That’s it for today. In the next post we’ll go into more details how TOASTing works internally.