PostgreSQL comes with three data types for storing character strings: char, varchar and text. What we often see at customer sites is that varchar(n) is used when the goal is to store strings and to limit the number of characters allowed in a field/column. Before we go into the details: the only reason you might want to use char or varchar is that both are defined in the SQL standard. For all other cases you should use text.
Let’s start with three sample tables, one with a char column, one with varchar and one with text:
```
postgres=# create table t1 ( a char(2000) );
CREATE TABLE
postgres=# \d t1
                     Table "public.t1"
 Column |      Type       | Collation | Nullable | Default
--------+-----------------+-----------+----------+---------
 a      | character(2000) |           |          |

postgres=# create table t2 ( a varchar(2000) );
CREATE TABLE
postgres=# \d t2
                         Table "public.t2"
 Column |          Type           | Collation | Nullable | Default
--------+-------------------------+-----------+----------+---------
 a      | character varying(2000) |           |          |

postgres=# create table t3 ( a text );
CREATE TABLE
postgres=# \d t3
               Table "public.t3"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 a      | text |           |          |

postgres=#
```
Tables t1 and t2 can store strings up to 2000 characters (not bytes) in length. Table t3 can store any string up to about one gigabyte, which is the general limit for a single field in PostgreSQL (for varchar(n), the largest length modifier you can declare is 10,485,760). To see the first difference between those data types we’ll populate each table with one million rows like this:
```
postgres=# \timing on
Timing is on.
postgres=# insert into t1 select lpad(i::char,100,'0') from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 42041.031 ms (00:42.041)
postgres=# insert into t2 select lpad(i::varchar,100,'0') from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 2813.176 ms (00:02.813)
postgres=# insert into t3 select lpad(i::text,100,'0') from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 2790.226 ms (00:02.790)
postgres=#
```
What you’ll notice is that inserting into table t1, which contains the char column, is much slower. There is not much difference between the varchar(n) and text tables. The question is: why is the char(n) table so much slower than the other two? The reason becomes obvious once you compare the sizes of the tables:
```
postgres=# select pg_size_pretty(pg_relation_size('t1'));
 pg_size_pretty
----------------
 1953 MB
(1 row)

Time: 1.719 ms
postgres=# select pg_size_pretty(pg_relation_size('t2'));
 pg_size_pretty
----------------
 128 MB
(1 row)

Time: 0.682 ms
postgres=# select pg_size_pretty(pg_relation_size('t3'));
 pg_size_pretty
----------------
 128 MB
(1 row)

Time: 0.629 ms
```
The first table, containing the char(n) column, is almost 2GB in size while the other two are only 128MB, which is quite a huge difference. The reason is that char(n) is space padded and will always consume the space of 2000 characters, while varchar(n) and text are dynamic. Dynamic means they only consume as much space as is required to store the actual string, not the maximum possible. This alone disqualifies char(n).
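The padding also shows up in simple expressions: trailing spaces are stored (and counted by octet_length) but treated as insignificant in comparisons. A small demonstration (output from a recent PostgreSQL version, single-byte characters assumed):

```
postgres=# select octet_length('ab'::char(5));
 octet_length
--------------
            5
(1 row)

postgres=# select 'ab'::char(5) = 'ab   ';
 ?column?
----------
 t
(1 row)
```

This pad-then-ignore behavior is mandated by the SQL standard for the character type and is a frequent source of surprises in joins and unique constraints.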
What about varchar(n) and text? There is no difference when it comes to space consumption on disk: both use only as much space as is required to store the string. What is the difference then? It is, of course, the length constraint of the varchar(n) type. You cannot insert more than 2000 characters into the t2 table:
```
postgres=# insert into t2 select lpad('i',2000,'0');
INSERT 0 1
Time: 9.201 ms
postgres=# insert into t2 select lpad('i',2001,'0');
ERROR:  value too long for type character varying(2000)
Time: 3.445 ms
```
The t3 table with the text column does not have such a limit:
```
postgres=# insert into t3 select lpad('i',2001,'0');
INSERT 0 1
Time: 9.818 ms
```
This might look like an advantage of the varchar data type, but you can easily get the same behavior for text by adding a check constraint:
```
postgres=# delete from t3 where a = lpad('i',2001,'0');
DELETE 1
Time: 200.586 ms
postgres=# alter table t3 add constraint c1 check ( length(a) < 2001 );
ALTER TABLE
Time: 289.121 ms
postgres=# insert into t3 select lpad('i',2001,'0');
ERROR:  new row for relation "t3" violates check constraint "c1"
DETAIL:  Failing row contains (0000000000000000000000000000000000000000000000000000000000000000...).
Time: 1.623 ms
```
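One advantage of the check-constraint approach is that the limit can be changed later without touching the column definition. A sketch, reusing the constraint name c1 from above (NOT VALID skips re-checking existing rows until you validate explicitly, which keeps the lock window short on large tables):

```
postgres=# alter table t3 drop constraint c1;
ALTER TABLE
postgres=# alter table t3 add constraint c1 check ( length(a) < 3001 ) not valid;
ALTER TABLE
postgres=# alter table t3 validate constraint c1;
ALTER TABLE
```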
Does that make any difference when data is inserted?
```
postgres=# truncate t2,t3;
TRUNCATE TABLE
Time: 58.063 ms
postgres=# insert into t2 select lpad(i::varchar,100,'0') from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 2281.157 ms (00:02.281)
postgres=# insert into t3 select lpad(i::text,100,'0') from generate_series(1,1000000) i;
INSERT 0 1000000
Time: 2572.467 ms (00:02.572)
```
Not really. So what is the use case for varchar(n)? SQL standard compliance, that’s it.
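If you want a named, reusable length limit while sticking with text, a domain is one option. A sketch; the domain name limited_text is made up for this example, and the constraint name in the error message is the auto-generated default:

```
postgres=# create domain limited_text as text check ( length(value) <= 2000 );
CREATE DOMAIN
postgres=# create table t4 ( a limited_text );
CREATE TABLE
postgres=# insert into t4 values ( lpad('i',2001,'0') );
ERROR:  value for domain limited_text violates check constraint "limited_text_check"
```

This centralizes the limit in one place, so every table using the domain picks up a change to the domain's constraint.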