With the just released version of 17 of PostgreSQL, copy got the “ON_ERROR” switch. This is very useful if you load something which contains rows that cannot be inserted into the target table. Before that, copy would just stop and error out. The only option was to fix the data and then copy again. Using on “ON_ERROR”, copy can be told to continue loading and ignore those rows with bad data.
The default behavior is still to stop as soon as there is a row/line which cannot be loaded:
postgres=# create table t ( a int, b int, c text );
CREATE TABLE
postgres=# \! cat load.txt
1,1,'aaa'
2,2,'bbb'
3,3,'ccc'
4,four,'ddd'
5,5,'eee'
6,6,'fff'
7,7,'ggg'
8,eight,'hhh'
9,9,'iii'
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',');
ERROR: invalid input syntax for type integer: "four"
CONTEXT: COPY t, line 4, column b: "four"
Starting with PostgreSQL 17, we can tell copy to ignore those rows and continue to load the data:
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore);
COPY 7
postgres=# select * from t;
a | b | c
---+---+-------
1 | 1 | 'aaa'
2 | 2 | 'bbb'
3 | 3 | 'ccc'
5 | 5 | 'eee'
6 | 6 | 'fff'
7 | 7 | 'ggg'
9 | 9 | 'iii'
(7 rows)
What we cannot do up to PostgreSQL 17 is to set a limit for bad data. This will be possible starting with PostgreSQL 18, as copy got a new switch: “REJECT_LIMIT”.
Using this, we can tell copy to stop as soon as we reach n-rows which cannot be loaded:
postgres=# truncate t;
TRUNCATE TABLE
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 1);
ERROR: skipped more than REJECT_LIMIT (1) rows due to data type incompatibility
CONTEXT: COPY t, line 8, column b: "eight"
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 2);
ERROR: skipped more than REJECT_LIMIT (2) rows due to data type incompatibility
CONTEXT: COPY t, line 10, column a: ""
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 3);
COPY 7
postgres=# select * from t;
a | b | c
---+---+-------
1 | 1 | 'aaa'
2 | 2 | 'bbb'
3 | 3 | 'ccc'
5 | 5 | 'eee'
6 | 6 | 'fff'
7 | 7 | 'ggg'
9 | 9 | 'iii'
(7 rows)
Nice, this gives more control about how to handle data which does not fit into into the target table.