In Oracle you can use the external table feature to load data from files into the database. The traditional way to do this in PostgreSQL is the COPY command, but there is another option which makes use of foreign data wrappers: file_fdw.
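For comparison, the traditional approach looks like this. A minimal sketch, assuming a target table t_demo matching the file layout (the table name is made up for illustration):

```sql
-- classic server-side load with COPY: the file must be readable
-- by the PostgreSQL server process
create table t_demo ( a int, b varchar(50) );
copy t_demo from '/var/tmp/data.csv' with ( format csv );
```

With COPY the data lands in the table immediately; with file_fdw, as shown below, the file itself stays the source and is read on every query.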
To create some sample data, here is a little script which generates 1,000 lines of the form ID, TEXT:
#!/bin/bash
OUTPUT="/var/tmp/data.csv"
rm -rf ${OUTPUT}
for i in {1..1000}
do
  STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
  echo "${i} , ${STRING}" >> ${OUTPUT}
done
Loading that into PostgreSQL is straightforward: first create the extension, then define a server and a foreign table:
(postgres@[local]:5000) [postgres] > create extension file_fdw;
CREATE EXTENSION
Time: 752.715 ms
(postgres@[local]:5000) [postgres] > create server srv_file_fdw foreign data wrapper file_fdw;
CREATE SERVER
Time: 23.317 ms
(postgres@[local]:5000) [postgres] > create foreign table t_csv ( a int, b varchar(50) )
                                     server srv_file_fdw
                                     options ( filename '/var/tmp/data.csv', format 'csv' );
CREATE FOREIGN TABLE
Time: 74.843 ms
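file_fdw accepts most of the format options COPY does (header, delimiter, quote, escape, null, encoding). A sketch with a few of them set explicitly; the values here are assumptions matching our generated file, not something the defaults require:

```sql
create foreign table t_csv_opts (
  a int,
  b varchar(50)
) server srv_file_fdw
  options ( filename  '/var/tmp/data.csv',
            format    'csv',
            header    'false',
            delimiter ',',
            null      '' );
```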
From now on we can query the table and load the data:
(postgres@[local]:5000) [postgres] > select count(*) from t_csv;
 count
-------
  1000
(1 row)
Time: 0.707 ms
(postgres@[local]:5000) [postgres] > create table t_csv2 as select * from t_csv;
SELECT 1000
Time: 147.859 ms
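Because the foreign table behaves like any other table in a query, you can filter and transform while loading, which plain COPY cannot do. A small sketch (the target table name is made up for illustration):

```sql
-- keep only part of the file and rewrite a column on the way in
create table t_csv_subset as
  select a, upper(b) as b
    from t_csv
   where a <= 100;
```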
Currently, as outlined in the documentation, you cannot write to a foreign table defined with file_fdw:
(postgres@[local]:5000) [postgres] > insert into t_csv values (-1,'a');
ERROR:  cannot insert into foreign table "t_csv"
Time: 18.113 ms
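While the contents cannot be modified through the foreign table, its definition can: the options are adjustable after creation with ALTER FOREIGN TABLE. For example, to point the same table at another file (the path here is an assumption):

```sql
alter foreign table t_csv
  options ( set filename '/var/tmp/data2.csv' );
```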
Using explain you can get some details of the file:
(postgres@[local]:5000) [postgres] > explain select count(*) from t_csv;
                             QUERY PLAN
--------------------------------------------------------------------
 Aggregate  (cost=171.15..171.16 rows=1 width=0)
   ->  Foreign Scan on t_csv  (cost=0.00..167.10 rows=1621 width=0)
         Foreign File: /var/tmp/data.csv
         Foreign File Size: 38893
(4 rows)
Time: 0.521 ms
If you wonder why the row estimate is wrong, just analyze the foreign table:
(postgres@[local]:5000) [postgres] > analyze t_csv;
ANALYZE
Time: 79.925 ms
(postgres@[local]:5000) [postgres] > explain select count(*) from t_csv;
                             QUERY PLAN
--------------------------------------------------------------------
 Aggregate  (cost=107.50..107.51 rows=1 width=0)
   ->  Foreign Scan on t_csv  (cost=0.00..105.00 rows=1000 width=0)
         Foreign File: /var/tmp/data.csv
         Foreign File Size: 38893
(4 rows)
Time: 0.296 ms
here you go …