In Oracle you can use the external table feature to load data from files into the database. The traditional way to do this in PostgreSQL is the COPY command. But there is another option which makes use of foreign data wrappers, and the foreign data wrapper for working with files is file_fdw.
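For comparison, the classic approach with COPY would look something like this (a minimal sketch; the target table t_plain is just an assumed name and has to exist before loading):

create table t_plain ( a int, b varchar(50) );
copy t_plain from '/var/tmp/data.csv' with ( format csv );

The difference is that COPY physically loads the data once, while a foreign table reads the file every time it is scanned.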
To create some test data, here is a little script which generates 1’000 lines of the form ID, TEXT:
#!/bin/bash
OUTPUT="/var/tmp/data.csv"
rm -f ${OUTPUT}
for i in {1..1000}
do
  STRING=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
  echo "${i} , ${STRING}" >> ${OUTPUT}
done
Loading that into PostgreSQL is simple: first we need to create the extension, then define a server and a foreign table:
(postgres@[local]:5000) [postgres] > create extension file_fdw;
CREATE EXTENSION
Time: 752.715 ms
(postgres@[local]:5000) [postgres] > create server srv_file_fdw foreign data wrapper file_fdw;
CREATE SERVER
Time: 23.317 ms
(postgres@[local]:5000) [postgres] > create foreign table t_csv ( a int, b varchar(50) )
                                     server srv_file_fdw
                                     options ( filename '/var/tmp/data.csv', format 'csv' );
CREATE FOREIGN TABLE
Time: 74.843 ms
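file_fdw understands most of the options COPY knows about. If your file had a header line or used a different delimiter, the definition could look something like this (a hypothetical variant, the file generated above has neither):

create foreign table t_csv_other ( a int, b varchar(50) )
server srv_file_fdw
options ( filename '/var/tmp/other.csv',
          format 'csv',
          header 'true',
          delimiter ';' );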
From now on we can access the foreign table and load the data into a regular table:
(postgres@[local]:5000) [postgres] > select count(*) from t_csv;
 count 
-------
  1000
(1 row)
Time: 0.707 ms
(postgres@[local]:5000) [postgres] > create table t_csv2 as select * from t_csv;
SELECT 1000
Time: 147.859 ms
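As the foreign table behaves like any other table in SELECT statements, you can also filter or transform the data while loading it, for example (just a sketch, t_csv_subset is an assumed name):

create table t_csv_subset as
  select a, upper(b) as b
    from t_csv
   where a <= 100;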
Currently, as outlined in the documentation, you cannot write to a foreign table defined with file_fdw:
(postgres@[local]:5000) [postgres] > insert into t_csv values (-1,'a');
ERROR:  cannot insert into foreign table "t_csv"
Time: 18.113 ms
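If you need additional rows they have to go into the file itself; the foreign table picks them up on the next scan. A quick way to verify this from within psql (using the \! meta-command to run a shell command, just as a sketch):

\! echo "1001 , appended_row" >> /var/tmp/data.csv
select count(*) from t_csv;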
Using EXPLAIN you can get some details about the file:
(postgres@[local]:5000) [postgres] > explain select count(*) from t_csv;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Aggregate  (cost=171.15..171.16 rows=1 width=0)
   ->  Foreign Scan on t_csv  (cost=0.00..167.10 rows=1621 width=0)
         Foreign File: /var/tmp/data.csv
         Foreign File Size: 38893
(4 rows)
Time: 0.521 ms
If you wonder why the estimate for the rows is wrong: foreign tables are not analyzed automatically, so the statistics have to be collected explicitly:
(postgres@[local]:5000) [postgres] > analyze t_csv;
ANALYZE
Time: 79.925 ms
(postgres@[local]:5000) [postgres] > explain select count(*) from t_csv;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Aggregate  (cost=107.50..107.51 rows=1 width=0)
   ->  Foreign Scan on t_csv  (cost=0.00..105.00 rows=1000 width=0)
         Foreign File: /var/tmp/data.csv
         Foreign File Size: 38893
(4 rows)
Time: 0.296 ms
here you go …
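Keep in mind that PostgreSQL has no way of knowing when the underlying file changes, so after larger changes the statistics have to be refreshed manually. A small sketch to illustrate the idea (the appended lines are just an example):

\! for i in {1001..2000}; do echo "${i} , more_data" >> /var/tmp/data.csv; done
analyze t_csv;
explain select count(*) from t_csv;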