In the last post we’ve highlighted some tips when developing functions and/or procedures in PostgreSQL’s PL/pgSQL. Especially the Compile-Time and Run-Time checks can save you a lot of work and give you important hints while developing your code. But still, at least when compared to other languages, the possibilities for debugging your code in PL/pgSQL are quite limited in vanilla PostgreSQL. Once more, extension to the rescue: There are plenty of extensions out there, and some of them are targeting the development experience with PL/pgSQL. In this post we’ll look at plpgsql_check.
Standard warning: When it comes to extensions outside of PostgreSQL core, always make sure that the extension you want to use is actively maintained. This is not an issue for plpgsql_check, but it might be for others. If is not maintained anymore, either maintain it on your own or look for other solutions for your requirement.
Getting plpgsql_check onto the system is straight forward (I am using the current development version here, please use a version from the release page for real systems):
postgres@debian11pg:/home/postgres/ [150] git clone https://github.com/okbob/plpgsql_check.git
postgres@debian11pg:/home/postgres/ [150] cd plpgsql_check/
postgres@debian11pg:/home/postgres/plpgsql_check/ [150] make USE_PGXS=1 clean
postgres@debian11pg:/home/postgres/plpgsql_check/ [150] make USE_PGXS=1 install
From there on it is the standard way for installing extensions:
postgres@debian11pg:/home/postgres/plpgsql_check/ [150] psql
psql (15.0 dbi services build)
Type "help" for help.
postgres=# select * from pg_available_extensions where name like '%check%';
name | default_version | installed_version | comment
---------------+-----------------+-------------------+--------------------------------------------
amcheck | 1.3 | | functions for verifying relation integrity
plpgsql_check | 2.2 | | extended check for plpgsql functions
(2 rows)
postgres=# create extension plpgsql_check;
CREATE EXTENSION
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
---------------+---------+------------+--------------------------------------
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
plpgsql_check | 2.2 | public | extended check for plpgsql functions
(2 rows)
postgres=#
What can we do with it? Suppose the following table and function:
postgres=# create table t ( a int, b int );
CREATE TABLE
postgres=# create function f1() returns int as
postgres-# $$
postgres$# begin
postgres$# select c from t limit 1;
postgres$# end; $$ language plpgsql;
CREATE FUNCTION
As there is no column “c” this for sure fails during execution:
postgres=# select f1();
ERROR: column "c" does not exist
LINE 1: select c from t limit 1
^
QUERY: select c from t limit 1
CONTEXT: PL/pgSQL function f1() line 3 at SQL statement
Using the extension we can catch such kind of issues before it fails later on in production:
postgres=# select * from plpgsql_check_function_tb('f1()');
functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+---------------+----------+---------------------------+--------+------+-------+----------+-------------------------+---------
f1 | 3 | SQL statement | 42703 | column "c" does not exist | | | error | 8 | select c from t limit 1 |
(1 row)
Another example is this:
postgres=# create or replace function f1() returns int as
$$
begin
select b from t;
end; $$ language plpgsql;
CREATE FUNCTION
postgres=# select * from plpgsql_check_function_tb('f1()');
functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+---------------+----------+------------------------------------------+--------+------+-------+----------+-------+---------
f1 | 3 | SQL statement | 42601 | query has no destination for result data | | | error | | |
(1 row)
postgres=#
Or this:
postgres=# create or replace function f1() returns int as
$$
declare
i int;
begin
select b into i from t;
end; $$ language plpgsql;
CREATE FUNCTION
postgres=# select * from plpgsql_check_function_tb('f1()');
functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+------------------------------------------------+--------+------+---------------+----------+-------+---------
f1 | | | 2F005 | control reached end of function without RETURN | | | error | | |
f1 | 3 | DECLARE | 00000 | never read variable "i" | | | warning extra | | |
(2 rows)
There also is a profiler which gives you more information where time is spend inside your functions or procedures. Here is a function which does some real (but stupid) work:
postgres=# insert into t select i, i from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# create or replace function f1() returns int as
$$
declare
r record;
i int;
j int := 0;
begin
for r in select a from t
loop
i := r.a;
end loop;
for i in 1..1000000
loop
j := j + 1;
end loop;
return i;
end; $$ language plpgsql;
CREATE FUNCTION
First, lets confirm we do not have any issues which plpgsql_check is able to detect:
postgres=# select * from plpgsql_check_function_tb('f1()');
functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------
(0 rows)
Fine, now lets check how time is spend inside the function:
postgres=# set plpgsql_check.profiler to on;
SET
postgres=# select * from f1();
f1
---------
1000000
(1 row)
postgres=# select * from plpgsql_profiler_function_tb('f1()');
lineno | stmt_lineno | queryids | cmds_on_row | exec_stmts | exec_stmts_err | total_time | avg_time | max_time | processed_rows | source
--------+-------------+------------------------+-------------+------------+----------------+------------+----------+-----------+----------------+----------------------------
1 | | | | | | | | | |
2 | | | | | | | | | | declare
3 | | | | | | | | | | r record;
4 | | | | | | | | | | i int;
5 | | | | | | | | | | j int := 0;
6 | 6 | | 1 | 1 | 0 | 0.009 | 0.009 | {356.329} | {0} | begin
7 | 7 | {5244141801870714068} | 1 | 1 | 0 | 159.998 | 159.998 | {249.678} | {0} | for r in select a from t
8 | | | | | | | | | | loop
9 | 9 | {4704117640272101272} | 1 | 1000002 | 0 | 89.68 | 0.001 | {0.236} | {0} | i := r.a;
10 | | | | | | | | | | end loop;
11 | 11 | | 1 | 1 | 0 | 41.715 | 41.715 | {106.642} | {0} | for i in 1..1000000
12 | | | | | | | | | | loop
13 | 13 | {-8156123798940658127} | 1 | 1000000 | 0 | 64.927 | 0.001 | {0.021} | {0} | j := j + 1;
14 | | | | | | | | | | end loop;
15 | 15 | | 1 | 1 | 0 | 0 | 0 | {0} | {0} | return i;
16 | | | | | | | | | | end;
(16 rows)
postgres=# select * from plpgsql_profiler_function_statements_tb('f1()');
stmtid | parent_stmtid | parent_note | block_num | lineno | queryid | exec_stmts | exec_stmts_err | total_time | avg_time | max_time | processed_rows | stmtname
--------+---------------+-------------+-----------+--------+----------------------+------------+----------------+------------+----------+----------+----------------+--------------------------------
0 | | | 1 | 6 | | 1 | 0 | 0.009 | 0.009 | 356.329 | 0 | statement block
1 | 0 | body | 1 | 7 | 5244141801870714068 | 1 | 0 | 159.998 | 159.998 | 249.678 | 0 | FOR over SELECT rows
2 | 1 | loop body | 1 | 9 | 4704117640272101272 | 1000002 | 0 | 89.68 | 0.001 | 0.236 | 0 | assignment
3 | 0 | body | 2 | 11 | | 1 | 0 | 41.715 | 41.715 | 106.642 | 0 | FOR with integer loop variable
4 | 3 | loop body | 1 | 13 | -8156123798940658127 | 1000000 | 0 | 64.927 | 0.001 | 0.021 | 0 | assignment
5 | 0 | body | 3 | 15 | | 1 | 0 | 0 | 0 | 0 | 0 | RETURN
Really nice. This was only about the basics, there is much more information on the GitHub page or on Pavel’s blog, e.g. here.