dbi Blog

In database management, query performance is critical for efficient data retrieval. PostgreSQL, a powerful open-source database, offers various extensions to optimize and tune query performance. Two such extensions are Index Advisor and HypoPG. These tools help database administrators (DBAs) identify and test potential indexes to improve query performance without the overhead of creating and dropping indexes repeatedly. This blog will guide you through the installation and usage of these extensions to enhance your PostgreSQL performance.

1. Installing Index Advisor and HypoPG Extensions

Before diving into the usage of these extensions, let’s start with the installation process.

1.1. PostgreSQL Index Advisor

Index Advisor is a PostgreSQL extension that recommends indexes based on query analysis. One nice point is that the extension code is still maintained. index_advisor requires Postgres with hypopg installed. Here’s how to install it:

Installation:

--- Install hypopg

postgres@server> sudo apt install postgresql-XY-hypopg

--- Install index_adviser

postgres@server> git clone https://github.com/supabase/index_advisor.git
postgres@server> cd index_advisor
postgres@server> sudo make install

Enable the Extension in PostgreSQL:

postgres# CREATE EXTENSION hypopg;
postgres# create extension if not exists index_advisor cascade;

Now that the extensions are installed, let’s explore how to use them to analyze and optimize query performance.

Index Advisor analyzes your queries and suggests indexes that can improve performance. Let’s see it in action with a sample query.

2.1. Example Query

Consider the following query that retrieves data from dbi_technologies, dbi_techres, and dbi_techlevels tables:

SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN
  (SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
   FROM dbi_techres techres,
   dbi_techlevels techlvl
   WHERE techres.res_id = $2
   AND techlvl.id = techres.techlevel_id) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = $1
ORDER BY techlabel;

2.2. Running Index Advisor

To get index recommendations for the above query, use the following command:

SELECT * FROM index_advisor('YOUR QUERY HERE');

2.3. Sample Output

-[ RECORD 1 ]-------------------------
startup_cost_before | 70.97
startup_cost_after  | 66.76
total_cost_before   | 71.00
total_cost_after    | 66.79
index_statements    | ['CREATE INDEX ON dbicc.dbi_technologies USING btree (cat_id)', 'CREATE INDEX ON dbicc.dbi_techres USING btree (res_id)']
errors              | []

2.4. Output Explanation

startup_cost_before and startup_cost_after: These show the estimated cost until the query can return the first row of the result before and after applying the recommended indexes. For example, with a hash join, this includes the cost to build the internal table. For a sequential scan, the startup cost is always zero, as rows can be returned immediately.
total_cost_before and total_cost_after: Estimated total cost of executing the query before and after applying the recommended indexes.
index_statements: Suggested SQL statements to create the indexes. In this case, we got two suggestions.
errors: Any errors encountered during the analysis (empty in this case).

3. Using HypoPG to Test Hypothetical Indexes

HypoPG allows you to test the impact of hypothetical indexes without actually creating them. This is useful for evaluating potential indexes before committing to their creation. The hypothetical indexes created by HypoPG are stored in your connection’s private memory, not in any catalog. This means they won’t increase the size of any table or affect other connections.

3.1. Analyzing Query Performance without Hypothetical Indexes

Let’s start by testing the query performance without any hypothetical indexes.

postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN (
    SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
    FROM dbi_techres techres, dbi_techlevels techlvl
    WHERE techres.res_id = 60
    AND techlvl.id = techres.techlevel_id
) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = 10
ORDER BY techlabel;

+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                              |
|-------------------------------------------------------------------------------------------------------------------------|
| Sort  (cost=72.10..72.13 rows=14 width=241)                                                                             |
|   Sort Key: tech.label                                                                                                  |
|   ->  Nested Loop Left Join  (cost=0.41..71.83 rows=14 width=241)                                                       |
|         ->  Seq Scan on dbi_technologies tech  (cost=0.00..10.01 rows=14 width=19)                                      |
|               Filter: (cat_id = 10)                                                                                     |
|         ->  Nested Loop  (cost=0.41..4.41 rows=1 width=226)                                                             |
|               ->  Index Only Scan using idx_tech_res_level on dbi_techres techres  (cost=0.28..4.02 rows=1 width=8)     |
|                     Index Cond: ((tech_id = tech.id) AND (res_id = 60))                                                 |
|               ->  Index Scan using dbi_techlevels_id_pkey on dbi_techlevels techlvl  (cost=0.13..0.33 rows=1 width=222) |
|                     Index Cond: (id = techres.techlevel_id)                                                             |
+-------------------------------------------------------------------------------------------------------------------------+

3.2. Creating the First Hypothetical Index

We are going to create our first hypothetical index based on the output from the index_advisor statement. Use hypopg_create_index to create it:

SELECT * FROM hypopg_create_index('CREATE INDEX ON dbicc.dbi_technologies USING btree (cat_id)');

3.3. Analyzing Query Performance with Hypothetical Index

Run EXPLAIN on your query to see the impact of the hypothetical index:

postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
            FROM dbi_technologies tech
            LEFT JOIN
            ( SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
            FROM dbi_techres techres,
            dbi_techlevels techlvl
            WHERE techres.res_id = 60
            AND techlvl.id = techres.techlevel_id) AS reslevel
            ON tech.id = reslevel.tech_id
            WHERE tech.cat_id = 10
            ORDER BY techlabel;
+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                              |
|-------------------------------------------------------------------------------------------------------------------------|
| Sort  (cost=71.39..71.43 rows=14 width=241)                                                                             |
|   Sort Key: tech.label                                                                                                  |
|   ->  Nested Loop Left Join  (cost=4.55..71.12 rows=14 width=241)                                                       |
|         ->  Bitmap Heap Scan on dbi_technologies tech  (cost=4.13..9.31 rows=14 width=19)                               |
|               Recheck Cond: (cat_id = 10)                                                                               |
|               ->  Bitmap Index Scan on "<14454>btree_dbicc_dbi_technologies_cat_id"  (cost=0.00..4.13 rows=14 width=0)  |
|                     Index Cond: (cat_id = 10)                                                                           |
|         ->  Nested Loop  (cost=0.41..4.41 rows=1 width=226)                                                             |
|               ->  Index Only Scan using idx_tech_res_level on dbi_techres techres  (cost=0.28..4.02 rows=1 width=8)     |
|                     Index Cond: ((tech_id = tech.id) AND (res_id = 60))                                                 |
|               ->  Index Scan using dbi_techlevels_id_pkey on dbi_techlevels techlvl  (cost=0.13..0.33 rows=1 width=222) |
|                     Index Cond: (id = techres.techlevel_id)                                                             |
+-------------------------------------------------------------------------------------------------------------------------+

3.4. Output Explanation

Bitmap Heap Scan and Bitmap Index Scan: Show how the hypothetical index is used in scanning the table.
Cost Reduction: Reduced costs in the query plan indicate improved performance due to the hypothetical index.

3.5. Creating our second Hypothetical Index and check the output

postgres# SELECT * FROM hypopg_create_index('CREATE INDEX ON dbicc.dbi_techres USING btree (res_id)');
+------------+---------------------------------------+
| indexrelid | indexname                             |
|------------+---------------------------------------|
| 14455      | <14455>btree_dbicc_dbi_techres_res_id |
+------------+---------------------------------------+
SELECT 1
Time: 0.013s

postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
             FROM dbi_technologies tech
             LEFT JOIN
             ( SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
             FROM dbi_techres techres,
             dbi_techlevels techlvl
             WHERE techres.res_id = 60
             AND techlvl.id = techres.techlevel_id) AS reslevel
             ON tech.id = reslevel.tech_id
             WHERE tech.cat_id = 10
             ORDER BY techlabel;
+---------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                            |
|---------------------------------------------------------------------------------------------------------------------------------------|
| Sort  (cost=62.47..62.51 rows=14 width=241)                                                                                           |
|   Sort Key: tech.label                                                                                                                |
|   ->  Hash Right Join  (cost=10.65..62.21 rows=14 width=241)                                                                          |
|         Hash Cond: (techres.tech_id = tech.id)                                                                                        |
|         ->  Hash Join  (cost=1.17..52.55 rows=66 width=226)                                                                           |
|               Hash Cond: (techres.techlevel_id = techlvl.id)                                                                          |
|               ->  Index Scan using "<14455>btree_dbicc_dbi_techres_res_id" on dbi_techres techres  (cost=0.03..51.12 rows=66 width=8) |
|                     Index Cond: (res_id = 60)                                                                                         |
|               ->  Hash  (cost=1.06..1.06 rows=6 width=222)                                                                            |
|                     ->  Seq Scan on dbi_techlevels techlvl  (cost=0.00..1.06 rows=6 width=222)                                        |
|         ->  Hash  (cost=9.31..9.31 rows=14 width=19)                                                                                  |
|               ->  Bitmap Heap Scan on dbi_technologies tech  (cost=4.13..9.31 rows=14 width=19)                                       |
|                     Recheck Cond: (cat_id = 10)                                                                                       |
|                     ->  Bitmap Index Scan on "<14454>btree_dbicc_dbi_technologies_cat_id"  (cost=0.00..4.13 rows=14 width=0)          |
|                           Index Cond: (cat_id = 10)                                                                                   |
+---------------------------------------------------------------------------------------------------------------------------------------+

3.6. Output Explanation

Bitmap Heap Scan and Bitmap Index Scan: Show how the hypothetical index is used in scanning the table.
Cost Reduction: Reduced costs in the query plan indicate improved performance due to the hypothetical index.

3.7. Key Differences and Benefits

After the creation of our HypoPG indexes, we can see some differences in our explain statements and notice some benefits:

Reduced Total Cost:

Without Indexes: The total cost was estimated at around 71.83.
With HypoPG Indexes: The total cost decreased to about 62.21.

Benefit: The overall cost of executing the query is lower, indicating improved efficiency.

Improved Join Strategies:

Without Indexes: The query used nested loop joins, which can be less efficient for large datasets.
With HypoPG Indexes: The query plan switched to hash joins, which are generally faster for joining large sets of data.

Benefit: Hash joins are more efficient and can handle larger datasets better than nested loops.

Better Index Usage:

Without Indexes: The plan didn’t use any specific indexes for cat_id or res_id in the dbi_technologies table.
With HypoPG Indexes: The plan utilized the hypothetical indexes for both cat_id and res_id.

Benefit: The use of these indexes speeds up data retrieval by directly accessing the relevant rows rather than scanning the entire table.

Change in Access Methods:

Without Indexes: The plan involved a sequence scan and index-only scans.
With HypoPG Indexes: The plan used bitmap heap scans and bitmap index scans.

Benefit: Bitmap scans are efficient for handling multiple rows and are typically faster when dealing with indexed columns.

4. Conclusion

Index Advisor and HypoPG are powerful tools in the PostgreSQL ecosystem, allowing DBAs to fine-tune their database performance with minimal disruption (and without overheating your brain). By recommending and simulating indexes, these extensions help you make decisions about index creation, ensuring your queries run efficiently.

Use these tools to analyze your queries, create hypothetical indexes, and measure their impact before making any permanent changes. I hope that this blog will help you make your PostgreSQL performances better thanks to Index Advisor and HypoPG.

Happy optimizing!

Useful links

https://github.com/supabase/index_advisor
https://hypopg.readthedocs.io/en/rel1_stable/index.html
https://github.com/HypoPG/hypopg
Don’t mind checking my previous blog about pgcli to make your life easier while using PostgreSQL: https://www.dbi-services.com/blog/simplifying-postgresql-management-a-guide-to-install-and-use-pgcli/

Post Views: 1,677

Enhancing PostgreSQL Performance with Index Advisor and HypoPG

1. Installing Index Advisor and HypoPG Extensions

1.1. PostgreSQL Index Advisor

2.1. Example Query

2.2. Running Index Advisor

2.3. Sample Output

2.4. Output Explanation

3. Using HypoPG to Test Hypothetical Indexes

3.1. Analyzing Query Performance without Hypothetical Indexes

3.2. Creating the First Hypothetical Index

3.3. Analyzing Query Performance with Hypothetical Index

3.4. Output Explanation

3.5. Creating our second Hypothetical Index and check the output

3.6. Output Explanation

3.7. Key Differences and Benefits

4. Conclusion

Useful links

Joan Frey

Leave a Reply:

Related blog articles

1. Installing Index Advisor and HypoPG Extensions

1.1. PostgreSQL Index Advisor

2. Using Index Advisor to Recommend Indexes

2.1. Example Query

2.2. Running Index Advisor

2.3. Sample Output

2.4. Output Explanation

3. Using HypoPG to Test Hypothetical Indexes

3.1. Analyzing Query Performance without Hypothetical Indexes

3.2. Creating the First Hypothetical Index

3.3. Analyzing Query Performance with Hypothetical Index

3.4. Output Explanation

3.5. Creating our second Hypothetical Index and check the output

3.6. Output Explanation

3.7. Key Differences and Benefits

4. Conclusion

Useful links

Joan Frey

Leave a Reply:

Related blog articles

Contained Availability Group and login failed

SQL Server: New SQL Server Management Studio (SSMS) landing page

Step-by-Step Guide to Enabling Copilot in SSMS

SQL Server 2025: Local SQL Server Container without Docker Command