Index Pruning Techniques

Last week I ran across two different blog posts discussing removing duplicate and useless indexes from a database. Coincidentally, I have a nagging TODO item to clean up some indexes in the schema on one of the applications we have been developing. So, with inspiration in hand, here are some techniques to clean up your indexes.

Finding unused indexes

This is sort of my classic approach to index pruning, as in a lot of cases indexes are created based on an expected need, but the actual need turns out to be something different. This method should work against any recent version of postgres, though it relies on having postgresql statistics gathering turned on, and that you have been running production level traffic against the database for some time. The query is as follows:
select indexrelid::regclass as index, relid::regclass as table from pg_stat_user_indexes JOIN pg_index USING (indexrelid) where idx_scan = 0 and indisunique is false;
You’ll notice that we ignore Primary Key and Unique indexes, because they are required to enforce uniqueness, even if they are not used. I’d also note that the “idx_scan = 0” is used because it is pretty black and white, either the index has been used or it hasn’t. If you’re concerned that you may have indexes that have been used in the past that are no longer valid, you can either up that threshold, or reset your postgresql statistics and then have a fresh look at things once you have collected some data. In both cases, beware of the “little used but extremely necessary” index; for example an index needed to run a monthly report.

Finding duplicate indexes

This next method is actually a pretty good idea, especially during application development, where you don’t have the luxury of real queries running against your data. I’ve worked out some special instances of this query in the past, but I think this version will work well for general purposes. What this query does is look for true duplicate indexes, Ie. same table, same columns, same column order.
select indrelid::regclass, array_accum(indexrelid::regclass) from pg_index group by indrelid, indkey having count(*) > 1;