Last week I ran across two different blog posts discussing removing duplicate
indexes from a database. Coincidentally, I have a nagging TODO item to clean up some indexes in the schema of one of the applications we have been developing. So, with inspiration in hand, here are some techniques for cleaning up your indexes.
Finding unused indexes
This is sort of my classic approach to index pruning: in a lot of cases, indexes are created based on an expected need, but the actual need turns out to be something different. This method should work against any recent version of Postgres, though it relies on having PostgreSQL statistics gathering turned on, and on your having run production-level traffic against the database for some time. The query is as follows:
SELECT indexrelid::regclass as index, relid::regclass as table
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0 and indisunique is false;
You’ll notice that we ignore primary key and unique indexes, because they are required to enforce uniqueness even if they are never scanned. I’d also note that “idx_scan = 0” is used because it is pretty black and white: either the index has been used or it hasn’t. If you’re concerned that you may have indexes that were used in the past but are no longer needed, you can either raise that threshold, or reset your PostgreSQL statistics and take a fresh look once you have collected some new data. In both cases, beware of the “little used but extremely necessary” index; for example, an index needed only to run a monthly report.
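If you go the reset route, a minimal session might look like the following. Note that pg_stat_reset() clears the counters for the entire current database (and typically requires superuser privileges), and the cutoff of 50 in the follow-up query is an arbitrary example threshold, not a recommendation:

```sql
-- Reset statistics counters for the current database, then let
-- representative traffic run before trusting idx_scan again.
SELECT pg_stat_reset();

-- Later, look for indexes that are still rarely scanned, using a
-- looser threshold than idx_scan = 0 (50 here is arbitrary):
SELECT indexrelid::regclass AS index, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan < 50;
```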
Finding duplicate indexes
This next method is actually a pretty good idea, especially during application development, when you don’t have the luxury of real queries running against your data. I’ve worked out some special instances of this query in the past, but I think this version will work well for general purposes. What this query does is look for true duplicate indexes, i.e. same table, same columns, same column order.
SELECT indrelid::regclass as table, indkey as column_numbers,
       array_agg(indexrelid::regclass) as indexes
FROM pg_index
GROUP BY indrelid, indkey
HAVING count(*) > 1;
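Once you’ve confirmed that one of a duplicate pair really is redundant, dropping it is straightforward. A sketch, using a hypothetical index name; the CONCURRENTLY option (available in PostgreSQL 9.2 and later) avoids taking a lock that would block writes to the table while the index is removed:

```sql
-- orders_customer_id_idx2 is a hypothetical duplicate index name.
DROP INDEX CONCURRENTLY orders_customer_id_idx2;
```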