Thursday, July 9. 2009Three Cheers for the search_path!!!Trackbacks
Trackback specific URI for this entry
No Trackbacks
Comments
Display comments as
(Linear | Threaded)
We lso have a solution following the "Cheer One" description.
So I fully agree - one of the best things in Pg - the search_path!
Peter made the point "programmatically modifying the search path in a correct way is pretty complicated" which was preemptively pointing out a problem with your "cheer two" example: what if the schema you're adding is already in the search_path? Also, what if the current search_path is completely empty at that point? There are a number of corner cases like that to worry about before you can build a really generic search_path extender example.
I think it depends on what your trying to build. If your building something to fit into a new problem set, it's not that hard to achieve with sql. If your trying to build an arbitrary search_path manipulation function, then yes, that is more difficult, but even that isn't all that hard.
Robert - as I guess cheer one is application I'm working on (best regards from PL) - the solution with search_path is just working for us and I can not imagine how the application would complicate without this feature.
There is also security concern - with search_path you're constrained to your organization data only, and even worst developer mistake will not allow you to see your competitor data (like missing enterprise_id in SELECT People love search_path, however there is no doubt it should be used carefully.
Hey, good to hear from you! I wasn't thinking of your app at the time, but it does indeed fall into the cheer one category, and your right, you have to be careful, but it does work really well.
We used a similar setup to "Cheer One" to handle segmenting data from multiple interconnected organizations (hotels within a corporate entity) that share some common information (accounts and contacts).
Before we realized how powerful schemas and search_path were, we were stuck making multiple database connections to handle looking up this interconnected data. As we expanded our functionality it became obvious that it would require n+1 database connections to access data from the full corporate entity unless we switched to using separate schemas with a shared public schema inside a single unified database. Better performance, easier maintainability, and less connection overhead? Yes, please! We never regretted it and have never looked back.
In Cheer One if you have a thousand organizations it requires a thousand schemas. One per organization. I'm having a hard time understanding why this is better than, "the old 'entity_id column' database pattern".
The scenario I keep thinking about is, what happens when the structure of the organization specific objects need to change. In the Cheer One scenario the change has to be executed N times. Which (seems to) become(s) prohibitive when N becomes large. It seems that there should be some sort of cutoff point where a designer can say, "I expect N to be greater than or equal to this rule of thumb number, therefore I'm going the 'entity_id column' route". Am I wrong about this or is the Cheer One way always the best route, period?
I have no doubts that there are useful applications of the schema path, just like there are useful applications for all the other paths that I mentioned. The problem is rather that (1) it is designed in a way that it is a problem that everyone has to deal with, not just those using it, and (2) it is pressed into service for purposes for which it is not ideal, because there is no better solution.
Your point one could probably be solved with an import functionality. Your point two is now well established as just a workaround for the lack of a proper packaging system. Your point three basically makes my point: There are no standards for how to apply the search path, and sometimes it works for a specific purpose and sometimes it doesn't. If Tom Lane had his hands on check_postgres, he would probably make sure everything is schema-qualified, because it's the proper thing to do, and then your admittedly cool idea wouldn't work. Just randomly. You couldn't use psql, for example, to examine that "alternative reality" that you are setting up, but perhaps phpPgAdmin would work? Just randomly. I'd actually support it if all tools worked the way check_postgres worked, that is, don't use any qualifications at all, because that would actually enable those interesting uses across the board. But right now you can't rely on any of this working consistently. A sudo-type functionality might actually solve that third point better anyway, right?
I'd like to point out that check_postgres was specifically designed to not use schema qualifiers, to allow for cases such as the one Robert uses (it also helps a lot in the unit tests, as we can override system tables and views to emulate production behavior). I don't think it's particularly dangerous at all - if you are creating a table named 'pg_stat_activity' for example, and go out of your way to have it appear in the search_path before pg_catalog, you get what you deserve. Also, I doubt even Tom Lane would go as far as schema qualifying the operators in the queries as well, which you should do if you are going to qualify things like the system tables.
Great post!. We use search_path since 7.4 and this solution is almost perfect. One of issue is PERFORM in plpgsql that cache results without care of search_path in postgresql 8.2 and 8.3. When you have few schemat and one function in public that works on same table in each schema you need to use EXECUTE in function, use VOLATILE as function attribute or create function in each schema.
|
QuicksearchThis is the weblog of Robert Treat (bio | writings). I lead the Database Operations Group at OmniTI, where we work on some of todays largest database challenges. Hire me! Need help with your database? We are available for large scale or short term engagements. Hire you! If you have experience with Postgres, MySQL, or Oracle, we are looking for people to join our team. Upcoming Events
OSCon 2010 July 19th - 23rd At Portland, Oregon Surge 2010 Sept 30th - Oct 1st At Baltimore, Maryland Recent MusingsYou were saying? about I hardly gnu, you? Fri, 23.07.2010 15:26 Yeah, I talked with the Veraci ty guys at OSCon, they are def initely on a good track (it al so includes integrated d [...] about I hardly gnu, you? Mon, 19.07.2010 06:22 A lot of specialists state tha t loan help a lot of people to live the way they want, becau se they can feel free to [...] about I hardly gnu, you? Sun, 18.07.2010 19:15 Veracity (http://www.ericsink. com/entries/veracity_early.htm l) is supposed to be released under Apache 2.0 License [...] Blog Administration |