Thursday, June 11. 2009
The Asynchronous Services Analogy
Today I had a chance to sit through a sneak preview of Theo Schlossnagle's new talk, Scalable Internet Architectures, to be delivered next week at Velocity 2009 (dev sessions are an underrated side benefit of working at OmniTI). As always, Theo packs a lot of good information into his talks; I could probably do blog entries on half a dozen ideas I jotted down, but I wanted to highlight something that he mentioned with regard to scaling websites via asynchronous services.
Saturday, May 9. 2009
IUCN Wildlife group needs development help
Passing this along for a good cause; hopefully some of you can help out.
The African Elephant Specialist Group (with the Asian Elephant Specialist Group) is working on the redesign of the AED to become a multi-species database, the first version of which is the African and Asian Elephant Database. With funding from USFWS we are hiring a developer to undertake the project. I've put a copy of the TOR and Functional Specification on my site, but for more information, please contact: Note that the deadline for applications is May 11th, 2009.
Wednesday, February 25. 2009
re: axonflux on building and scaling a startup
Yesterday the posterous guys put up a very good article on scaling Rails applications from the ground up (the ground up bit is key; we're focusing on what you need to do as a start-up, not once you are Facebook). It's a must read for any Rails shop, and a good read regardless of the application stack you work with. One of the comments on the post, from jonathanwallace, asked about stories for folks using a Postgres back-end. I started to work up a response for this, but before long decided it needed to live as a full post. I should mention that much of what was posted there applies just as well to Postgres (or even Oracle); mostly I just wanted to point out some differences for those trying to scale on a Postgres backend instead, so this blog post is essentially a "diff" to the original article. Read that first, then read this.
"You're not going to run full text search out of your DB." If you're using Postgres, this probably isn't true. Postgres has a built-in full text search implementation that runs very fast, gives multiple indexing options, and has all the flexibility (multiple languages, custom stop words, custom dictionaries) that you would expect from a solid full text search implementation. It might not beat a solution built on Sphinx/Lucene/Solr on straight performance numbers, but the ability to maintain your full text data in a transactional manner, all using one piece of technology rather than having to bolt on an additional application, makes it worth starting with the built-in stuff. For many people, this will be all you ever need.
"Storage engine matters, and you should probably use InnoDB" OK, this isn't exactly about Postgres, but the reasons he gives for InnoDB (crash resistance, non-locking) also apply to Postgres. (And yes, there's a certain irony that all the reasons why you want to switch to InnoDB are the same reasons you should have just picked Postgres in the first place.) This ability to handle higher concurrency rates is actually even more important with Postgres because...
"if you can start with some replication in place, do it. You'll want at least one slave for backups anyway." You don't need a slave for backups (pg_dump will give you online backups just fine, thank you), but you should create a PITR slave for failover. This won't allow you to scale reads or writes (you'll need something like Slony for that), but since Postgres is going to scale way up, you don't need to worry about that for now; just keep buying bigger hardware.
"Fix your DB bottlenecks" No matter what you code in, having tools like the ones laid out for Ruby/Rails can be very helpful. If you use any kind of ORM, you need to watch for excessive queries and dumb queries. PostgreSQL provides a slow query logging option, so that's one place to start.
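To make the slow query log point a little more concrete, here is a rough sketch (in Python, and nowhere near what pgfouine or PGSI give you) of rolling up slow query log lines by query shape. It assumes log_min_duration_statement is turned on, so entries contain "duration: ... ms  statement: ..."; the timestamp prefix in the sample depends on your log_line_prefix, and the function names and sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the "duration: <ms> ms  statement: <sql>" part of a slow query log line.
DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+statement: (?P<sql>.*)"
)


def normalize(sql):
    """Replace literals with '?' so similar queries group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def summarize(lines):
    """Return (query, count, total_ms) tuples, largest total time first."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a slow query entry (checkpoints, connections, ...)
        entry = totals[normalize(match.group("sql"))]
        entry[0] += 1
        entry[1] += float(match.group("ms"))
    rows = [(query, count, ms) for query, (count, ms) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical log lines, for illustration only.
sample_log = [
    "2009-02-25 10:00:01 EST LOG:  duration: 1520.250 ms  statement: SELECT * FROM posts WHERE user_id = 42",
    "2009-02-25 10:00:02 EST LOG:  duration: 980.500 ms  statement: SELECT * FROM posts WHERE user_id = 7",
    "2009-02-25 10:00:03 EST LOG:  duration: 310.000 ms  statement: UPDATE users SET name = 'bob' WHERE id = 3",
    "2009-02-25 10:00:04 EST LOG:  checkpoint starting: time",
]

for query, count, total_ms in summarize(sample_log):
    print(f"{total_ms:10.2f} ms  {count:3d}x  {query}")
```

Sorting by total time rather than single worst time is deliberate: the "dumb query" an ORM fires a thousand times usually hurts more than the one ugly report.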
You can also do profiling of queries using pgfouine or PGSI. Oh, and Postgres's EXPLAIN tool kicks the MySQL EXPLAIN's butt. Learn it. Use it. And yes, you should start here before you move on to memcached and other types of query caching, because in a lot of cases you may not ever need to go there (honestly memcached use isn't very widespread in the Postgres world, and a lot of the reason is it's not needed. fwiw, it's even less so in the Oracle world.) Oh, and it's not (just) that those databases are full of more magic data grabbing juice; it's that you'll also likely want to implement materialized views inside the database before you start bolting on external solutions like memcached. Note that this doesn't apply to static content caching (images/js libraries/etc...); you'll probably want to move into caching that stuff much sooner.
"Offline Job Queues... I don't know why people don't talk about this more, because if you run a site that basically does anything, you need something like this." In case you're not sure it needs to be part of Rails: Postgres has a notification service built in (LISTEN/NOTIFY), and there are also plenty of external queuing systems available. The interesting thing is that it probably doesn't matter which technology you pick; the pain points will be obvious (resize an image 5 different ways and cache them on 3 different servers upon web form submission?) and moving those out of the critical path will be an obvious win.
"If you don't monitor it, it will probably go down, and you will never know." Again a universal truth. In the Postgres world, check_postgres is hands down the best off the shelf monitoring you can use. You should also think about trending; in MySQL land, Cacti with Baron's templates is very solid, but there isn't a clear cut winner on the Postgres side.
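For anyone wondering what trending actually involves: most of the interesting Postgres statistics (xact_commit in pg_stat_database, for example) are cumulative counters, so before you can graph them you have to turn successive samples into rates. A minimal sketch in Python, assuming you have already collected (unix_timestamp, counter_value) pairs somehow; the function name and sample shape are made up for illustration, and tools like Cacti handle this step for you.

```python
def counter_rates(samples):
    """Turn cumulative (timestamp, value) samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or v1 < v0:
            # Skip out-of-order samples and counter resets
            # (for example after the statistics were reset).
            continue
        rates.append((t1, (v1 - v0) / elapsed))
    return rates


# Hypothetical commit counter sampled once a minute, with a reset in the middle.
samples = [(0, 1000), (60, 1600), (120, 1300), (180, 1900)]
for timestamp, per_second in counter_rates(samples):
    print(timestamp, per_second)
```

Dropping the interval around a reset, rather than reporting a huge negative spike, is the usual choice for this kind of graph.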
If you hired us for operations support and don't have an existing data trending solution, we'd likely use Cacti or Noit, although options like Staplr and I suppose even MRTG are OK too; point here being you have options, use them. The rest of the post is mostly Rails focused, and I'll leave it at that, but hopefully this gives you a better picture of how things will scale if you're doing Rails/Postgres. The picture is mostly the same, just the walls are moved further away.
Wednesday, January 7. 2009
Blog searching should be fixed now
Another quick blog update... I think search should now be fixed as well. After the import, I noticed all searching was broken, and also that many of the links when scrolling through the history would not work. This turned out to be two issues. The first was that my recent re-import of data uploaded this post, which contained some invalid UTF-8 data. This meant that any time the data was searched on, it would generate an invalid UTF-8 error, breaking the search. I'm unsure if I would have had this problem on 8.3 (we're currently running this on 8.2), but luckily I still have some code from a recent project where I had to ferret out invalid UTF-8 data, so that was relatively easily found and fixed, which fixed the search as a whole.
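This is not the code I used, but for the curious, the general idea of ferreting out invalid UTF-8 can be sketched in a few lines of Python. It assumes you can get at the raw bytes of each entry (a dump file, for instance); the function name is made up for illustration.

```python
def find_invalid_utf8(data):
    """Yield (offset, bad_bytes) for each invalid UTF-8 sequence in data."""
    position = 0
    while position < len(data):
        try:
            data[position:].decode("utf-8")
            return  # the rest of the data is clean
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice being decoded.
            start = position + err.start
            end = position + err.end
            yield start, data[start:end]
            position = end


# Hypothetical entry body with two stray bytes in it.
body = b"ok \xff mid \xfe end"
for offset, bad in find_invalid_utf8(body):
    print(offset, bad)
```

Re-decoding the remainder after every error is wasteful on badly damaged data, but for the handful of bad bytes you typically find in a blog import it is plenty fast.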
The next thing that I needed to fix was the broken permalinks. As it turns out, Serendipity has a table to store permalinks based on the configuration of your blog. As I hadn't dealt with that on the re-import, it was broken, so I simply needed to re-populate that table for the old (or is that new) entries. Many thanks to the guys on irc (particularly lluad) for helping me work through some regex issues. So that should make it much easier to find and reference any of the old entries. As always, should you find any broken permalinks, please do drop me an email.
Friday, January 2. 2009
Most blog postings now recovered
Over the holidays I spent some time pushing forward with recovering my old blog posts and getting them loaded into the new blog. While the posts are now visible, not everything is 100%, so I thought a quick run-down of what works and what doesn't might be appropriate.
Big thanks go to Magnus for the initial re-import script (even if it was in Python). While I hacked it to do recovery in a different way, he worked out most of the hard parts, which made things actually possible for me (although I am now much more familiar with Python regular expressions than I ever imagined I'd be).
Monday, October 8. 2007
PostgreSQL 8.3 beta has dropped
After many long months in feature freeze, PostgreSQL has finally released a beta for 8.3 (http://www.postgresql.org/about/news.872). This release contains a large number of new features and some significant performance improvements as well. I'm hoping to see a large amount of testing with 3rd party languages (PHP, Ruby, Python, etc...) as well as software packages to help flush out bugs/issues while things are still in beta. If you contribute to such tools, please help out and take a stab at testing, and if you need a place to publish your results, feel free to contact me and I'll do what I can to help.
Friday, July 27. 2007
oscon 2007 thursday wrapup
the wifi at the hotel went into the toilet yesterday, so I am again doing my wrap-up during the morning keynote. generally i don't mind this, but i've found it makes for more fuzzy memory (example: that place i went for lunch is called downtown pizza, not the boiler room). hopefully you'll gloss over these discrepancies (this works for perez, hopefully it will for me). anyway, here's the wrapup from thursday...
last year kathy sierra did a keynote that i thought was awesome, this year they had another one focused on branding and open source. i always think people need to pay more attention to the aspect of marketing projects, so having oreilly bring this topic in front of everybody is really great. i sum it up with the question, what does the word postgresql describe to you?
after the keynotes i went down and did booth duty for an hour or so. we had plenty of people to handle this so i alternated between working the postgresql booth and talking to a few other people who support postgresql in their endeavors, like apress and xtuple (where i finally got to meet Ned Lily IRL as the kids say). i managed to sneak out of the booth and into david fetter's pl[perl|ruby] talk, which was really a pl/perl talk due to the short notice he had to prepare, some issues around pl/ruby, and david's general inclination for perl. Unfortunately I think a number of ruby folks were disappointed by this, as a couple of people left during the talk, but I think most people were able to get some ideas on the flexibility that postgresql's pl support gives you.
following fetter's talk, i went to lunch with the emma and mailer mailer folks. it was pretty uneventful (no fights broke out) and the general consensus was the food was as good as the day before, but it was better than the sandwiches from earlier this week.
after lunch i went to the ingres "databases dont matter" talk, which was in the corporate shtick section of the conference. i had planned to try to stay incognito, but sitting in front with a postgresql shirt and fetter next to me turned out to be a poor way to accomplish that goal. i think it might have intimidated the speaker a little to have some postgres guys sitting in front of her, but she handled it well (luckily there was only a "we like to bash oracle" slide, and no "we like to bash postgres" slide!).
her central message was a little fuzzy, but she provided a nice analogy comparing databases to plumbing, noting they are both best when they are invisible, available, reliable, and scalable. for me this analogy actually sums up how much databases matter to folks: they really dont care about them (how many of you attend plumbing conventions?) even though they are critical to our daily lives. (for those that need further evidence, when asked how many people use mysql for data they dont care about, at least a half dozen people raised their hands). The one real disappointment of the talk was that i didnt manage to get a tshirt... i dont have any ingres tshirts so it would have been nice to add one... maybe next year.
after that i swung into a php graphing talk, which took a stab at explaining the different graphing solutions available in php. i think i would have liked to see more discussion on specific php extensions, but all in all i thought it was interesting stuff and i got a few things to think about (i've been doing a fair amount of pg stats graphing lately). following that was more mingling up until my pl/php talk, where i finally got to meet michal kimsal just by random chance. pl/php is a pretty niche topic and it was the last track of the day, so turn-out was a little low, but i think the talk went well, and i got some good questions afterwards so hopefully we'll see an expansion of that community with a few new members.
once my talk was done a number of us went up to burgerville (yikes!) for dinner, followed by a return to the convention center for the postgresql bof. since my name was on the bof page, i did a quick ramble and then quickly handed off to nasby and berkus who gave a nice presentation on upcoming things in 8.3. bonus points to this presentation for pulling jeff davies out of the audience to explain a feature he had worked on; bet you didn't see that coming eh jeff?
a nice thanks to apress and omniti for supplying some give-aways (books and shirts), which we distributed through some rousing rounds of roku. following that i hit up a few parties, did some mingling, and then had a surreal conversation with one of the zimbra guys that convinced me zimbra's real goal is not to provide an exchange replacement, but instead to build a global network for syphoning off stalkerazi pictures from the emails of users worldwide. sadly this was not the most uncomfortable conversation i was in this week... ah, go oscon!
This is the weblog of Robert Treat. I lead the Database Operations Group at OmniTI, where we work on some of today's largest database challenges.