Friday, August 26. 2011
I'm sitting in SFO tonight, awaiting my return trip back to Hurricane Pending Maryland. (As a former Floridian, I must of course scoff at any notions that this hurricane is significant). Walking through the airport I noticed a large billboard about "Big Data and the Cloud". This is the kind of billboard you only see in Silicon Valley; I don't see signs like that in Portland or Ottawa, and certainly not when I had to change flights in Detroit this year.
Anyway, these two buzzwords aren't a local phenomenon, and are actually taking the tech world by storm. Big Data has become serious enough that there are multiple conferences now for folks interested in the topic. And cloud, well, perhaps harder to define, but more and more businesses are moving to the cloud every day. The problem here is that most of the traditional ideas on big data run entirely counter to the ideas that work well in the cloud.
Last spring I moderated a panel at PGEast in New York that focused on Postgres in the cloud. As someone who works on multi-terabyte systems, and someone who deals with cloud servers on at least a semi-regular basis, I tried to prod and poke my panelists into sharing their take on how they see Postgres's role in the cloud. Not too surprisingly, the idea behind "Big Data" on Postgres in the cloud was not a particularly popular one. The tools you need to do the job effectively with Postgres just aren't there. Not to say you can't try, but so far I haven't seen many wild successes.
Next month at Surge though, I'm going to be involved in another panel focusing on "Pushing Big Data To The Cloud". This time though I'm turning over moderating duties to Baron Schwartz, a long-time thought leader in the MySQL community. Joining me on the panel are several folks who all have a stake in the idea of Big Data in the cloud: John Hugg and Philip Wickline from VoltDB and Hadapt, respectively, two new database vendors built with scale-out in mind; Bryan Cantrill, VP of Engineering at Joyent, a cloud provider with their own strong opinions on dealing with data in the clouds; and Kate Matsudaira, someone who is currently managing those multi-TB databases, all in the cloud, over at SEOMoz. This should be a really good mix of people using different technology, with different biases against the problems involved. If you're looking to work on Big Data in The Cloud, I hope you'll join us; it should be a lot of fun.
Wednesday, August 17. 2011
In spite of all previous notions to the contrary, thanks to some last minute wrangling by the conference organizers, I will be making the trek out to Chicago this September for Postgres Open after all. I had been planning to sit out the event and just stay focused on Surge (which, I must say, looks even more kick ass than last year), but after looking at the schedule, and some persuading at OSCon, I'm very excited about what has been put together, and look forward to seeing many of my fellow Postgres community members once again.
Oh, and in case you were wondering, I'll be reprising my talk from this year's Velocity conference, "Managing Databases in a DevOps Environment". At Velocity, the talk was intended to highlight how people already familiar with DevOps should approach their database systems. I'm not sure how well "DevOps" is understood within the Postgres community, so I think I'll try to emphasize the differences between managing databases and traditional services, to hopefully give better expectations to DBAs whose organizations might be undergoing such a change. If you're going to be at Postgres Open and are interested in the topic, I'd love to hear your feedback on what aspects of this topic you're most interested in. (PS. I'll also be heading to the Velocity Summit next week in San Francisco; for those attending, I'd love to hear your thoughts on this topic as well).
Monday, August 15. 2011
I often run my ops like I take care of data: a bit overzealously. Case in point, when setting up a new database, I like to throw on a metric for database size, which gets turned into both a graph for trending and an alert on database size. Everyone is always on board with trending database size in a graph, but the alert is one people tend to question. This is not entirely without justification.
On a new database, with no data or activity, deciding when to alert is pretty fuzzy. When we set up a new client within our managed hosting service, I usually just toss up an arbitrary number, like 2GB or something. The idea isn't that a 2GB database is a problem; it's that when we cross 2GB, we should probably take a look at the trending graph and do a projection. Depending on how things look, we'll bump up the threshold on the alert to a new level, based on when we think we might want to look at things again. For example, in this graph we take a month-long sample, and then project it out for three months. We can then set a new threshold somewhere along that line.
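To make the projection step concrete, here is a minimal sketch in Python. It assumes a plain straight-line (least-squares) growth model over daily size samples; the function names and the sample numbers are made up for illustration, and this is not the graphing tool behind the graph mentioned above.

```python
from datetime import date, timedelta


def fit_linear_growth(samples):
    """Least-squares line through (date, size_in_bytes) samples.

    Returns (bytes_per_day, size_at_first_sample, first_sample_date).
    The straight-line model is an assumption; real growth may not be linear.
    """
    origin = samples[0][0]
    xs = [(day - origin).days for day, _ in samples]
    ys = [size for _, size in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need samples from at least two different days")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept, origin


def project_size(samples, days_ahead):
    """Projected size in bytes, days_ahead days after the last sample."""
    slope, intercept, origin = fit_linear_growth(samples)
    last_offset = (samples[-1][0] - origin).days
    return slope * (last_offset + days_ahead) + intercept


# Hypothetical month of daily samples: 1 GB growing by 20 MB per day.
month = [
    (date(2011, 7, 1) + timedelta(days=i), 1_000_000_000 + i * 20_000_000)
    for i in range(31)
]

# Candidate alert threshold: where the line sits three months out.
new_threshold = project_size(month, days_ahead=90)
```

Picking a shorter `days_ahead` gives a lower threshold and an earlier nudge to look at the graph again, which is the whole point of the alert.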
While this is good for capacity planning, there's more that can be gained from this process. The act of alerting forces us to pay attention. And if we get notices earlier than we expected, we go back in and re-evaluate the data patterns. Of course, sometimes people will question this. Getting a notice that your database has passed 4GB can seem pointless when you have 100+ GB of free space on your disks. And besides, isn't that what free space monitors are for?
Here is a graph of another client's database growth. Their data size is not particularly large (don't confuse scalability with size; it doesn't take a large database to have scalability issues), but what's important is that we kept getting notices that the size was growing, and when talking with the developers, no one thought it should be growing at nearly this rate. Eventually we were able to track down the problem to a purging job that had gone awry. Once that was fixed, the growth pattern leveled off completely (and the database size returned to the tiny amount that was expected!)
Monday, August 8. 2011
There has been a lot of chatter the past week about Apple replacing MySQL with Postgres in the new OSX Lion Server [ U.S. | England | New Zealand ]. Most of it seems to tie things back to Oracle's new stewardship over the MySQL project, a lot of that stemming from what I would say is FUD from the EnterpriseDB folks, regarding doom and gloom about the way Oracle might handle the project in the future. Not that the FUD is entirely unwarranted; while Oracle has done a pretty decent job with MySQL so far, looking at what Oracle has done to projects like OpenSolaris certainly would make one queasy. And yes, we've seen an uptick in people asking for help with Oracle/MySQL to Postgres migrations since the acquisition of Sun. That said, I have an alternative theory. Maybe they just like it better?
Continue reading "Maybe they just like it better?"
Tuesday, July 5. 2011
OK, I am just trying to set the record straight. People are still confused, thinking I might be going to PG West, but I'm not. I know where the confusion comes from; on the PG West website, there is a picture of me in the banner graphic, which makes people think I am going to PG West. This is not unreasonable, it's just untrue. For what it's worth, I did ask Joshua to remove my picture when people first started asking me if I was going, and he said he would, but that was well over a month ago. I do think he will take it down, but in the meantime, I figure I should at least put some effort into clarifying things myself. So, to be clear, I will not be going to PG West this year. Also, to be clear, it's not that I have anything against PG West per se. I've gone to multiple PG West cons in the past, and I suspect I'll probably go to more in the future. It's just that this year, I've got something better to go to. That something is Surge.
What is Surge?
Surge is the premiere conference on internet scalability. Now, I have to disclose, I am affiliated with the conference, but this conference really stands on its own merits, no question. Now in its second year, Surge packs an incredible lineup of people leading large scale operations on the net. Reading through the speakers list, I see companies like Yahoo, Wikia, Message Systems, Varnish, MyYearbook, Percona, Etsy. If you are trying to grow at scale, you can learn a lot from this crowd.
Yeah, but I'm a DBA
So most people going to PG West are probably DBAs, or at least work closely with Postgres, so it makes sense for them to go to PG West; I get that. But here is why you may not want to. The thing about Surge is that, while it isn't a database conference per se, a fair amount of the content does revolve around managing data. Let's face it, if you are running a website at scale, chances are you have to deal with large amounts of data. Whether it's massive data on disk, or dealing with massive throughput of data, or trying to figure out how to visualize all that data, Surge has it covered. And what I find most intriguing is that because Surge is not focused on any particular technology, you get to see both problems and solutions from different angles, which I think helps you learn even more. Of course, you don't have to take my word for it; scan the speakers list, check out the talk profiles, and see if there isn't something there that looks awesome.
See you in September
In any case, record set straight. You know where I'll be, I hope to see you then. Oh, and in case you need incentive, early bird pricing is still in effect until the end of July. Get on it!
Tuesday, May 10. 2011
A reminder that tomorrow, Wednesday (the 11th) is the BWPUG meeting for
May. This month Greg Smith will be stopping by to give a preview of
his upcoming PGCon Tutorial, "Postgres Performance Pitfalls".
PostgreSQL is a database system that can deliver excellent performance
for a wide variety of applications. But it's easy to run into an issue
that keeps you from seeing its full potential. There are a few basic
PostgreSQL configuration and use misunderstandings that cause most of
the early performance issues administrators and developers encounter.
When: May 11th, ~6:30PM.
Where: 7070 Samuel Morse Dr, Columbia, MD, 21042.
Host: OmniTI
As always we will have time for networking and we can do some more
open Q & A, and we'll likely hit one of the local restaurants after
the meet.
BWPUG Meetup Page
BWPUG Mailing List
Wednesday, April 13. 2011
A reminder that today, Wednesday (the 13th) is the BWPUG meeting for
April. This month we're going to forgo a formal speaker in favor of an "Open Mic Night". Got questions about Postgres? Need help on a problem? Recently done something awesome? Did you learn anything from PGEast? Come swap stories with fellow Meetup members and/or get help if you need it.
When: April 13th, ~6:30PM.
Where: 7070 Samuel Morse Dr, Columbia, MD, 21042.
Host: OmniTI
As always we will have time for networking and we can do some more
open Q & A, and we'll likely hit one of the local restaurants after
the meet.
Don't forget to check out our Meetup page, please feel free to sign up and/or RSVP.