Wednesday, July 22. 2009
Happy to see the first issue of the new Open Source Database Magazine has been released.
For those that aren't aware, Open Source Database Magazine is a re-incarnation of the old MySQL Magazine. The open source database ecosystem has grown a lot of the last year, with the rising popularity of newish systems like Drizzle and MariaDB, the continued growth of the PostgreSQL community, the revival of old concepts like CouchDB, and the really ground breaking stuff like HadoopDB.
So, check out the website, check out my article, and let the OSDBzine folks know what new stuff in Open Source Database you'd like to see more about.
Sunday, April 19. 2009
This week I'll be traveling to California to take part in the Percona Performance Conference, a two-day event run concurrently with the MySQL Users Conference, but centered around all aspects of internet based design, performance, and scalability. At OmniTI, we have helped many organizations scale their systems using Oracle, Postgres, and/or MySQL, both for high OLTP loads as well as large OLAP oriented systems. In my talk, "Scaling with Postgres" (Wednesday @ 11:55AM), I plan to explain the general path for scaling Postgres related systems, and compare how this lines up with other database systems.
Since I'll be in town for a few days, I've also signed up to do a Postgres BOF at the MySQL users conference (Wednesday @ 7:30PM). Currently the BOF has no topic, but there are a number of topics that I expect people might want to talk about, including ways to use Postgres and MySQL together, Porting applications and doing cross database development, and of course the straight-forward "why should I use Postgres?" If you are planning on coming to the BOF and have a specific topic or question you'd like answered, please post it in the comments, or send me an email.
Note I'll also be in town Thursday as well; I'm hoping to take in several of the talks during that time, but I've no official duties, so should be available for further questions and conversation during that time as well. Hope to see you there.
Friday, December 12. 2008
Well, it's been a long time (100 editions in fact!) since I have hosted a log buffer, but I thought what better way to break-in the new blog than by hosting the 127th edition of Log Buffer. Let's get started!
Continue reading "Log Buffer #127: a Carnival of the Vanities for DBAs"
Friday, December 5. 2008
Lorenzo Alberton has a nice post about understanding how different databases handle quoting identifiers and case sensitivity. There was one thing I thought worth adding to the discussion, which was to remind folks that along with the inconsistent behavior you get running MySQL on different operating systems, you need to be aware that it also uses a non-standard quoting mechanism; using ` rather than " for quoting identifiers. You can change this behavior to something more standard by using the SQL_MODE parameter, worth looking at if you're doing cross database development (or just future proofing your app). Of course, thinking about SQL_MODE, you also have to be aware of things like lowercase_table_names, which can also change the behavior that you might see (i'd recommend defining that setting as part of your database naming conventions). And of course, you can override a lot of these things at the session level, so you can't necessarily count on setting these in your mysql conf file (consider an application that overrides this behavior). Blech... sometimes TMTOWTDI is more trouble than it is worth.
Thursday, October 9. 2008
When I first saw the Google Chrome comic book, I really thought it was a great idea. Delivering tech content in a light-hearted manner has often been my style, and this just raised that to a higher level, with friednly comic style animations to walk you through it. It did make me wonder though, would a postgres book along similar lines be successful? I mean, we've been talking about how processed based architectures lead to more stability for years, but no one seemed to care about that untill Google talked about it.
But, it seems I'm already late to the game, based on this upcoming book (hat tip to Mark for the link):
I know, it sounds like a joke, but check out the description:
In The Manga Guide to Databases, Tico the fairy teaches the Princess how to simplify her data management. We follow along as they design a relational database, understand the entity-relationship model, perform basic database operations, and delve into more advanced topics. Once the Princess is familiar with transactions and basic SQL statements, she can keep her data timely and accurate for the entire kingdom. Finally, Tico explains ways to make the database more efficient and secure, and they discuss methods for concurrency and replication.
No Starch, if you need a tech/book review, mark me down, I can't wait to see this!
(And BTW, if any Japanese folks are familiar with Mana Takahashi's work, I'd love to hear your thoughts or get some links in the comments)
Thursday, June 26. 2008
Lukas' recent post on [http://pooteeweet.org/blog/1201 solving the prepared statement problem] got wondering, if open source databases implement something, how likely is it to become a defacto standard? The best example of this is the LIMIT clause, which both [http://dev.mysql.com/doc/refman/5.0/en/limit-optimization.html MySQL] and [http://www.postgresql.org/docs/current/interactive/queries-limit.html PostgreSQL] support, and that anyone who has used either database will tell you is a far superior tool to what it is available in other systems (don't get me wrong, [http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html rownum()] has it's place, but for what LIMIT does it's not as good). And while LIMIT is not standard SQL, can you imagine if the SQL Standard Committee were to adopt a syntax that conflicted with the Postgres/MySQL implementations?
Take this with a grain of salt, but really there are only 5 databases that matter anymore, Oracle, DB2, MS SQL, MySQL, and Postgres. Sure, there are many other successful database systems, and also some niche databases that are big players in their segment, but those 5 are the ones that set the tone. Even with Oracle far out pacing MySQL on dollars, the number of people who have joined the tech industry within the last 5 years who have hands on experience with MySQL high enough that it is not uncommon for people to look for their syntax in other database products (for better or worse). And remember, Postgres is [http://freshmeat.net/stats/#popularity even more popular] than MySQL in some places; one should expect the impact of open source databases to rise in this area faster than just by watching market share.
So, what kind of problems could be solved with new syntax? Honestly I am not really sure. One piece of syntax I always though SQL got wrong was the UPDATE statement. I've always thought that the syntax should have been UPDATE table WHERE conditions SET column=data, if only because it would have saved us from countless unconstrained updates from junior DBA's. Other examples might be to take something like MySQL's [http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html ON DUPLICATE KEY] syntax, clear up any sanity issues with implementation, and then implement it in Postgres. (Yes, this examples are questionable, since there are SQL Standard ways of doing this stuff, but the syntax of this command sure seems more user friendly).
Of course, this whole idea overlooks the notion that the SQL Standard committee has often been used as a tool between database vendors to make life tougher for their competitors, so it might be unlikely that the committee would ever adopt the syntax from the [http://planetosdb.org/ FLOSS community] if it meant giving them some type of advantage, but this is also why FLOSS databases could give hope, since as more user oriented projects, if they worked together you should really see syntax that makes everyones lives easier. Imagine Postgres, MySQL, Firebird, SQLite, and Ingres all working together to create their own, user driven syntaxes.
Wednesday, August 29. 2007
A couple weeks ago I hacked up some sql for this and had planned to blog about it but never got around to it. Of course then I needed to use the technique again yesterday, so it's time to make a more permanent record for next time.
The problem we're solving is the need to do group by on time intervals of less than 1 hour. This isn't exactly hard, but it's a little tricky, since you can't easily pull out random fractions of an hour. The way I solved it was working my way through a series of sql statements.
pagila=# select date_trunc('M',current_timestamp),
pagila-# extract('minute' from current_timestamp);
date_trunc | date_part
------------------------+-----------
2007-08-28 22:47:00-04 | 47
(1 row)
This first sql statement has two parts; first I grab the current time and use the date_trunc function to cut it to the current minute. This isn't needed for the final query, it is more for reference while I worked things out. The second part of the query I used the extract function to get just the minute portion of the time, in this case 47 minutes.
pagila=# select date_trunc('M',current_timestamp),
pagila-# extract('minute' from current_timestamp)/15;
date_trunc | ?column?
------------------------+-----------------
2007-08-28 22:47:00-04 | 3.13333333333333
(1 row)
Next, I divided the minutes by 15, to give me the quarter hour. Of course the decimal is a bit unwieldy, so...
pagila=# select date_trunc('M',current_timestamp),
pagila-# round(extract('minute' from current_timestamp)/15);
date_trunc | round
------------------------+-------
2007-08-28 22:47:00-04 | 3
(1 row)
I used the round function to get something a little nicer to work with. Now, this tells us our timestamp is in the third quarter hour, but if we're going to have something to group by on, we want something a little more real world.
pagila=# select date_trunc('M',current_timestamp),
pagila-# round(extract('minute' from current_timestamp)/15)*15;
date_trunc | ?column?
------------------------+----------
2007-08-28 22:47:00-04 | 45
(1 row)
So now we have our time, rounded to the quarter hour of 45 minutes, and the only thing left to do is turn it back into a normal timestamp.
pagila=# select date_trunc('H',current_timestamp) + (round(extract('minute' from current_timestamp)/15)*15) * '1 minute'::interval;
?column?
------------------------
2007-08-28 22:45:00-04
(1 row)
Note here I used date_trunc to get my time to set to the base time of the hour, and then I add back in the '45 minutes' that are generated through our series of rounding and extracting. Once you have this, using it in an aggregate is only a short step away.
pagila=# select date_trunc('H',rental_date) + (round(extract('minute' from rental_date)/15)*15) * '1 minute'::interval as time,
pagila-# count(*) from rental group by time order by time desc limit 10;
time | count
---------------------+-------
2005-08-23 22:45:00 | 3
2005-08-23 22:30:00 | 3
2005-08-23 22:15:00 | 9
2005-08-23 22:00:00 | 6
2005-08-23 21:45:00 | 7
2005-08-23 21:30:00 | 7
2005-08-23 21:15:00 | 5
2005-08-23 21:00:00 | 6
2005-08-23 20:45:00 | 7
2005-08-23 20:30:00 | 9
(10 rows)
Note you can adjust the interval (eg. make it every 5 minutes) by changing the number you divde and multiply by, so this technique should work for any sub-hour intervals, and would not be hard to adapt to other non-standard amounts (like every 4 hours). Also remember this was something I whipped out rather quickly, I've done no analysis on how well it scales, if you've worked out a better/cleaner query by all means, please post to the comments. HTH.
|
You were saying?
Tue, 09.03.2010 16:06
"The point of the relational m odel... is to slay the scalabi lity dragon by avoiding the by te bloat in the first pl [...]
Tue, 09.03.2010 16:00
In regards to Date's "dropping ACID", what little I know abo ut it (second hand) says that the proposal is to just [...]
Tue, 09.03.2010 15:34
Finally. The point of the rel ational model (and industrial strength SQL databases) is to slay the scalability dra [...]