<?xml version="1.0" encoding="utf-8"?><!-- generator="wordpress/2.3.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Relational Databases are Dead, Long Live Relational Databases</title>
	<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead</link>
	<description>"The biggest room in the world is the room for improvement."</description>
	<pubDate>Sat, 05 Jul 2008 09:58:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
		<item>
		<title>By: joe</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142968</link>
		<dc:creator>joe</dc:creator>
		<pubDate>Sun, 25 May 2008 17:39:31 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142968</guid>
		<description>If Google is intending to provide a data storage/access foundation for an emerging class of Web-based applications that offer software as a service, the model it uses must address ALL of the requirements of data management.
'Bigtable' has no inherent provision for storing normalized data. Comments I have seen treat data normalization as a
hangup for 'squares' from another generation. The comments suggest that we should just 'loosen up' and forget about data normalization.  This is silly, dangerous thinking. It is always wonderful when things can be simple but the reality is
that the world we model with data is rife with complex patterns of one-to-many relationships.  We cannot wish them away. Forty years of experience with data management has taught us that stores of data lose their integrity when redundant (unnormalized) data is present.  During the 1960s database management in its pre-normalized phase was in crisis.  Computer scientists identified redundantly recorded data as the culprit and created the theory and practice of normalizing data to remove update anomalies.</description>
		<content:encoded><![CDATA[<p>If Google is intending to provide a data storage/access foundation for an emerging class of Web-based applications that offer software as a service, the model it uses must address ALL of the requirements of data management.<br />
&#8216;Bigtable&#8217; has no inherent provision for storing normalized data. Comments I have seen treat data normalization as a<br />
hangup for &#8216;squares&#8217; from another generation. The comments suggest that we should just &#8216;loosen up&#8217; and forget about data normalization.  This is silly, dangerous thinking. It is always wonderful when things can be simple but the reality is<br />
that the world we model with data is rife with complex patterns of one-to-many relationships.  We cannot wish them away. Forty years of experience with data management has taught us that stores of data lose their integrity when redundant (unnormalized) data is present.  During the 1960s database management in its pre-normalized phase was in crisis.  Computer scientists identified redundantly recorded data as the culprit and created the theory and practice of normalizing data to remove update anomalies.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Tiseo</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142817</link>
		<dc:creator>Paul Tiseo</dc:creator>
		<pubDate>Fri, 25 Apr 2008 20:41:04 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142817</guid>
		<description>To add: BigTable is not a database system in the way people are used to thinking of it or experiencing it, meaning a broad set of services overlaying some subset of the Relational Model. 

It's a glorified map-like data structure, a.k.a. &lt;a href="http://en.wikipedia.org/wiki/Associative_array" rel="nofollow"&gt;associative array&lt;/a&gt;. From my never-worked-at-Google understanding, it's a key-value array, where there are basically two keys per value. (Actually, three keys, "row", "column" and time.) For all this talk, you could represent BigTable in any RDBMS simply by creating one table. Google has multiple BigTables for various of their out-facing applications like Search.

Even Google states that they built BigTable because their need for data consistency was very weak relative to their tremendous need for performance. This is a very fringe need and why RDBMSes will remain a preferable tool for things like ERPs (as Oren states) and any other complex, transactional web application.</description>
		<content:encoded><![CDATA[<p>To add: BigTable is not a database system in the way people are used to thinking of it or experiencing it, meaning a broad set of services overlaying some subset of the Relational Model. </p>
<p>It&#8217;s a glorified map-like data structure, a.k.a. <a href="http://en.wikipedia.org/wiki/Associative_array" rel="nofollow">associative array</a>. From my never-worked-at-Google understanding, it&#8217;s a key-value array, where there are basically two keys per value. (Actually, three keys, &#8220;row&#8221;, &#8220;column&#8221; and time.) For all this talk, you could represent BigTable in any RDBMS simply by creating one table. Google has multiple BigTables for various of their out-facing applications like Search.</p>
<p>Even Google states that they built BigTable because their need for data consistency was very weak relative to their tremendous need for performance. This is a very fringe need and why RDBMSes will remain a preferable tool for things like ERPs (as Oren states) and any other complex, transactional web application.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Tiseo</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142816</link>
		<dc:creator>Paul Tiseo</dc:creator>
		<pubDate>Fri, 25 Apr 2008 20:16:44 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142816</guid>
		<description>@tim: Object databases have been trying to get in from the fringe for about ten to twenty years now. That's not called "slow", that's called "dead". Your post reminds me of Monty Python's &lt;a href="http://www.youtube.com/watch?v=e6Lq771TVm4" rel="nofollow"&gt;"Dead Parrot"&lt;/a&gt; sketch.

"Look at that CacheDB. Bee-yu-ti-ful plumage, isn't it?" :)</description>
		<content:encoded><![CDATA[<p>@tim: Object databases have been trying to get in from the fringe for about ten to twenty years now. That&#8217;s not called &#8220;slow&#8221;, that&#8217;s called &#8220;dead&#8221;. Your post reminds me of Monty Python&#8217;s <a href="http://www.youtube.com/watch?v=e6Lq771TVm4" rel="nofollow">&#8220;Dead Parrot&#8221;</a> sketch.</p>
<p>&#8220;Look at that CacheDB. Bee-yu-ti-ful plumage, isn&#8217;t it?&#8221; <img src='http://www.zefhemel.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeffrey Gelens</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142717</link>
		<dc:creator>Jeffrey Gelens</dc:creator>
		<pubDate>Tue, 15 Apr 2008 00:11:05 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142717</guid>
		<description>What CouchDB is really missing now is a bulk GET; doing 10,000 HTTP GET requests is so slow.</description>
		<content:encoded><![CDATA[<p>What CouchDB is really missing now is a bulk GET; doing 10,000 HTTP GET requests is so slow.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tim</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142698</link>
		<dc:creator>tim</dc:creator>
		<pubDate>Sun, 13 Apr 2008 05:05:07 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142698</guid>
		<description>some guy: I've used object databases.  They're a dream to work with.  They're hard to get in the door in 2008, kind of like how GC was hard to get in the door in 1996 or HLLs were hard to get in the door in 1982.  Things that make programmers more productive all start out fringe, and only get picked up very slowly, unless they're very backwards-compatible like MS-DOS or C++.</description>
		<content:encoded><![CDATA[<p>some guy: I&#8217;ve used object databases.  They&#8217;re a dream to work with.  They&#8217;re hard to get in the door in 2008, kind of like how GC was hard to get in the door in 1996 or HLLs were hard to get in the door in 1982.  Things that make programmers more productive all start out fringe, and only get picked up very slowly, unless they&#8217;re very backwards-compatible like MS-DOS or C++.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Piyush</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142690</link>
		<dc:creator>Piyush</dc:creator>
		<pubDate>Fri, 11 Apr 2008 13:04:32 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142690</guid>
		<description>I have heard this comment so many times now!
It is true in some cases but not completely true for other cases. 
The first case is places where you just have to store data, fetch it given the id/handle of the record, and show it to the user with almost no data processing on it. 

The second case is things like log processing: places where you need to do a lot of data crunching. 
One guy at work implemented a search log processing system in an object database (the Lisp-based AllegroCache by Franz). We had some 2-3 million records in there and needed to pull out some key features by doing some data crunching and math. This thing used to take 15 minutes to process everything and push the data in, and another 10 to do the data crunching and spit out the features data. 
Just for kicks I implemented this in MySQL and Ruby. It still took 15 minutes to push the data into the MySQL database (after processing), but the features were being fetched in about 35 seconds!

Object DBs, AFAIK, are not well suited for the second type of thing.</description>
		<content:encoded><![CDATA[<p>I have heard this comment so many times now!<br />
It is true in some cases but not completely true for other cases.<br />
The first case is places where you just have to store data, fetch it given the id/handle of the record, and show it to the user with almost no data processing on it. </p>
<p>The second case is things like log processing: places where you need to do a lot of data crunching.<br />
One guy at work implemented a search log processing system in an object database (the Lisp-based AllegroCache by Franz). We had some 2-3 million records in there and needed to pull out some key features by doing some data crunching and math. This thing used to take 15 minutes to process everything and push the data in, and another 10 to do the data crunching and spit out the features data.<br />
Just for kicks I implemented this in MySQL and Ruby. It still took 15 minutes to push the data into the MySQL database (after processing), but the features were being fetched in about 35 seconds!</p>
<p>Object DBs, AFAIK, are not well suited for the second type of thing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Oren</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142689</link>
		<dc:creator>Oren</dc:creator>
		<pubDate>Fri, 11 Apr 2008 12:36:54 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142689</guid>
		<description>These databases don't claim to be a replacement for the general-purpose DBMS. They would obviously be a very bad choice for an ERP system.

But they do a surprisingly good job for the typical database-backed web site. They are designed with a different set of assumptions and requirements like extreme scalability and distributed operation.</description>
		<content:encoded><![CDATA[<p>These databases don&#8217;t claim to be a replacement for the general-purpose DBMS. They would obviously be a very bad choice for an ERP system.</p>
<p>But they do a surprisingly good job for the typical database-backed web site. They are designed with a different set of assumptions and requirements like extreme scalability and distributed operation.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Zef Hemel</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142686</link>
		<dc:creator>Zef Hemel</dc:creator>
		<pubDate>Fri, 11 Apr 2008 06:35:49 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142686</guid>
		<description>Thomas, CouchDB is working on some ACID transaction stuff I think. I think there's a discussion going on on those right now, haven't really been following it: http://mail-archives.apache.org/mod_mbox/incubator-couchdb-user/200804.mbox/%3C0433F5A7-24B9-4194-91CF-631282DCD975@gmail.com%3E</description>
		<content:encoded><![CDATA[<p>Thomas, CouchDB is working on some ACID transaction stuff I think. I think there&#8217;s a discussion going on on those right now, haven&#8217;t really been following it: <a href="http://mail-archives.apache.org/mod_mbox/incubator-couchdb-user/200804.mbox/%3C0433F5A7-24B9-4194-91CF-631282DCD975@gmail.com%3E" rel="nofollow">http://mail-archives.apache.org/mod_mbox/incubator-couchdb-user/200804.mbox/%3C0433F5A7-24B9-4194-91CF-631282DCD975@gmail.com%3E</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142685</link>
		<dc:creator>Thomas</dc:creator>
		<pubDate>Fri, 11 Apr 2008 06:18:06 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142685</guid>
		<description>So, okay, we redesign our little database and flatten everything. Now a blog post also contains an array of comments. Then we run into the concurrency problem which, with e.g. Slashdot, is going to be a very real problem.

How do we deal with it? Require the user to resubmit her comment just because the programmer was lazy? Unacceptable. So the blog engine needs to deal with this: wait a while, exponential backoff or something, then retry.

This mechanism could in some cases be offloaded to the DBMS, if it allows us to upload some code to it, that should be run as a single transaction. In the example, we would send the DBMS the code that fetches a blog entry, appends a comment and stores the entry back into the database. Only if all steps succeed is this transaction committed. Does CouchDB support something like this?</description>
		<content:encoded><![CDATA[<p>So, okay, we redesign our little database and flatten everything. Now a blog post also contains an array of comments. Then we run into the concurrency problem which, with e.g. Slashdot, is going to be a very real problem.</p>
<p>How do we deal with it? Require the user to resubmit her comment just because the programmer was lazy? Unacceptable. So the blog engine needs to deal with this: wait a while, exponential backoff or something, then retry.</p>
<p>This mechanism could in some cases be offloaded to the DBMS, if it allows us to upload some code to it, that should be run as a single transaction. In the example, we would send the DBMS the code that fetches a blog entry, appends a comment and stores the entry back into the database. Only if all steps succeed is this transaction committed. Does CouchDB support something like this?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nazz</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142684</link>
		<dc:creator>Nazz</dc:creator>
		<pubDate>Fri, 11 Apr 2008 03:21:59 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142684</guid>
		<description>I've used most of the big relational databases including Oracle, Sybase, Informix, DB2, Teradata, MySQL, SQL Server, Postgres &#38; even Adabas back in the day. Hell, you might as well throw in Access &#38; dBase just for laughs.

The funny thing is while I still use Oracle &#38; MySQL daily, the relational db is dying on the web. Hardly any ISP even offers Oracle, who somehow let MySQL just own the ISP space.

If Oracle's RAC is so kick ass, why hasn't Oracle built a 1,000 node RAC cluster and wiped out Google already?  Because it is no match. Probably 80% of all RAC clusters are just little rinky dink two node ones.

How can Google who knows jack shit about relational databases be the kings of search? Why aren't Oracle, IBM &#38; Microsoft the kings of search?

The answer is that relational databases are good for some things, but not for others.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve used most of the big relational databases including Oracle, Sybase, Informix, DB2, Teradata, MySQL, SQL Server, Postgres &amp; even Adabas back in the day. Hell, you might as well throw in Access &amp; dBase just for laughs.</p>
<p>The funny thing is while I still use Oracle &amp; MySQL daily, the relational db is dying on the web. Hardly any ISP even offers Oracle, who somehow let MySQL just own the ISP space.</p>
<p>If Oracle&#8217;s RAC is so kick ass, why hasn&#8217;t Oracle built a 1,000 node RAC cluster and wiped out Google already?  Because it is no match. Probably 80% of all RAC clusters are just little rinky dink two node ones.</p>
<p>How can Google who knows jack shit about relational databases be the kings of search? Why aren&#8217;t Oracle, IBM &amp; Microsoft the kings of search?</p>
<p>The answer is that relational databases are good for some things, but not for others.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142682</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Fri, 11 Apr 2008 02:10:19 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142682</guid>
		<description>some guy said: 

"MySQL, PG, Oracle, and SQL Server aren’t goin’ anywhere, homez."

I'm sayin.  MySQL seems to be doing a pretty bang up job FTW.</description>
		<content:encoded><![CDATA[<p>some guy said: </p>
<p>&#8220;MySQL, PG, Oracle, and SQL Server aren’t goin’ anywhere, homez.&#8221;</p>
<p>I&#8217;m sayin.  MySQL seems to be doing a pretty bang up job FTW.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Josh in California</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142681</link>
		<dc:creator>Josh in California</dc:creator>
		<pubDate>Fri, 11 Apr 2008 01:37:17 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142681</guid>
		<description>"This is totally true. MySQL fetches rows based on indexed keys really quickly. Doing a JOIN totally slows things down. Normalization = Bad"

Sounds like someone is joining on columns that aren't indexed. Could also be poor schema design.</description>
		<content:encoded><![CDATA[<p>&#8220;This is totally true. MySQL fetches rows based on indexed keys really quickly. Doing a JOIN totally slows things down. Normalization = Bad&#8221;</p>
<p>Sounds like someone is joining on columns that aren&#8217;t indexed. Could also be poor schema design.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Amusing Trousers</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142679</link>
		<dc:creator>Amusing Trousers</dc:creator>
		<pubDate>Fri, 11 Apr 2008 00:34:52 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142679</guid>
		<description>This is totally true. MySQL fetches rows based on indexed keys really quickly. Doing a JOIN totally slows things down. Normalization = Bad</description>
		<content:encoded><![CDATA[<p>This is totally true. MySQL fetches rows based on indexed keys really quickly. Doing a JOIN totally slows things down. Normalization = Bad</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: frank</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142678</link>
		<dc:creator>frank</dc:creator>
		<pubDate>Thu, 10 Apr 2008 23:59:01 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142678</guid>
		<description>Goodbye, data integrity!

Goodbye, consistency!</description>
		<content:encoded><![CDATA[<p>Goodbye, data integrity!</p>
<p>Goodbye, consistency!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: some guy</title>
		<link>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142677</link>
		<dc:creator>some guy</dc:creator>
		<pubDate>Thu, 10 Apr 2008 22:52:03 +0000</pubDate>
		<guid>http://www.zefhemel.com/archives/2008/04/10/rdbms-dead#comment-142677</guid>
		<description>Acting like you're the first to predict the demise of the relational database is so ignorant.

Have you ever heard of object databases? They were supposed to kill RDBMSs. No need for ORM Vietnams. No impedance mismatch. Where are OO DBs? Fuckin' nowhere.

MySQL, PG, Oracle, and SQL Server aren't goin' anywhere, homez.</description>
		<content:encoded><![CDATA[<p>Acting like you&#8217;re the first to predict the demise of the relational database is so ignorant.</p>
<p>Have you ever heard of object databases? They were supposed to kill RDBMSs. No need for ORM Vietnams. No impedance mismatch. Where are OO DBs? Fuckin&#8217; nowhere.</p>
<p>MySQL, PG, Oracle, and SQL Server aren&#8217;t goin&#8217; anywhere, homez.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
