Posted on Nov. 29, 2008 at 4:31 P.M.

Last week I wrote an article called "Why CouchDB Sucks", which many people correctly said should have been called "What CouchDB Sucks at Doing". Nearly everyone pointed out that it was not designed to do the things that I was mentioning in the article. This time around, I'd like to focus on some of the features of CouchDB that I think absolutely rock.

CouchDB is schema-free

One of the most annoying parts of dealing with a traditional SQL database is that you invariably need to change your schema. Usually this can be done with a few ALTER TABLE statements, but other times it requires scripts, careful use of transactions, and so on. In CouchDB, the solution is to just start using your new schema. No migration needed. If it's a significant change, then you might need to change your views slightly, but nothing as annoying as what would be needed with SQL.

The other advantage of having no schema is that some types of data just aren't well suited to having a strict schema enforced upon them. My CouchDB-based lifestreaming application is a perfect example: thanks to the inherent flexibility of CouchDB's schemaless design, all kinds of disparate information can be stored alongside each other, then sorted and aggregated. That said, nothing forces you to use its schema-free nature this way. You could, for example, manually enforce a schema for certain databases, if needed.
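
To make that concrete, here's a rough sketch in plain Python of what I mean. The document shapes, the REQUIRED_FIELDS table and the validate function are all made up for illustration; none of this is CouchDB API. It just shows differently shaped documents living side by side, being sorted together, and being checked against a schema that the application chooses to enforce by hand.

```python
# Hypothetical lifestream documents: they share nothing beyond "type" and "posted".
docs = [
    {"type": "tweet", "text": "CouchDB rocks", "posted": "2008-11-29T16:31:00"},
    {"type": "bookmark", "url": "http://couchdb.apache.org/", "posted": "2008-11-28T09:00:00"},
    {"type": "photo", "caption": "Sunset", "width": 800, "posted": "2008-11-30T08:15:00"},
]

# Optional, application-level schema: only the types listed here are checked.
REQUIRED_FIELDS = {
    "tweet": {"text", "posted"},
    "bookmark": {"url", "posted"},
}


def validate(doc):
    """Return the sorted list of missing required fields (empty if none)."""
    required = REQUIRED_FIELDS.get(doc.get("type"), set())
    return sorted(required - set(doc))


# Disparate documents still sort together on the one field they share.
# ISO 8601 timestamps sort correctly as plain strings.
timeline = sorted(docs, key=lambda d: d["posted"], reverse=True)
```

A document whose type isn't listed in REQUIRED_FIELDS passes through untouched, which is the schema-free default. Anything stricter is opt-in.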

CouchDB is RESTful HTTP

When is the last time you tried to install MySQL or PostgreSQL drivers for your web development platform of choice? If you're using apt-get it's not so bad, but for just about every other platform, it's a total pain to get these drivers up and running. With CouchDB, there's no need. It speaks HTTP. Want to create a new database? Send an HTTP PUT request. Want to retrieve a document from the database? Send an HTTP GET. Want to delete a database? Send an HTTP DELETE. As you can see, the API is quite straightforward and if a client library doesn't already exist for your language of choice (hint: it does), then it will take you only a few minutes to write one.
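
Here's roughly what that looks like using nothing but Python's standard library. Treat it as a sketch under a few assumptions: a CouchDB server on localhost at its default port (5984) that doesn't require authentication, and a database name ("lifestream") and document ID ("first-post") that I made up. The build_request and send helpers are my own, not part of any CouchDB client library.

```python
import json
import urllib.request

# Assumption: CouchDB is listening on its default port on this machine.
COUCH_URL = "http://localhost:5984"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for CouchDB."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        COUCH_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# "lifestream" and "first-post" are hypothetical names.
create_db = build_request("PUT", "/lifestream")
save_doc = build_request(
    "PUT", "/lifestream/first-post", {"type": "tweet", "text": "CouchDB rocks"}
)
fetch_doc = build_request("GET", "/lifestream/first-post")
delete_db = build_request("DELETE", "/lifestream")

# With a server running, each request can be sent in turn:
#   for req in (create_db, save_doc, fetch_doc, delete_db):
#       print(send(req))
```

Building the requests separately from sending them keeps the example safe to import without a server running. It also makes the point that a "client library" here is little more than a URL, a verb and some JSON.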

But the best part about this is that we already have so many amazing and well-tested tools to deal with HTTP. For example, let's say you want to store one database on one server and another database on another server? It's as simple as setting up nginx or perlbal or varnish as a reverse proxy and having each URL go to a different machine. The same thing goes for transparent caching, etc. Oh, and also, web browsers know how to speak HTTP, too. You could easily write whole web apps served only from CouchDB.
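
In practice you'd express this in your proxy's own configuration, but the routing decision itself is tiny. Here's a sketch of it in Python, with hypothetical host names and database names, assuming the database is always the first segment of the URL path:

```python
# Hypothetical mapping of database name -> the machine that stores it.
BACKENDS = {
    "lifestream": "http://couch-a.internal:5984",
    "blog": "http://couch-b.internal:5984",
}

# Assumption: anything we don't recognise goes to the first machine.
DEFAULT_BACKEND = "http://couch-a.internal:5984"


def backend_url(path):
    """Return the full upstream URL that a request path should be proxied to."""
    # The database name is the first path segment, e.g. "/blog/some-post" -> "blog".
    db_name = path.lstrip("/").split("/", 1)[0]
    return BACKENDS.get(db_name, DEFAULT_BACKEND) + path
```

Because every document lives at a URL under its database name, the proxy never needs to understand CouchDB at all. It only needs to look at the path.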

Map/Reduce

"Map/Reduce will kill every traditional data warehousing vendor in the market. Those who adapt to it as a design/deployment pattern will survive, the rest won't."

Sounds like someone from Google must have said this, or some Hadoop evangelist, or maybe someone who works on CouchDB. In fact, this comes from Brian Aker, a MySQL hacker who was Director of Architecture at MySQL AB and is now developing the open source fork of MySQL named Drizzle (also a very exciting project in its own right). He's right, too. Google was on to something in a big way when they unveiled their whitepaper on Map/Reduce. It's not the be-all end-all for processing and generating large data sets, but it certainly is a proven technology for that task.

Brian talks about massively multi-core machines, which seem inevitable these days, and says we will need to start writing logic that is massively parallelizable to take advantage of all those CPUs. Map/Reduce is one way to force ourselves to write logic that can be parallelized. It is a good choice for any new database system to adopt for this reason, and that's why it's great to see that CouchDB has adopted it. It's just one more reason why CouchDB rocks.
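
If you haven't seen the pattern before, here's a toy version in plain Python. To be clear, this is not CouchDB's view API (its views are normally written in JavaScript) and the function names are my own. It only illustrates why the pattern parallelizes well: the map step looks at one document at a time, and the reduce step only sees the values for a single key.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (key, value) pairs for ONE document, with no shared state."""
    yield doc["type"], 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    # Each map_fn call is independent, so this loop could be split across
    # CPUs or machines; it runs sequentially here to keep the sketch short.
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Hypothetical documents, counted by type.
docs = [{"type": "tweet"}, {"type": "bookmark"}, {"type": "tweet"}]
counts = map_reduce(docs, map_doc, reduce_counts)
```

Neither function can reach outside its own arguments, and that restriction is exactly what lets a system hand the work out to as many cores as it has.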

So much more

I could talk about how it can handle 2,500 concurrent requests in 10 MB of resident memory. I could talk about its pluggable view server backends, so that instead of writing views in JavaScript you can write them in Python or any other language (given the correct bindings). I could talk about CouchDBX, which makes installing it on the Mac, quite literally, one click. I could even talk about how it's written in Erlang, with an eye towards scalability. Or maybe about how its database store is append-only.

I could talk about any of those things, and more. It just comes down to this: CouchDB rocks. But don't take my word for it--try it out for yourself!

Andreas
at 6:44 p.m.
on Nov. 29, 2008

We should put native CouchDB support for models in Django on the 1.2 feature list.

at 8:31 p.m.
on Nov. 29, 2008

It seems that ZODB has been doing much of this for ten years with ACID support. Is it that much better due to extra value at the edges, or what? I've not used CouchDB; I think it'd be great if someone knowledgeable compared the two.

at 8:34 p.m.
on Nov. 29, 2008

I'd absolutely love to see a comparison of the two technologies as well. I've simply never used ZODB so I can't speak to it. I've heard FUD about it, but won't repeat what I've heard out of ignorance.

Dan sickles
at 1:30 a.m.
on Nov. 30, 2008

I believe that ZODB is Python-specific. One of the key points of CouchDB is that unlike most object databases, it's not tied to any language. If you can talk HTTP, you can talk to CouchDB.

at 7:51 a.m.
on Nov. 30, 2008

Yep, having an HTTP interface is useful for that.

I've started trying to compare the two technologies by speculatively creating (read: messing around with, I don't really know if I'll finish it) a REST HTTP interface to ZODB via http://svn.repoze.org/repoze.loveseat/trunk/ . So far all I have is the database creation API done, but it seems like a fairly straightforward job save for the replication stuff.

gareth
at 11:11 a.m.
on Nov. 30, 2008

It's too difficult to install CouchDB on Windows right now. It would be wonderful to have a 1-click installer for Windows.

at 11:49 a.m.
on Nov. 30, 2008

> I could talk about how it can handle 10,000 concurrent requests in 10mb

That was 2,500 concurrent requests :)

at 1:15 p.m.
on Nov. 30, 2008

Whoops, that was supposed to be 1000, as per jchris's article! But 2500 is even better. Correcting now.

Lars
at 12:09 p.m.
on Nov. 30, 2008

Oooh, you poor boy. You got scared because you can't have an opinion and stand by it when others criticize you? BE A MAN!

at 1:07 p.m.
on Nov. 30, 2008

Actually, it's a good traffic tactic. You get traffic from your first strident, possibly incorrect article, and you get more traffic from your "mea culpa" follow-up.

at 1:18 p.m.
on Nov. 30, 2008

The plan was always to do two articles. The first one would focus on things that it just wasn't designed to do, and the second would focus on why it rocks at what it does do.

I stand by everything that I said in that article alongside everything I say in this article.

at 11:24 p.m.
on Dec. 1, 2008

Oooh, you poor jerk. You got confused because you can't read very well, and hit "Submit Comment" before your brain caught up with your nuts. BE A MENSCH!

Joon
at 4:19 a.m.
on Dec. 19, 2008

Hello folks!

Is CouchDB appropriate for document management systems?

