Ahhh, Django: my favorite web framework. And CouchDB: my favorite new database technology. How can I pair these two awesomes together to make an awesome-er?
One of the features that I would like to add to this site when it's time for an upgrade is a lifestream. It seems like everyone is doing it these days (isn't this great logic!), so I probably should too. Originally this was going to be written in the standard Django way--write some models, fill it with data, and slice and dice that data to make it pretty.
After thinking about it, I decided not to go that route. Why? Well, let's go over it: There needs to be a Twitter model, that's for sure. I also want a Pownce model, and a Flickr model. Already this is becoming tedious! At this point we have two options: continue creating these individual models and fill them with data, or try to find the common bits and group them into Ubermodels of some sort, with some type of field to use as a discriminator. Ugh.
This is the perfect use case for a schemaless database, and CouchDB fits that bill perfectly. Plus its Python support is actually quite mature, and running it on a Mac is, quite literally, one click. So now that we've all agreed (we all agree, right?) that we want to use CouchDB with Django, how can we make it happen?
First let's set some database settings:
COUCHDB_HOST = 'http://localhost:5984/'
TWITTER_USERNAME = 'ericflo'
So far, so good. Now let's write some initialization code and put it in an application's __init__.py:
from couchdb import client
from django.conf import settings

class CouchDBImproperlyConfigured(Exception):
    pass

try:
    HOST = settings.COUCHDB_HOST
except AttributeError:
    raise CouchDBImproperlyConfigured("Please ensure that COUCHDB_HOST is " + \
        "set in your settings file.")

DATABASE_NAME = getattr(settings, 'COUCHDB_DATABASE_NAME', 'couch_lifestream')
COUCHDB_DESIGN_DOCNAME = getattr(settings, 'COUCHDB_DESIGN_DOCNAME',
    'couch_lifestream-design')

# Reuse the server and database handles if they were already set up.
if not hasattr(settings, 'couchdb_server'):
    settings.couchdb_server = client.Server(HOST)
server = settings.couchdb_server

if not hasattr(settings, 'couchdb_db'):
    try:
        settings.couchdb_db = server.create(DATABASE_NAME)
    except client.ResourceConflict:
        # The database already exists, so just open it.
        settings.couchdb_db = server[DATABASE_NAME]
db = settings.couchdb_db
In this code, we're loading the CouchDB client and either creating or connecting to a database. We do a bit of error checking to ensure that if we forgot to add COUCHDB_HOST in our settings file, it will yell at us. So how do we use this? Let's write some data importing stuff!
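If you want to poke at the create-or-connect step without a running CouchDB server, here is a minimal sketch of just the control flow. FakeServer and this ResourceConflict are stand-ins I made up for illustration (they are not part of the couchdb library), and get_or_create_database is a hypothetical helper name:

```python
class ResourceConflict(Exception):
    """Stand-in for the error raised when a database already exists."""


class FakeServer(object):
    """Hypothetical in-memory substitute for a CouchDB server connection."""

    def __init__(self):
        self._databases = {}

    def create(self, name):
        # Creating a database that already exists is a conflict.
        if name in self._databases:
            raise ResourceConflict(name)
        self._databases[name] = {}
        return self._databases[name]

    def __getitem__(self, name):
        return self._databases[name]


def get_or_create_database(server, name):
    """Create the named database, or open it if it already exists."""
    try:
        return server.create(name)
    except ResourceConflict:
        return server[name]


server = FakeServer()
first = get_or_create_database(server, 'couch_lifestream')
second = get_or_create_database(server, 'couch_lifestream')
```

The first call takes the create path and the second falls through to the lookup, so both names end up pointing at the same database object.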
from urllib2 import urlopen

from django.conf import settings
from couch_lifestream import db

try:
    import simplejson as json
except ImportError:
    import json

TWITTER_USERNAME = getattr(settings, 'TWITTER_USERNAME', None)

fetched = urlopen('http://twitter.com/statuses/user_timeline.json?id=%s' % (
    TWITTER_USERNAME,)).read()
data = json.loads(fetched)

# Temporary view keyed by each document's Twitter id.
map_fun = 'function(doc) { emit(doc.id, null); }'
for item in data:
    item['item_type'] = 'twitter'
    # Only store items that are not in the database yet.
    if len(db.query(map_fun, key=item['id'])) == 0:
        db.create(item)
This can go inside a Django management command or in a standalone script. Essentially what we're doing is loading the timeline for a user, and then for each item in that response we're setting the item_type to 'twitter'. Then we're checking to see if an item with that current twitter id already exists, and if not, we're creating it.
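One possible variation on that check-then-create loop, sketched here rather than taken from the project: build the document ID from the item type plus the service's own ID, so that importing the same item twice lands on the same document. The store dict stands in for the database, and import_items and the ID format are hypothetical names chosen for illustration:

```python
def import_items(store, items, item_type):
    """Add unseen items to `store` and return how many were created."""
    created = 0
    for item in items:
        # Deterministic ID: the same source item always maps to the same key.
        doc_id = '%s-%s' % (item_type, item['id'])
        if doc_id in store:
            continue  # already imported on an earlier run
        store[doc_id] = dict(item, item_type=item_type)
        created += 1
    return created


store = {}
timeline = [{'id': 1, 'text': 'first'}, {'id': 2, 'text': 'second'}]
first_run = import_items(store, timeline, 'twitter')
second_run = import_items(store, timeline + [{'id': 3, 'text': 'third'}], 'twitter')
```

Against a real database the same idea would presumably turn the existence check into a lookup by ID instead of a query per item, but I haven't measured whether that matters at this size.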
Now we need a way to query this data. In CouchDB, the way to query for data is using views. Views are stored in the database, so they can be entered manually, but I much prefer to manage views programmatically. Thankfully, Python's CouchDB library and Django give us all we need to make this very, very easy:
from django.db.models import signals
from couch_lifestream import models, db, COUCHDB_DESIGN_DOCNAME
from couchdb.design import ViewDefinition
from textwrap import dedent

by_date = ViewDefinition(COUCHDB_DESIGN_DOCNAME, 'by_date',
    dedent("""
    function(doc) {
        emit(doc.couch_lifestream_date, null);
    }
    """))

def create_couchdb_views(app, created_models, verbosity, **kwargs):
    ViewDefinition.sync_many(db, [by_date])

signals.post_syncdb.connect(create_couchdb_views, sender=models)
Make sure that this is placed somewhere that will be loaded when Django's manage.py is called. In this case, I put it in the __init__.py file under management/. What we're doing is creating a view keyed simply by date; a second view keyed by item_type (we set this earlier to be 'twitter') can be defined the same way and added to the list passed to sync_many. When we run python manage.py syncdb, these views will automatically be re-synced with the database. Using this method, we are able to manage these views quickly and easily, and distribute them in a reusable way.
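To make the keying a bit more concrete, here is a rough pure-Python imitation of what a map function like by_date produces: one row per emit call, ordered by key. It's a sketch of the idea, not how CouchDB actually builds its indexes. emulate_view and the sample documents are made up, and it assumes every document carries a sortable couch_lifestream_date string, which the Twitter import snippet shown earlier does not set:

```python
def emulate_view(docs, map_fun, descending=False):
    """Apply map_fun to each document and return the rows sorted by key."""
    rows = []
    for doc_id, doc in docs.items():
        for key, value in map_fun(doc):
            rows.append({'id': doc_id, 'key': key, 'value': value})
    rows.sort(key=lambda row: row['key'], reverse=descending)
    return rows


def by_date(doc):
    """Python counterpart of the JavaScript map function for by_date."""
    # Mirrors emit(doc.couch_lifestream_date, null).
    yield doc['couch_lifestream_date'], None


docs = {
    'a': {'item_type': 'twitter', 'couch_lifestream_date': '2008-11-09T12:00:00Z'},
    'b': {'item_type': 'pownce', 'couch_lifestream_date': '2008-11-10T06:00:00Z'},
    'c': {'item_type': 'twitter', 'couch_lifestream_date': '2008-11-08T23:30:00Z'},
}
newest_first = [row['id'] for row in emulate_view(docs, by_date, descending=True)]
```

ISO-style date strings are used here only because they sort correctly as plain strings; any consistently sortable key would do.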
Now let's create some Django views so that we can visualize this data:
from couch_lifestream import db, COUCHDB_DESIGN_DOCNAME
from django.shortcuts import render_to_response
from django.template import RequestContext
from django.http import Http404
from couchdb import client

def item(request, id):
    try:
        obj = db[id]
    except client.ResourceNotFound:
        raise Http404
    context = {
        'item': obj,
    }
    return render_to_response(
        'couch_lifestream/item.html',
        context,
        context_instance=RequestContext(request)
    )

def items(request):
    item_type_viewname = '%s/by_date' % (COUCHDB_DESIGN_DOCNAME,)
    lifestream_items = db.view(item_type_viewname, descending=True)
    context = {
        'items': list(lifestream_items),
    }
    return render_to_response(
        'couch_lifestream/list.html',
        context,
        context_instance=RequestContext(request)
    )
The item view is fairly self-explanatory. We query the db for the object of the specified id, and if it doesn't exist, we throw a 404. If it does exist, we throw it into the context and let the template render the page. The items view is slightly more interesting. In this case, we're using that CouchDB view that we created to query the database by date, and passing that list into the context.
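One thing worth keeping in mind: because by_date emits null as its value, each row that comes back carries only a document ID and a key, so a template that wants the tweet text still needs the document itself. A minimal sketch of resolving rows into documents follows. rows_to_documents is a hypothetical helper, and plain dicts stand in for both the view rows and the database:

```python
def rows_to_documents(db, rows):
    """Look up the full document for each view row, skipping missing ones."""
    documents = []
    for row in rows:
        doc = db.get(row['id'])
        if doc is not None:
            documents.append(doc)
    return documents


# Stand-in database: document ID -> document.
db = {
    'a': {'text': 'older tweet', 'item_type': 'twitter'},
    'b': {'text': 'newer tweet', 'item_type': 'twitter'},
}
# Stand-in view rows, already ordered newest first.
rows = [
    {'id': 'b', 'key': '2008-11-10T06:00:00Z', 'value': None},
    {'id': 'a', 'key': '2008-11-09T12:00:00Z', 'value': None},
    {'id': 'gone', 'key': '2008-11-08T00:00:00Z', 'value': None},
]
items = rows_to_documents(db, rows)
```

This costs one lookup per row, which is probably fine for a lifestream page but is a trade-off rather than a recommendation.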
Obviously there's a ton more that we could cover, but these basic building blocks that I've demonstrated are enough to get you started. After this it's mostly all presentational work. I've open sourced all of the code that has been written so far for the upcoming lifestream portion of this site, even though right now it only supports Twitter and Pownce. I plan on continuing work on it to support all of the services that I use. You can track my progress at the project's page.
I'll make sure to blog about this again once the project is more mature, but for now it should be fun to play around with. Are you using CouchDB with Django? If yes, then how are you dealing with that interaction?
By Cai at 6 a.m. on Nov. 10, 2008
Thank you for that.
One question, when you create the views like this are they treated by CouchDB as permanent or temporary views? I guess permanent as they'll be stored in the database once syncdb is called, but wanted to make sure.
I'm just getting started with CouchDB.
Thanks,
By Eric Florenzano at 6:03 a.m. on Nov. 10, 2008
Views that are created and synced using sync_many, as was done with 'by_date', are permanent views. However, if you noticed, I did a db.query() call when fetching from Twitter. That is a temporary view.
By Cai at 6:39 a.m. on Nov. 10, 2008
I did notice, but I guess this wouldn't really affect performance at scale as it's only called when fetching tweets.
Thank you.
By Marty Alchin at 9 a.m. on Nov. 10, 2008
I haven't looked into CouchDB at all yet, so I don't really know whether it's the right tool for me. I think I'd like the idea of a schemaless database, but every time I see a function declaration (in this case, a CouchDB view) written inside a string, I get an itch to write a declarative layer on top of it, anyway. I guess I'm still a fan of schemas, just for their explicit nature. Maybe there's some middle ground where people like me can define schemas at the Python level for our own sanity, while letting CouchDB do its thing.
Also, while I can appreciate your dabbling with Django+CouchDB, is there a reason you didn't just use jellyroll? Not hating here, just honestly curious. I'm considering a personal lifestream (that is, one for my own use, not to share with everybody) and I had been planning to use jellyroll, but I'd be curious to know if there are particular benefits of CouchDB I wouldn't be able to get otherwise (schema-writing aside, obviously).
By Eric Florenzano at 9:03 p.m. on Nov. 10, 2008
I'm certain that there's room for someone to write a declarative layer around CouchDB views. In fact CouchDB views are quite versatile--they can even be written in Python. But since this declarative layer doesn't yet exist, this method isn't all that bad.
The reason why I didn't just use jellyroll is because I have a bad case of NIH :) All joking aside though, I think that CouchDB probably does provide a better fit for exactly this type of scenario, and instead of working against it you're working with it. Plus, I just kind of want to do it.
By Horst Gutmann at 9:19 a.m. on Nov. 10, 2008
Nice :-) I'm currently doing something similar with some RDF-databases out there, but our implementation differs a little bit. Why do you use the settings module for your global connection object? Wouldn't an element within your application's `__init__.py` suffice?
By Eric Florenzano at 9:05 p.m. on Nov. 10, 2008
Honestly my choice was probably a bad one. I use settings because I was presupposing that other apps might want to use the same connection to CouchDB. On second thought it might make more sense to just have each app open its own connection to CouchDB since it's just HTTP. I dunno :)
By Jay at 11:04 a.m. on Nov. 10, 2008
Awesome!
Thank you for the post. I will try it out.
By Idan Gazit at 11:15 a.m. on Nov. 10, 2008
Hey Eric,
An alternative to a schemaless DB for this particular application is to let somebody else normalize the data. I tried my hand at creating a lifestream app ("djangregator"), and initially I did it using generic FKs and pluggable backends.
Eventually I realized that I would forever be writing backends and chasing bugs in same. I turned instead to FriendFeed, which does that work for you and provides a clean JSON feed, with a single date/time standard and sundry goodies like entry batching (e.g. you just uploaded 13 photos to flickr, you probably want to show them as a group). I rolled up a basic django app for that called consonance. It also absolves you of having to install/write API glue libraries for each service.
That being said -- the couchdb approach is really interesting, although I like the ORM too much to abandon it (personally). Thanks!
djangregator: http://github.com/idangazit/djangregator
consonance: http://github.com/idangazit/consonance
By Eric Florenzano at 9:06 p.m. on Nov. 10, 2008
You're right, it would be easier to do FriendFeed, but easier doesn't mean more fun! Honestly you're probably right, if I were pressed for time I'd do the FriendFeed route, but since I have as much time as I want, I decided to go the CouchDB route.
By ian at 3:41 p.m. on Nov. 10, 2008
so.. are you trying to integrate the model API at all, or just disregarding that when you interact with couch?
By Eric Florenzano at 9:07 p.m. on Nov. 10, 2008
Just disregarding it. Integrating that is a project in and of itself.
By Carl at 7:17 p.m. on Nov. 10, 2008
Good article, but
    raise CouchDBImproperlyConfigured("Please ensure that COUCHDB_HOST is " + \
        "set in your settings file.")
can be changed to
    raise CouchDBImproperlyConfigured("Please ensure that COUCHDB_HOST is "
        "set in your settings file.")
and it will still work and actually run (imperceptibly) faster. The main reason to change though is that ending the line with \ is delicate and breaks if you accidentally add a space, but since you didn't close your parens, the line isn't over anyway, so you don't need to use it. Plus if you do "spam" "eggs" in Python, the interpreter auto-jams them together into "spameggs".
By Eric Florenzano at 9:08 p.m. on Nov. 10, 2008
Honestly this is one of the bits about Python that I really don't like. Implicit string concatenation, ugh. Personal preference, of course :)
You do bring up a good point though that my way is probably more fragile, but at the same time I'd rather have that than the implicit string concatenation.
By Bob Ippolito at 5:53 a.m. on Nov. 12, 2008
Actually the only dumb part is the line continuation character. You shouldn't really ever use those in Python, we have parentheses that do a better job.
The Python compiler does a perfectly good job folding constant strings that are added together (2.5+ anyway).
>>> def x():
...     print 'foo' + 'bar'
...
>>> import dis
>>> dis.dis(x)
  2           0 LOAD_CONST               3 ('foobar')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE
By ilya at 1:15 a.m. on Nov. 11, 2008
Nice article!
With a bunch of shortcuts like `get_object_or_404`, coding on top of CouchDB in Django will be pretty amazing!
By Toke at 10:37 a.m. on Nov. 20, 2008
Nice Article!
I have written a very basic paginator for Couchdb ViewResults similar to the one in django.
It can be found at:
http://www.djangosnippets.org/snippets/1208/
By Harvey at 10:12 a.m. on March 3, 2009
what an excellent article indeed! i'm just getting into couchdb myself, and working on a hybrid couchdb + django solution (i still like the geo capabilities of django, but want the super awesome indexing of couch for feature data). surprisingly, it's taking a lot less time than i thought! :)
By aparna at 10:43 a.m. on June 9, 2009
plz cn u suggest in a simple way how to retrieve views of couchdb using javascript pages, that means how to call views from html pages. ur views are very complicated. some simple views woul hlp.