Hosting a Django Site with Pure Python

Developing a site with Django is usually a breeze. You've set up your models, written some views and borrowed some generic ones, and you've even created some spiffy templates. Now it's time to publish that site for everyone to see. But if you're not already familiar with Apache, Lighttpd, or Nginx, you're stuck trying to figure out complicated configuration files and settings directives. "Why can't deployment be just as easy as running the development server?", you scream.

It's tempting to just attempt to use the development server in production. But then you read the documentation (you do read the documentation, right?) and it clearly says:

DO NOT USE THIS SERVER IN A PRODUCTION SETTING. It has not gone through security audits or performance tests. (And that’s how it’s gonna stay. We’re in the business of making Web frameworks, not Web servers, so improving this server to be able to handle a production environment is outside the scope of Django.)

Looks like it's time to fire up Apache, right? Wrong. At least, you don't have to.

CherryPy to the Rescue

One of the features that CherryPy touts quite highly is that it includes "A fast, HTTP/1.1-compliant, WSGI thread-pooled webserver". A lesser-known fact about that webserver is that it can be run completely independently of the rest of CherryPy: it's a standalone WSGI server.

So let's grab a copy of the CherryPy WSGI webserver:

wget http://svn.cherrypy.org/trunk/cherrypy/wsgiserver/__init__.py -O wsgiserver.py

Now that you've got a copy of the server, let's write a script to start it up. Your choices may vary depending on how many threads you want to run, etc.

import os

import django.core.handlers.wsgi
# Use "from cherrypy import wsgiserver" instead if you're not running it standalone.
import wsgiserver

if __name__ == "__main__":
    # Point Django at the project's settings module before building the handler.
    os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
    server = wsgiserver.CherryPyWSGIServer(
        ('0.0.0.0', 8000),  # listen on all interfaces, port 8000
        django.core.handlers.wsgi.WSGIHandler(),
        server_name='www.django.example',
        numthreads=20,
    )
    try:
        server.start()
    except KeyboardInterrupt:
        # Ctrl-C: shut the worker threads down cleanly.
        server.stop()
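
Since the server only needs a WSGI callable, it can help to see what one looks like without Django in the picture. Here's a minimal sketch using only the standard library; hello_app and call_app are names made up for illustration, and the app is invoked in-process rather than through CherryPy.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A tiny WSGI application: a callable taking environ and start_response."""
    body = ("Hello from %s" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, body) for inspection."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys a real server would provide
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

In principle, a callable like hello_app could be passed to the server in place of Django's WSGIHandler() as a quick smoke test, though I'd treat that as an assumption to verify against your copy of wsgiserver.py.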

Consequences

Now that you've got the server up and running, let's talk about some consequences of this approach.

  1. This is a multithreaded server. Django is not guaranteed to be completely thread safe. Many people seem to be running it fine in multithreaded environments, but thread safety may break at any time without notice. It might be an interesting project to convert cherrypy.wsgiserver to use processing instead of threading and see how the performance changes.
  2. This server is written in Python, and as with any other Python program, it will be difficult for it to match the speed of pure C. For exactly this reason, mod_wsgi is probably always going to be faster than this solution.
  3. You can have a completely self-contained server environment that can be run on Mac, Windows, and Linux with only Python and a few Python libraries installed. Distributing this wsgiserver.py script along with your Django project (or with a Django app, even) could be a great way of keeping the entire program self-contained.

Conclusion

Would I use this instead of a fully-featured web server like Apache or Nginx? Probably not. I would, however, use it for an intranet which demands more performance and security than the built-in development server. In any case, it's a nice nugget of information to have in your deployment toolbox.

Syntax Highlighting

Over the past week, I've had several people write me asking how I prefer to do syntax highlighting. It's funny that this question cropped up now, just as I changed the way that it's handled on this blog. The way that I used to do it was what I posted to djangosnippets almost a year ago: use a regular expression to parse out <code></code> blocks, highlight the stuff in-between, and spit it back out.
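
For the curious, the regular-expression approach looks roughly like the sketch below. It is not the original snippet: the highlight function here is a placeholder (an assumption) that just escapes the code and wraps it in a pre tag, standing in for wherever a real highlighter such as Pygments would be called.

```python
import html
import re

# Non-greedy match so that each <code>...</code> pair is handled separately;
# DOTALL lets a block span multiple lines.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def highlight(source):
    """Placeholder highlighter (assumption): escape and wrap in a <pre>."""
    return '<pre class="highlight">%s</pre>' % html.escape(source)


def highlight_code_blocks(text):
    """Replace every <code>...</code> block with its highlighted form."""
    return CODE_BLOCK.sub(lambda match: highlight(match.group(1)), text)
```

Text outside the code tags passes through untouched, which is exactly why this breaks down once the post body is no longer raw HTML.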

The problem with that method is that it would require some more sophisticated logic now that I'm using RestructuredText to write all of my posts. Unwilling to think any harder than necessary, I did a quick Google search, and the second result was exactly what I was looking for: a RestructuredText directive, ready-made by the Pygments people.

The trick is to put this file somewhere on your Python path. Then, in the __init__.py of one of the Django apps that will use syntax highlighting, just import the file. It's that simple! (I love RestructuredText.) But it's not only RestructuredText that benefits from this style of plugin. Markdown, too, has a similar plugin--again provided by the Pygments people.

.. sourcecode:: python

    print "This is an example of how to use RestructuredText's new directive."

I hope that this answers some of the questions that people had. On a similar note, I'm extremely happy to see that people have been finding the Contact Me link on the right side of the page. Please continue to send me any questions and comments that you have for me!

Django Tip: A Denormalization Alternative

In creating any website with textual content, you have the choice of either writing plaintext or writing in a markup language of some kind. The immediately obvious choice for markup language is HTML (or XHTML), but HTML is not as human-readable as something like Textile, Markdown, or Restructured Text. The advantage of choosing one of those human-readable alternatives is that content encoded using one of them can be translated very easily into HTML.

When one of my friends started designing his blog using Django, it got me thinking about how best to deal with that translated HTML. It seems like a waste to keep re-translating it every time a visitor views the page, but it also seems like it's redundant to keep the translated HTML stored in the database.

Here's my solution to the problem: cache it. For a month. Here's an example, using Restructured Text:

from django.db import models
from django.contrib.markup.templatetags.markup import restructuredtext
from django.core.cache import cache
from django.utils.safestring import mark_safe

class MyContent(models.Model):
    content = models.TextField()

    def _get_content_html(self):
        # Render once, then serve the cached HTML for up to 30 days.
        key = 'mycontent_html_%s' % self.pk
        html = cache.get(key)
        if html is None:
            html = restructuredtext(self.content)
            cache.set(key, html, 60*60*24*30)
        return mark_safe(html)
    content_html = property(_get_content_html)

    def save(self, *args, **kwargs):
        # Drop the stale rendering so the next read re-renders the new content.
        if self.pk:
            cache.delete('mycontent_html_%s' % self.pk)
        super(MyContent, self).save(*args, **kwargs)

What I'm doing here is writing a method which either gets the translated HTML from the cache, or translates it and stores it in the cache for a month. Then, it returns it as safe HTML to display in a template. The last thing that we do is override the save method on the model, so that whenever the model is re-saved, the cached HTML for that object is deleted.

There we go! We now have the HTML-rendered data that we want, and no duplicated data in the database. Keep in mind that this way of doing things becomes more and more useful the more RAM that your webserver has.
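
To see the render-once, invalidate-on-save behavior without a Django project, here's a framework-free sketch of the same pattern. Everything in it is a stand-in: DictCache imitates a cache backend (and ignores the timeout), render is a hypothetical markup-to-HTML step that records each call, and Content is a plain class rather than a model.

```python
class DictCache:
    """Stand-in for a cache backend (assumption): get/set/delete, timeout ignored."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, timeout=None):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


cache = DictCache()
render_calls = []


def render(text):
    """Hypothetical expensive markup-to-HTML step; records each call."""
    render_calls.append(text)
    return "<p>%s</p>" % text


class Content:
    def __init__(self, pk, content):
        self.pk = pk
        self.content = content

    @property
    def content_html(self):
        key = "mycontent_html_%s" % self.pk
        html = cache.get(key)
        if html is None:  # cache miss: render and remember for a month
            html = render(self.content)
            cache.set(key, html, 60 * 60 * 24 * 30)
        return html

    def save(self):
        # Invalidate so the next read reflects the new content.
        cache.delete("mycontent_html_%s" % self.pk)
```

Reading content_html twice calls render only once; changing the content and calling save() forces one more render on the next read.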

On Context Processors in Django

This started out as a response in the comments to James Bennett's latest post, but I think that there's enough here to warrant its own post. If you haven't yet read it, then I suggest you do--it's a well-put argument for Django's application-level modularity and pluggability.

But I do disagree with him on one point. One of the things that he highlights is "how easy it is for one Django application to expose functionality to others through things like context processors". I don't find this to be true. Currently there are only two ways of adding processors to the list of context_processors for a particular view:

  1. Adding them as an argument to the RequestContext (per-view).
  2. Adding them to the global context processors list in settings.py (global).

What these methods lack is a middle ground: per-app specification of context processors. This is what James Bennett seemingly alludes to, and it simply doesn't exist. What if I'd like all of the views in my blog app, and all views in flatpages, to get a certain context processor list? Currently in Django that is not possible. I do think that there is demand for this, and it's something that probably wouldn't be too hard to add to trunk.

But really, if I can think of this particular use case of context processor loading, I'm sure there are other people who could think of others. For example, what about a different set of processors based on URL, or based on IP address, or something even more strange? What Django really needs is a pluggable context processor loader similar to how it loads session backends, authentication backends, database backends, urls, etc. That way, people could provide their own loaders to do any kind of context processing differentiation that they want.
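
To make the idea concrete, here's a rough sketch of what a pluggable loader could look like. None of this is Django API: the request is a plain dict with a made-up "app" key, and the processor names, APP_PROCESSORS, per_app_loader, and build_context are all hypothetical.

```python
def site_name(request):
    """A 'global' processor: every view gets this."""
    return {"site_name": "Example"}


def blog_sidebar(request):
    """A processor only some apps care about."""
    return {"recent_posts": ["first", "second"]}


GLOBAL_PROCESSORS = [site_name]
APP_PROCESSORS = {
    "blog": [blog_sidebar],
    "flatpages": [blog_sidebar],
}


def per_app_loader(request):
    """Return the processors for this request: global ones plus the app's own."""
    return GLOBAL_PROCESSORS + APP_PROCESSORS.get(request.get("app"), [])


def build_context(request, loader=per_app_loader):
    """Run whatever processors the loader picks and merge their results."""
    context = {}
    for processor in loader(request):
        context.update(processor(request))
    return context
```

Because the loader is just a callable, swapping in one that decides by URL or IP address would only mean passing a different function to build_context.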

The only thing that this could do is make Django applications more pluggable--and that's always a good thing! The good news is that PyCon is coming up, and I can try to tackle this during the sprinting days.

UPDATE: Malcolm Tredinnick has posted an excellent followup to this post that suggests a simple solution for those who want to do something similar to application-level context processor loading right now.

OOP and Django

Being a senior in college means many things. It means job interviews and upper-level classes, emotional instability and independent living. It also means countless hours of sitting in uninteresting classes whose sole purpose is to fulfill some graduation requirement. For me, that means lots of daydreaming--about anything other than that class. Recently however, during one daydream, I had a brain wave worth typing up: What's the deal with Object-Oriented Programming and Django?

The Convention

Browsing through the views.py file in just about any publicly-available Django-based application will almost certainly reveal nothing more than a bunch of functions. These functions are undeniably specialized: they take in an HttpRequest object (plus possibly some more information), and they return an HttpResponse object. Although these functions may be specialized, nevertheless they are still just functions.

This should come as no surprise to anyone who has used the framework--in fact, it's encouraged by common convention! Not only does the tutorial use plain functions for views, but also the Django Book, and just about every other application out there. The question now becomes "why"? Why, in a language that seems to be "objects all the way down", does a paradigm emerge for this domain (Django views) wherein functions are used almost exclusively in lieu of objects?

That's not entirely true, sir...

Any time a broad statement like "just about any" is used, the exceptions are what become interesting. The admin application (both newforms-admin and old) is probably the most notable and interesting exception to my earlier broad statement. It's interesting because it's Django's shining star! Other applications that use object orientation in their views include databrowse and formtools, both of which are great Django apps.

Looking at those apps which use OOP and those which don't reveals an interesting idea: those apps which strive to go above-and-beyond in terms of modularity tend to be those that end up using classes and their methods for views. The same functionality could have been accomplished with plain functions, but their authors chose classes and methods instead.

Please keep in mind that what I'm not trying to do is make a value judgement on Object-Oriented programming vs. functional programming vs. any other programming paradigm. Instead, I'm providing an observation about the emergence of a common practice, and trying to analyze its implications.

But wait!

What really is the difference between writing a plain function as a view and Object-Oriented programming? It's completely reasonable to argue that writing a plain function for a view is, in fact, Object-Oriented programming. All instance methods take in self as their first positional argument, and all views take in request as their first positional argument. Taking in this argument allows access to state which would otherwise be difficult to access. Changing the order of urlpatterns is equivalent to changing the polymorphic properties of a class and dynamic method lookup.

In essence, one could argue that using a plain function as a view is strictly equivalent to writing a method on the HttpRequest object. Thinking about it in this way, writing a Django application is really nothing more than building up a monolithic HttpRequest object which the user can call different methods on using its API: the URL. To me, this is a really interesting idea!
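
Here's a toy illustration of that argument in plain Python. The Request class is a stand-in for HttpRequest, and URLPATTERNS and dispatch are made-up names; the point is only that a function taking request as its first argument behaves identically when attached to the class as a method.

```python
class Request:
    """Stand-in for HttpRequest (assumption): just a path and a user name."""

    def __init__(self, path, user):
        self.path = path
        self.user = user


def greet(request):
    """A plain-function view: state arrives through the first argument."""
    return "Hello, %s (%s)" % (request.user, request.path)


# Attaching the function to the class turns it into a method;
# `request` now plays the role of `self`.
Request.greet = greet

# The URL acts as the name of the method to call.
URLPATTERNS = {"/greet/": "greet"}


def dispatch(request):
    """Look up the 'method' by URL, much as a URLconf picks a view."""
    return getattr(request, URLPATTERNS[request.path])()
```

Calling greet(request) and request.greet() gives the same result, and dispatch is the "API" that maps a URL to one of those methods.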

Off My Rocker

This is the result of extreme classroom boredom--so maybe posts here will continue down this slightly-more-esoteric road for a while. But honestly this was an interesting thought-experiment, and I'd like to get some feedback on what people think as well. Am I totally off base with this analysis? Moreover, do you use true Python "classes" as your views? If so, what benefits does it bring to the table?
