2009
01.14

thummer is a website-snapshot and thumbnailing utility I am working on, built using django. Read my previous thummer blog post to find out more about the project.

Assumptions

In this article, I am assuming that we are installing thummer on an existing Ubuntu server (8.04 or later) with Apache already installed, and that you are comfortable with editing apt’s sources.list file.

Step 1: Install A “Fake” X-Server

If you do not have a desktop environment installed on your server (probably a good thing!), then you will need to install xvfb, which provides a virtual screen so that we can capture the rendered output of the website:

Xvfb provides an X server that can run on machines with no display hardware and no physical input devices. It emulates a dumb framebuffer using virtual memory.

  • sudo apt-get install xvfb

Step 2: Install CutyCapt

We now need to install CutyCapt, which renders web pages using WebKit.

First, we need to install the cutycapt dependencies and build tools:

For hardy only: We also need some packages from hardy-backports – add the following lines to /etc/apt/sources.list.

deb http://archive.ubuntu.com/ubuntu hardy-backports main restricted universe multiverse
deb-src http://archive.ubuntu.com/ubuntu hardy-backports main restricted universe multiverse

Now install the required qt4 packages:

It’s now probably a good idea to remove (or comment out) the hardy-backports lines from your sources.list file.

Download cutycapt source, and expand the tarball:

  • cd ~/
  • wget http://cutycapt.svn.sourceforge.net/viewvc/cutycapt/CutyCapt.tar.gz\?view\=tar
  • tar -xvvzf CutyCapt.tar.gz\?view\=tar

Compile cutycapt:

  • cd CutyCapt
  • qmake
  • make
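Before moving on, it may be worth checking that the freshly built binary can actually render a page. The following is a minimal sketch, assuming Python 3 is available, that xvfb-run from Step 1 is on the PATH, and that the binary was built in ~/CutyCapt. The xvfb-run options and the --url/--out flags mirror the command quoted in the comments at the end of this post; the function names are made up for illustration.

```python
import os
import subprocess


def build_capture_command(cutycapt_path, url, out_path):
    """Return the argv list for one CutyCapt run inside a virtual X server."""
    return [
        "xvfb-run",
        "--auto-servernum",
        "--server-args=-screen 0, 1024x768x24",
        cutycapt_path,
        "--url=" + url,
        "--out=" + out_path,
    ]


def capture(cutycapt_path, url, out_path):
    """Run the capture and report whether a non-empty image file was written."""
    command = build_capture_command(cutycapt_path, url, out_path)
    exit_code = subprocess.call(command)
    return (
        exit_code == 0
        and os.path.isfile(out_path)
        and os.path.getsize(out_path) > 0
    )
```

Calling capture(os.path.expanduser("~/CutyCapt/CutyCapt"), "http://news.bbc.co.uk/", "/tmp/snapshot.png") should return True if both Xvfb and CutyCapt are working.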

Step 3: Install Django

First install some packages we’ll need later:

For hardy only: The version of django hardy ships with is too old for thummer. Install Jaunty’s version of python-django from the updates repository by downloading the .deb file from Launchpad (remember to keep an eye out for future security updates):

  • cd ~/
  • wget http://launchpadlibrarian.net/17665378/python-django_1.0-1ubuntu1_all.deb
  • sudo dpkg -i python-django_1.0-1ubuntu1_all.deb

Step 4: Install & Configure thummer

Download thummer, expand the tarball, and move the extracted files to /var/www:

  • cd ~/
  • wget http://launchpad.net/thummer/trunk/0.01/+download/thummer_0.01.tar.gz
  • tar -xvvzf thummer_0.01.tar.gz
  • sudo mv thummer_0.01 /var/www/thummer

Make sure subversion is installed:

  • sudo apt-get install subversion

Download the required site-package, sorl-thumbnails:

  • sudo svn checkout http://sorl-thumbnail.googlecode.com/svn/trunk/sorl /usr/local/lib/python2.5/site-packages/sorl
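The checkout path above is specific to python 2.5. On releases with a different default python the directory differs (a commenter below reports dist-packages under python 2.6), so it can help to ask the interpreter where it looks and whether sorl is importable. This is a rough sketch with made-up helper names; run it with the same interpreter that will run thummer.

```python
import importlib
import sys


def package_dirs():
    """List the site-packages / dist-packages entries on this interpreter's path."""
    return [p for p in sys.path if p.endswith(("site-packages", "dist-packages"))]


def can_import(module_name):
    """Return True if this interpreter can import the named module."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


print(package_dirs())
print(can_import("sorl"))
```

If the second line printed is False, the sorl directory is probably not in any of the directories listed on the first line.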

Create the database, and initial admin user:

  • cd /var/www/thummer/thummer
  • ./manage.py syncdb

Set owner so that the apache process can write to the database:

  • sudo chown -R www-data /var/www/thummer/database

Set owner so that the apache process can write to the media directory:

  • sudo chown -R www-data /var/www/thummer/media
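To confirm that the two chown commands took effect, the ownership can be checked from Python. This is only a sketch: the helper names are invented here, and it inspects just the top-level directory rather than walking the whole tree.

```python
import os
import pwd


def owner_uid(path):
    """Return the numeric user id that owns path."""
    return os.stat(path).st_uid


def owned_by(path, user="www-data"):
    """Return True if path is owned by the named user."""
    return owner_uid(path) == pwd.getpwnam(user).pw_uid
```

For example, owned_by("/var/www/thummer/database") and owned_by("/var/www/thummer/media") should both return True. pwd.getpwnam raises KeyError if the www-data user does not exist on the machine.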

Update the django website settings:

  • sudo nano ./settings.py

Make sure that “XVFB = True” is set, and specify the full path to the database for DATABASE_NAME.

Ensure the full path to the CutyCapt binary is correct.
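As a rough illustration, the relevant part of settings.py might end up looking like the excerpt below. XVFB and DATABASE_NAME are the names mentioned above; THUMMER_ROOT, the database file name, the CUTYCAPT_PATH setting name, and the home directory are placeholders assumed for this sketch, so keep whatever names the shipped settings.py already uses.

```python
# settings.py (excerpt): an illustrative sketch, not the file shipped with thummer.
import os

# Install location from Step 4 (hypothetical helper variable).
THUMMER_ROOT = "/var/www/thummer"

# Run CutyCapt inside the virtual X server installed in Step 1.
XVFB = True

# Full path to the database; the file name is an assumption.
DATABASE_NAME = os.path.join(THUMMER_ROOT, "database", "thummer.db")

# Hypothetical setting name: full path to the binary built in Step 2.
CUTYCAPT_PATH = "/home/youruser/CutyCapt/CutyCapt"
```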

Step 5: Configure Apache

Ensure that the python module for Apache is enabled:

  • sudo a2enmod python

Create a virtual-host site configuration file:

  • sudo nano /etc/apache2/sites-available/thummer

Paste in the following configuration – replace “thummer.domainname.com” with your desired domain name:


<VirtualHost *>

  ServerName thummer.domainname.com
  DocumentRoot /var/www/thummer/thummer

  <Directory "/var/www/thummer/thummer">
    AllowOverride All
    Order Allow,Deny
    Allow from All

    SetHandler python-program
    PythonHandler django.core.handlers.modpython
    SetEnv DJANGO_SETTINGS_MODULE thummer.settings
    PythonPath "['/var/www/thummer'] + sys.path"
  </Directory>

  # Static Media Content
  Alias /media /var/www/thummer/media
  <Location "/media">
    SetHandler None
    Order Allow,Deny
    Allow from All
  </Location>

  Alias /admin-media /usr/share/python-support/python-django/django/contrib/admin/media
  <Location "/admin-media">
    SetHandler None
    Order Allow,Deny
    Allow from All
  </Location>

</VirtualHost>

Enable the site, and restart apache:

  • sudo a2ensite thummer
  • sudo /etc/init.d/apache2 restart

Bam! That’s it! Enjoy, and let me know how it goes!

Usage

Just use the following URL syntax to reference the image (e.g. in img elements):

http://thummer.domainname.com/[width]/[height]/[crop]/http://url-to-capture/

The value of [crop] can be either 0 or 1, where 0 = scale & fit and 1 = scale & crop.
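To make the URL syntax concrete, here is one way such a path could be split into its parts. This is not thummer's actual URL handling, just a sketch of the pattern described above, with an invented function name.

```python
import re

# /[width]/[height]/[crop]/[url-to-capture]
THUMB_PATH = re.compile(
    r"^/(?P<width>\d+)/(?P<height>\d+)/(?P<crop>[01])/(?P<url>https?://.+)$"
)


def parse_thumbnail_path(path):
    """Split a thumbnail request path into its parts, or return None if it is malformed."""
    match = THUMB_PATH.match(path)
    if match is None:
        return None
    return {
        "width": int(match.group("width")),
        "height": int(match.group("height")),
        "crop": match.group("crop") == "1",  # True = scale & crop, False = scale & fit
        "url": match.group("url"),
    }
```

A crop value other than 0 or 1, or a missing width or height, makes the function return None.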

e.g. to generate a 300×300 pixel cropped thumbnail of the BBC News website:

http://thummer.domainname.com/300/300/1/http://news.bbc.co.uk/

<img src="http://thummer.domainname.com/300/300/1/http://news.bbc.co.uk/" alt="BBC News website thumbnail" />

And here is the result:

[Image: BBC News website thumbnail]
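If you generate pages from code, a small helper can assemble these URLs and img elements for you. The sketch below uses invented function names and embeds the target URL unescaped, exactly as in the example above; whether target URLs with query strings need extra encoding is not covered here.

```python
from html import escape


def thummer_url(base, width, height, crop, target_url):
    """Build a thumbnail URL of the form base/width/height/crop/target_url."""
    crop_flag = 1 if crop else 0
    return "%s/%d/%d/%d/%s" % (base.rstrip("/"), width, height, crop_flag, target_url)


def img_tag(src, alt):
    """Return an img element with HTML-escaped attribute values."""
    return '<img src="%s" alt="%s" />' % (escape(src, quote=True), escape(alt, quote=True))


url = thummer_url("http://thummer.domainname.com", 300, 300, True, "http://news.bbc.co.uk/")
print(img_tag(url, "BBC News website thumbnail"))
```

This prints the same img element as the hand-written example above.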

Remember that you can access the admin interface at http://thummer.domainname.com/admin, where you can delete snapshots so that they are regenerated the next time they are requested.


3 comments so far

  1. Hi Matt,

    thanks for the great article.
    I’m new to using python and followed the steps you described above. Now I think the problem is that Ubuntu 9.04 uses python 2.6 as default. If I try “./manage.py syncdb” I get the error message “Error: No module named sorl”. If I change the default version of my system to 2.5 it works, but the webserver part still doesn’t work because apache insists on using version 2.6. Do you know how I can use your program on Ubuntu 9.04?

    Thanks,

    Peter

  2. Hi Matt,

    I managed to get it to run. I think in python 2.6 the packages have to reside in /usr/local/lib/python2.6/dist-packages/.
    So it seems to work fine, but instead of displaying a screenshot I only see a light grey box (perhaps a placeholder). Security settings should be OK. I turned debug on in settings.py, but I’m not sure where to find the debug output. In the apache error.log I just see this warning:
    /usr/lib/python2.6/dist-packages/mod_python/importer.py:32: DeprecationWarning: the md5 module is deprecated; use hashlib instead
    import md5
    I’m not sure if this is the problem?!

    Thanks,

    Peter

  3. Hi Peter,

    The grey box is indeed a placeholder for when the capturing process fails. This could be because of an issue with Xvfb or CutyCapt.

    Maybe try running these apps directly from the command line to see if they are working?
    xvfb-run --auto-servernum --server-args='-screen 0, 1024x768x24' /path/to/CutyCapt --url=http://website.com/ --out="$HOME/snapshot.png"