01.14
thummer is a website-snapshot and thumbnailing utility I am working on, built using Django. Read my previous thummer blog post to find out more about the project.
Assumptions
In this article, I am assuming that we are installing thummer on an existing Ubuntu server (8.04 or later) with Apache already installed, and that you are comfortable with editing apt’s sources.list file.
Step 1: Install A “Fake” X-Server
If you do not have a desktop environment installed on your server (probably a good thing!), then you will need to install xvfb, which provides a virtual screen so that we can capture the rendered output of the website:
Xvfb provides an X server that can run on machines with no display hardware and no physical input devices. It emulates a dumb framebuffer using virtual memory.
sudo apt-get install xvfb
Step 2: Install CutyCapt
We now need to install CutyCapt, which renders web pages using WebKit.
First, we need to install the cutycapt dependencies and build tools:
sudo apt-get install build-essential
For hardy only: We also need some packages from hardy-backports – add the following lines to /etc/apt/sources.list.
deb http://archive.ubuntu.com/ubuntu hardy-backports main restricted universe multiverse
deb-src http://archive.ubuntu.com/ubuntu hardy-backports main restricted universe multiverse
Now install the required qt4 packages:
sudo apt-get update
sudo apt-get install libqt4-dev libqt4-webkit libqt4-svg
It’s now probably a good idea to remove (or comment out) the hardy-backports lines from your sources.list file.
Download cutycapt source, and expand the tarball:
cd ~/
wget http://cutycapt.svn.sourceforge.net/viewvc/cutycapt/CutyCapt.tar.gz\?view\=tar
tar -xvvzf CutyCapt.tar.gz\?view\=tar
Compile cutycapt:
cd CutyCapt
qmake
make
Step 3: Install Django
First install some packages we’ll need later:
sudo apt-get install python-django python-imaging python-pysqlite2 libapache2-mod-python
For hardy only: The version of Django that hardy ships with is too old for thummer. Install Jaunty’s version of python-django from the updates repository by downloading the .deb file from Launchpad (remember to keep an eye out for future security updates):
cd ~/
wget http://launchpadlibrarian.net/17665378/python-django_1.0-1ubuntu1_all.deb
sudo dpkg -i python-django_1.0-1ubuntu1_all.deb
Step 4: Install & Configure thummer
Download thummer, expand the tarball, and move the extracted files to /var/www:
cd ~/
wget http://launchpad.net/thummer/trunk/0.01/+download/thummer_0.01.tar.gz
tar -xvvzf thummer_0.01.tar.gz
sudo mv thummer_0.01 /var/www/thummer
Make sure subversion is installed:
sudo apt-get install subversion
Download the required site-package, sorl-thumbnails:
sudo svn checkout http://sorl-thumbnail.googlecode.com/svn/trunk/sorl /usr/local/lib/python2.5/site-packages/sorl
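The checkout target above assumes Python 2.5. On Ubuntu releases that default to a newer Python, third-party packages typically live in a dist-packages directory instead of site-packages. As a rough sketch (the helper name package_dirs is made up for illustration), you can ask the interpreter which package directories it actually searches, and put sorl in one of those:

```python
import sys


def package_dirs(paths=None):
    """Return the search-path entries that are site-packages or dist-packages."""
    if paths is None:
        paths = sys.path
    suffixes = ("site-packages", "dist-packages")
    return [p for p in paths if p.rstrip("/").endswith(suffixes)]


# List the package directories for the interpreter running this script
for directory in package_dirs():
    print(directory)
```

Run it with the same Python version that Apache will use, since each version has its own search path.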
Create the database, and initial admin user:
cd /var/www/thummer/thummer
./manage.py syncdb
Set owner so that the apache process can write to the database:
sudo chown -R www-data /var/www/thummer/database
Set owner so that the apache process can write to the media directory:
sudo chown -R www-data /var/www/thummer/media
Update the Django website settings:
sudo nano ./settings.py
Make sure that XVFB = True, and specify the full path to the database for DATABASE_NAME.
Ensure the full path to the CutyCapt binary is correct.
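If you want to double-check those two paths before configuring Apache, something like the following may help. This is only a sketch and not part of thummer: the function name check_thummer_paths and the example paths at the bottom are assumptions, so substitute the values you actually entered in settings.py.

```python
import os


def check_thummer_paths(database_path, cutycapt_path):
    """Return a list of problems found (an empty list means none)."""
    problems = []

    # The directory holding the database has to exist and be writable
    db_dir = os.path.dirname(os.path.abspath(database_path))
    if not os.path.isdir(db_dir):
        problems.append("database directory missing: %s" % db_dir)
    elif not os.access(db_dir, os.W_OK):
        problems.append("database directory not writable: %s" % db_dir)

    # The CutyCapt binary has to exist and be executable
    if not os.path.isfile(cutycapt_path):
        problems.append("CutyCapt binary not found: %s" % cutycapt_path)
    elif not os.access(cutycapt_path, os.X_OK):
        problems.append("CutyCapt binary not executable: %s" % cutycapt_path)

    return problems


if __name__ == "__main__":
    # Hypothetical example paths; replace with your own settings.py values
    for problem in check_thummer_paths(
        "/var/www/thummer/database/thummer.db",
        "/home/user/CutyCapt/CutyCapt",
    ):
        print(problem)
```

The permission checks apply to whichever user runs the script, so for a meaningful result run it as the Apache user (www-data on Ubuntu).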
Step 5: Configure Apache
Ensure that the python module for Apache is enabled:
sudo a2enmod mod_python
Create a virtual-host site configuration file:
sudo nano /etc/apache2/sites-available/thummer
Paste in the following configuration – replace “thummer.domainname.com” with your desired domain name:
<VirtualHost *>
    ServerName thummer.domainname.com
    DocumentRoot /var/www/thummer/thummer
    <Directory "/var/www/thummer/thummer">
        AllowOverride All
        Order Allow,Deny
        Allow from All
        SetHandler python-program
        PythonHandler django.core.handlers.modpython
        SetEnv DJANGO_SETTINGS_MODULE thummer.settings
        PythonPath "['/var/www/thummer'] + sys.path"
    </Directory>
    # Static Media Content
    Alias /media /var/www/thummer/media
    <Location "/media">
        SetHandler None
        Order Allow,Deny
        Allow from All
    </Location>
    Alias /admin-media /usr/share/python-support/python-django/django/contrib/admin/media
    <Location "/admin-media">
        SetHandler None
        Order Allow,Deny
        Allow from All
    </Location>
</VirtualHost>
Enable the site, and restart apache:
sudo a2ensite thummer
sudo /etc/init.d/apache2 restart
Bam! That’s it! Enjoy, and let me know how it goes!
Usage
Just use the following URL syntax to reference the image (e.g. in img elements):
http://thummer.domainname.com/[width]/[height]/[crop]/http://url-to-capture/
The value of [crop] can be either 0 or 1, where 0 = scale & fit, and 1 = scale & crop.
For example, to generate a 300×300 pixel cropped thumbnail of the BBC News website:
http://thummer.domainname.com/300/300/1/http://news.bbc.co.uk/
<img src="http://thummer.domainname.com/300/300/1/http://news.bbc.co.uk/" alt="BBC News website thumbnail" />
And here is the result:
Remember you can access the admin interface by going to http://thummer.domainname.com/admin where you can delete snapshots – so that they are regenerated next time they are requested.
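If you generate these image URLs from code, the syntax above is easy to wrap in a small helper. This is just an illustrative sketch: the function name thummer_url is my own invention and not something thummer provides.

```python
def thummer_url(base, width, height, crop, target_url):
    """Build an image URL of the form base/[width]/[height]/[crop]/target."""
    if crop not in (0, 1):
        raise ValueError("crop must be 0 (scale & fit) or 1 (scale & crop)")
    return "%s/%d/%d/%d/%s" % (base.rstrip("/"), width, height, crop, target_url)


# Prints http://thummer.domainname.com/300/300/1/http://news.bbc.co.uk/
print(thummer_url("http://thummer.domainname.com", 300, 300, 1,
                  "http://news.bbc.co.uk/"))
```

The target URL is appended as-is, matching the examples above.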
Hi Matt,
thanks for the great article.
I’m new to using Python and followed the steps you described above. I think the problem is that Ubuntu 9.04 uses Python 2.6 by default. If I try “./manage.py syncdb” I get the error message “Error: No module named sorl”. If I change the default version on my system to 2.5 it works, but the webserver part still doesn’t work because Apache insists on using version 2.6. Do you know how I can use your program on Ubuntu 9.04?
Thanks,
Peter
Hi Matt,
I managed to get it to run. I think in python 2.6 the packages have to reside in /usr/local/lib/python2.6/dist-packages/.
So it seems to work fine, but instead of displaying a screenshot I only see a light grey box (perhaps a placeholder). Security settings should be OK. I turned debug on in settings.py, but I’m not sure where to find the debug output. In the Apache error.log I just see this warning:
/usr/lib/python2.6/dist-packages/mod_python/importer.py:32: DeprecationWarning: the md5 module is deprecated; use hashlib instead
import md5
I’m not sure if this is the problem?!
Thanks,
Peter
Hi Peter,
The grey box is indeed a placeholder for when the capturing process fails. This could be because of an issue with Xvfb or CutyCapt.
Maybe try running these apps directly from the command line to see if they are working?
xvfb-run --auto-servernum --server-args='-screen 0, 1024x768x24' /path/to/CutyCapt --url=http://website.com/ --out=$HOME/snapshot.png
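The same check can be run from Python, which makes it easier to see the exit code and any error output. This is a minimal sketch rather than thummer's own code: the function names build_capture_command and capture are made up here, and the arguments simply mirror the command line above.

```python
import subprocess


def build_capture_command(cutycapt_path, url, out_path):
    """Return the xvfb-run argument list for one CutyCapt snapshot."""
    return [
        "xvfb-run",
        "--auto-servernum",
        "--server-args=-screen 0, 1024x768x24",
        cutycapt_path,
        "--url=" + url,
        "--out=" + out_path,
    ]


def capture(cutycapt_path, url, out_path):
    """Run the capture and return (exit code, combined stdout and stderr)."""
    process = subprocess.Popen(
        build_capture_command(cutycapt_path, url, out_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    output, _ = process.communicate()
    return process.returncode, output.decode("utf-8", "replace")
```

Calling capture("/path/to/CutyCapt", "http://website.com/", "/tmp/snapshot.png") returns the exit code and whatever the tools printed. No shell is involved, so pass an absolute output path instead of one starting with ~.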