Automating Mozilla to create Thumbzilla

I love my job. One of my ideas got the green light a month or so back and I’ve been building it (while juggling other projects) ever since.

It wasn’t a unique idea, but it was one that would complement our other image-related ad services: Web Thumbnails. Essentially, create a small thumbnail of a rendered website and use that as a visual representation of a link before someone clicks on it.

Something like this:

Ars Technica Ars Technica: The PC enthusiast’s resource
Ars Technica. Power users and the tools they love, without computing religion.
Oh yeah Ars Technica review: The Sims 2. Posted September

arstechnica.com/ – 63k – Cached – Similar pages

(Web result borrowed from Google)

Creating the actual thumbnail is not that impressive a feat; there are dozens of ways to do it. The real trick, in my book, is creating an automated method of doing so. No technology is worthwhile, in my opinion, if it doesn’t make some task easier.

With the help of Mozilla and Gtk2::MozEmbed, I built Thumbzilla (okay, the name is cliché and taken, but for an internal project, it works). I almost immediately ran into a pickle of a problem: Javascript. Some of the sites I was attempting to thumbnail did not want to play very well with an automated process. When I logged in to check Thumbzilla’s progress I was greeted with a slew of popups, alerts, and confirmation dialogs. Nasty stuff.

Obviously, anything requiring manual intervention throws a wrench in the cogworks of my project, so I did a little digging. I ultimately found this:

Suppose you’re annoyed by pop-up advertisements and want to prevent all
web pages from opening new browser windows. You can do this by adding the
following line to your Mozilla user preferences file (user.js):

user_pref("capability.policy.default.Window.open", "noAccess");

Throw in a little:

user_pref("capability.policy.default.Window.prompt", "noAccess");
user_pref("capability.policy.default.Window.alert", "noAccess");
user_pref("capability.policy.default.Window.confirm", "noAccess");

and my system is now chugging along quite nicely, with some of the annoyances of Javascript disabled. It will be several weeks minimum before Thumbzilla is publicly unveiled, during which time it will get tested, poked and prodded by various people. Still, I have to say I’m quite proud of how it turned out. I think this kind of visualization could be a useful part of searching the web. Let’s just hope I’m not the only one who thinks so.
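For reference, all four overrides can be dropped into the profile in one go by appending them to its user.js. The profile directory below is an assumption for illustration; point it at whatever profile your embedding app actually loads:

```shell
#!/bin/sh
# Append the capability-policy overrides to a Mozilla profile's user.js.
# PROFILE is a stand-in path -- substitute your automation profile's directory.
PROFILE="${PROFILE:-$HOME/.mozilla/thumbzilla}"
mkdir -p "$PROFILE"
cat >> "$PROFILE/user.js" <<'EOF'
user_pref("capability.policy.default.Window.open", "noAccess");
user_pref("capability.policy.default.Window.prompt", "noAccess");
user_pref("capability.policy.default.Window.alert", "noAccess");
user_pref("capability.policy.default.Window.confirm", "noAccess");
EOF
```

Mozilla reads user.js on startup, so the automated renderer has to be restarted before the policy takes effect.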

Riding the D-BUS

D-BUS is a message bus system, a simple way for applications to talk to one another.

I was looking for a better way of doing inter-process communication (IPC) and I stopped to take a look at D-BUS. After playing with it for a few days and seeing it in action via beagle, I decided it was worth using.

I’ve started work on xchat-dbus, an X-Chat plugin to expose plugin functionality via D-BUS. The advantage to this is that other applications could interface with X-Chat without having to write a plugin of their own and then find a way to communicate with it.

This started out as an idea to create a simple applet that would provide alerts via the notification area. I found one such applet, but it was overly complicated and didn’t work well. I started to write an X-Chat plugin to do it, but then I realized how much easier (and ultimately more useful) a D-BUS plugin would be. Plus, this will let me dig deeper into D-BUS and hopefully I’ll get a better grasp on its internals.

Mono, HttpWebRequest and https/ssl

I just spent the better part of a week tracking down bugs in Mono and a custom application of mine (which I’m porting to Mono). I was having a rough time getting some code to work against a secure server. After reading this, this, and this, it finally works.

Bottom line: If you’re working with HttpWebRequest and trying to connect via SSL, you need to make sure that you tell System.Net to trust all certificates, especially if you’re developing a non-interactive daemon. There’s nothing more frustrating than spending hour upon hour tracking down why your code won’t work like it did under .NET, only to find out that a few lines of code fix it straightaway.

libipod, or what motivates me

I started playing with ipod support in Rhythmbox a while back. I mocked up a little bit of the GUI, but then I made the mistake of looking at the libraries that interfaced with the ipod. There have been some efforts at getting ipod support under Linux, most notably gtkpod, along with a handful of other GUIs for interfacing with the ipod, but I found very little effort going into making a stand-alone library to access it.

Following the Open Source tradition, I decided to write my own library to interface with the ipod. I did this with sound reasoning. First, the existing projects I did find all seemed to be doing their own thing (nothing wrong with that), each implementing its own version of the same task. Still, it feels more “right” to me to create a generic, independent shared library to interface with the ipod (and specifically, iTunesDB, which is the heart and soul of the ipod). With a shared library, I can integrate ipod support into any number of applications (xmms, Rhythmbox, gnome-vfs, etc.) without having to worry about rewriting the same code over and over again, which gtkpod and others seem to be doing.

I talked to a few developers involved in other ipod-related efforts and met some resistance to my idea. Again, the efforts I witnessed seemed to focus on building application-specific support. In that vein, I took a look at my own motivations compared with those of other developers that I know and came across some interesting bits of information.

Reading through a presentation [3.4M] given by Nat Friedman at GUADEC recently, I found and have shamelessly ripped out the pyramid of usability.

pyramid of usability

I place myself somewhere between core hacker and geek. I’m more than happy to hack away at code but I don’t find as much satisfaction in something that only I can use. I guess I prefer to look at things from an end-user perspective. They don’t care how complicated it is to write code or what effort went into it. They just want it to work. At least I’m not alone in this.

I know where I stand, but what about the community at large? Jorge pointed me to this study on the motivation and effort in Open Source software projects, conducted with a pool of 684 developers across 287 sourceforge.net-hosted projects. They concluded with some interesting results.

44.9% write code because it’s intellectually stimulating
41.8% write code to improve their skills
33.8% write code because it’s needed for work-related projects
29.7% write code for non-work-related projects
11.3% write code to spite closed-source projects
11.0% write code to enhance their reputation in the Open Source community

I can relate to these motivations but I can’t help feeling that something is missing. I don’t expect the code I write to save the world but I would like it to contribute to making Linux more usable to the end user. If I sync my ipod with Linux, it should work the same no matter what frontend I use. I should even be able to sync between whatever application I use in Linux and iTunes on Windows or OSX. It shouldn’t matter what application I use on which platform. It should just work.

So that’s my goal with libipod in a nutshell, as lofty as it may be. It should provide a seamless interface between any application using it and iTunesDB and allow the ipod to work with iTunes or any other compatible software.

apt-checkpoint 0.1 released

I’m happy to announce that apt-checkpoint 0.1 has been released.

I’m excited about this, partially because this is the first “open source” project that I’ve gotten to a releasable state, but also because it’s just a damn handy utility. I pimped the project in #debian-devel (on irc.freenode.net) today. A few people said things like “that’d be useful for me” and “this is a good idea, i’ll try it out in a chroot later. actually it will be really useful for chroots when you want to revert changes.”, while another sputtered “packages are not required to support downgrading at all” and “while [ 1 ]; do dpkg --get-selections | awk '{print $1}' | grep -v apt | xargs dpkg --purge; done” and “Use a file system with snapshots.”

Of course, there are various ways of manually downgrading packages. The problem with that suggestion is that, while technically feasible, it is not a clean solution. I shouldn’t have to pipe commands back and forth to accomplish what should be a seamless, automated task. Linux isn’t just the playground of the technologically elite. Linux needs to be user-friendly, while still allowing us technophiles to get our hands dirty. apt-checkpoint, while rough and kludgy, is just an example of the kind of usability people need. They should be able to upgrade and, if something breaks, seamlessly roll back, akin to restore points in Windows (which exist for the exact same reason).

A few people have already started to play with apt-checkpoint. Hopefully I’ll get some good feedback from them, and a stable 0.2 release will be ready in the near future.

apt-checkpoint (The anti-whiprush)

I finally got tired of OSX so I installed Debian on my iBook. After getting it up and running, I decided to try getting GNOME 2.6 installed from the experimental repository. Boy, was that a big mistake. One of the reasons it is still in experimental is that it’s not being built for all platforms yet. Unfortunately, there seem to be some critical packages still missing for the PowerPC (ppc). After an unfortunate dist-upgrade, I was left with a horribly broken GNOME install and no easy way of getting back. I whiprushed myself.

<StoneTable> `whiprushed
<rewt> To be whiprushed is to bring hell upon yourself by apt-get dist-upgrading without knowing wth you’re doing, and then being pulled over with expired tags and <cut to jail scene> ending up in a jail cell for the night with a heroin addict for a roommate. All because you messed up that Debian system.

The problem:
No way to revert the system back to a point in time when it was working (à la restore points in Windows).

I started thinking about the process I went through to restore my system to the point it had been before I destroyed GNOME. Removing all of the GNOME 2.6 packages I had installed, including their dependencies from experimental. Being left with a broken apt-get and having to remove it, download the .deb from unstable, and manually install it. Becoming intimately familiar with querying package data with dpkg. And finally, reinstalling all of GNOME 2.4 and all of the subsequent packages that got removed in the process.

What a pain in the ass.

The solution:
It hit me as I was driving to work: apt-checkpoint. What we need is a method to create a ‘checkpoint’ that says “this is a known good working system at this point in time” and records the installed packages, versions, and configurations, as well as (optionally, if available) the original packages. Then an apt-diff tool could be used to compare the current system with this checkpoint of the working system to pinpoint differences, and an apt-rollback tool could actually restore the known good system configuration to an otherwise whiprushed system.
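The checkpoint and diff halves of that idea can be sketched with nothing but dpkg’s selection lists. This is a minimal sketch, not the actual apt-checkpoint implementation: the file paths and package lists below are invented sample data, where on a real system each list would come from `dpkg --get-selections`.

```shell
#!/bin/sh
# Sketch of the checkpoint/diff idea. On a real system the two lists
# would each come from `dpkg --get-selections`; here they are sample
# data so the comparison itself is visible.
mkdir -p /tmp/aptck
cat > /tmp/aptck/checkpoint <<'EOF'
gnome-panel	install
gnome-session	install
metacity	install
EOF
cat > /tmp/aptck/current <<'EOF'
gnome-panel	install
gnome-session	deinstall
metacity	install
EOF
# apt-diff, in miniature: show only the packages whose state changed
diff /tmp/aptck/checkpoint /tmp/aptck/current | grep '^[<>]'
# apt-rollback would feed the checkpoint back, roughly:
#   dpkg --set-selections < /tmp/aptck/checkpoint
#   apt-get dselect-upgrade
```

The real tools would also need to record package versions and stash the original .debs, since --set-selections alone can’t downgrade a package.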

Maybe it’s just me, but I like to live on the bleeding edge. I run Debian unstable on all of my desktops and occasionally something gets screwed up, either by my own ineptitude or that of a package maintainer. Shit happens. If I had these tools, though, reversing the change that broke my system would be a relatively simple task (or at least identifying the broken package/dependency would be easier).

With the solution in hand, I’ve started work on apt-checkpoint, apt-diff, and apt-rollback, an anti-whiprush toolkit to save me (and you) from destroying a system with an ill-advised dist-upgrade. Hopefully I can get a simple, working solution in relatively short order. After all, simple solutions are often the best kind.

Converting OSX Icons to png

I found some Matrix icons that I wanted to use on my Linux desktop. Unfortunately, the only available formats are for OSX or Windows (.ico). I prefer to use png, so I set out to convert the OSX-flavored icons to png.

You will need to download and install a few programs in order for this to work.

First, you need to get a copy of StuffIt for Linux. For those of you morally opposed to using closed-source software, shame on you for even considering using Macintosh icons on your Linux desktop. Begone, you.

Next, download yourself a copy of icns2png [Local mirror].

Finally, grab these two scripts: clean and convert. Don’t forget to mark them executable.

Extracting the .bin/.hqx
I downloaded mtrx_icn.bin and saved it to ~/icons. Unstuff this .bin file to separate the data into two files.

stone@durin:~/icons$ unstuff mtrx_icn.bin
The Matrix Rebooted Icons.sit.info
The Matrix Rebooted Icons.sit.data
/home/stone/icons/The Matrix Rebooted Icons.sit.info ..
/home/stone/icons/The Matrix Rebooted Icons.sit.data ...........................................................................................................
stone@durin:~/icons$

Next, we need to extract the actual icons themselves. Without setting the parameters to tell unstuff how to treat the file, it will extract 0-byte Icons.

stone@durin:~/icons$ unstuff -e=unix -m=auto -t=on The\ Matrix\ Rebooted\ Icons.sit.data
...
(lots of files being extracted)
...
stone@durin:~/icons$

Fixing the filename
When unstuff extracts the Icon files, it leaves behind a carriage return (\r) embedded within the filename. Whoops.


stone@durin:~/icons$ ls -b The\ Matrix\ Rebooted\ Icons
Icon\r Read\ Me\ Please The\ Icons

I tried various ways of using find, xargs, and perl to do this but failed, so I wrote clean. Simply put, it removes any control codes, including carriage returns and line feeds, from a filename. I spent way too much time trying to find a solution to this, so I hope it comes in handy for someone else.


#!/bin/sh

if [ ! -n "$1" ]; then
echo "Usage: clean <directory>"
exit 1
fi

set -o noglob
find "$1" -name 'Icon*' -print | while read name ; do
newname="$(printf '%s' "$name" | tr -d '[:cntrl:]')" # strip control codes
mv "$name" "$newname" # do the move
done

stone@durin:~/icons$ ./clean The\ Matrix\ Rebooted\ Icons
stone@durin:~/icons$

Converting to png

Now that the filenames are fixed, we can get to our ultimate goal: converting the icons to png. For this, I hacked up another script, similar to clean, and called it convert.


#!/bin/sh

if [ ! -n "$1" ]; then
echo "Usage: convert <directory>"
exit 1
fi

set -o noglob
find "$1" -type f -name 'Icon' -print | while read name ; do
icns2png "$name" # Convert to png
rm "$name" # remove old Icon
done

This one is simple enough that you can probably accomplish the same thing with find, -exec and xargs.
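For instance, here is the same traversal using find’s -exec, with echo as a stand-in command and a throwaway directory so the pattern itself is visible; swap in icns2png (and a second -exec rm) for the real conversion:

```shell
#!/bin/sh
# Demonstrate the find -exec pattern on a made-up directory tree.
# `echo` stands in for icns2png here.
mkdir -p /tmp/icondemo/sub
touch /tmp/icondemo/Icon /tmp/icondemo/sub/Icon /tmp/icondemo/ReadMe
# runs the command once per matching file; {} is the filename
find /tmp/icondemo -type f -name 'Icon' -exec echo converting {} \;
```

The -exec form handles spaces in filenames without the `while read` loop, which is exactly the headache these icon folders cause.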

stone@durin:~/icons$ ./convert The\ Matrix\ Rebooted\ Icons
Icon2PNG Linux Edition - (C) 2002 Mathew Eis
Converting The Matrix Rebooted Icons/Icon to The Matrix Rebooted Icons/Icon.png...
(repeat the above two lines for each Icon)
stone@durin:~/icons$

And you’re done. All of the OSX Icons have now been converted to png. Happy theming!

Nifty CSS trick

I was doing some design work on a new business web site and fighting some alignment issues with Mozilla and Internet Explorer. After a bit of digging I discovered this conditional comment feature of IE 5 and 6.


<!--[if IE 5]>
<link rel="stylesheet" type="text/css" href="/css/ie5-fejl.css" />
<![endif]-->

<!--[if IE 6]>
<link rel="stylesheet" type="text/css" href="/css/ie6-fejl.css" />
<![endif]-->

This doesn’t solve the problem I’m working on, but I thought this was a neat enough trick. I’m glad I don’t have to deal with design issues like this on a daily basis. It’s frustrating enough to debug simple alignment issues.