farmdev

An In-Process, Headless Web Browser for Python?

Yesterday, Atul Varma announced he had resurrected python-spidermonkey, John J. Lee's project to run JavaScript in Python. Woo! SpiderMonkey is the JavaScript runtime that Mozilla (i.e. Firefox) uses internally. However, it's just an engine and doesn't deal with the DOM or anything else about the typical web browser environment that JavaScript files usually run in. Roughly, that environment defines the following JavaScript objects:

I've always been interested in python-spidermonkey for the prospect of fully testing JavaScript-dependent web applications in Python alone. This is no small feat and there are many caveats. The first caveat is that if the Application Under Test is running in something too far off from its native runtime, then the value of your tests rapidly decreases. In other words, the best way to test a JavaScript-heavy web app is to use Selenium or Windmill to automate and introspect the app in its native runtime: the web browser. I believe Java has Rhino, but since it's implemented in Java, my guess is that its behavior is pretty far off from that of Firefox. We all know and love the quirks of web browsers; this is what makes web development so fun :(

A lightweight version of Firefox (its JavaScript engine, its DOM, etc.) could possibly make web testing a lot more efficient. By lightweight I mean no GUI and no visual rendering (although removing this may greatly hinder JavaScript, not sure). Why? At the company I'm at we build and test web applications. Our largest suite of Selenium tests, which automate a real web browser, takes about 2 hours to run, and that's expensive. We are looking into a Selenium Grid solution, but if there's a way to simply cut out all the browser overhead that we don't need, that would be a huge score.

And, oh yeah, I haven't even talked about how tests for web apps need to run in Internet Explorer if they are to be worth anything. However, we make some apps for internal company use and thus can mandate the use of Firefox. To solve the cross-browser problem, one approach could be to implement all this logic as a driver that can be hooked into the Selenium interface we use, which is all in Python code thanks to Selenium Remote Control. This is the exact approach that WebDriver for Java takes: you can simply tell it to use a faster in-process web browser, and it will run JavaScript, but only via Rhino, which goes back to the non-native runtime problem. In fact, WebDriver will soon be an actual driver for Selenium.

Hmm ... thinking out loud ... maybe the solution is to just run XULRunner as a headless Firefox, the way the SMILE web scraper does. Maybe I'll try that.

Besides web testing, one cool thing about python-spidermonkey is that it may allow for unit testing of JavaScript libs in Python. This means you could create an import hook that loads *.js files into Python and then do all your testing in Python, reaping the benefits of tools like nose. Or, maybe more realistically, you could automate the testing of custom Firefox extensions implemented in JavaScript.
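
To make the import hook idea a bit more concrete, here is a rough sketch built on Python's standard importlib machinery. Treat it as an illustration only. The names JsFinder and JsLoader, the js_source and js_result attributes, and the evaluate callable are all made up for this example. In particular, evaluate stands in for whatever function hands a string of JavaScript to the engine; the sketch does not show python-spidermonkey's actual API.

```python
import importlib.abc
import importlib.util
import os
import sys


class JsLoader(importlib.abc.Loader):
    """Load a .js file as a Python module (hypothetical helper)."""

    def __init__(self, js_path, evaluate):
        self.js_path = js_path
        # Assumption: `evaluate` takes JavaScript source as a string and
        # returns whatever the JavaScript engine produces for it.
        self.evaluate = evaluate

    def create_module(self, spec):
        # Returning None asks the import system to create a normal module.
        return None

    def exec_module(self, module):
        with open(self.js_path, encoding="utf-8") as f:
            module.js_source = f.read()
        # Hand the source to the engine and keep the result on the module.
        module.js_result = self.evaluate(module.js_source)


class JsFinder(importlib.abc.MetaPathFinder):
    """Resolve `import name` to `name.js` in a list of directories."""

    def __init__(self, search_dirs, evaluate):
        self.search_dirs = list(search_dirs)
        self.evaluate = evaluate

    def find_spec(self, fullname, path=None, target=None):
        # Only the last dotted component is used to build the file name.
        filename = fullname.rpartition(".")[2] + ".js"
        for directory in self.search_dirs:
            candidate = os.path.join(directory, filename)
            if os.path.isfile(candidate):
                loader = JsLoader(candidate, self.evaluate)
                return importlib.util.spec_from_loader(
                    fullname, loader, origin=candidate
                )
        # Not ours: let the other finders (or ImportError) handle it.
        return None
```

With something like sys.meta_path.append(JsFinder(["js"], evaluate)) in place, a plain import mylib would pick up js/mylib.js (a hypothetical file) as long as no regular Python module of that name exists, and a nose test could then make assertions against mylib.js_result. Appending to sys.meta_path, rather than inserting at the front, means ordinary Python modules always win over .js files.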

Have you ever run unit tests in JavaScript? It's clunky because you need to run them in a browser. That's how JSUnit works anyway, which is otherwise a passable testing environment. To automate such tests you have to use something like Selenium to run your test suite in a real web browser.