Science Hack Day, Turing Tests and Google

For Science Hack Day, I have been thinking about a topic that was of great interest to me whilst I was at university – artificial intelligence.

Science Hack Day hasn’t actually happened yet, by the way. It’s going on this weekend (19th & 20th June) at the Guardian offices, and there’s still time to sign up if you’re interested. This is an idea I was playing around with, but I probably won’t be doing this at the weekend unless it piques the interest of someone else with more linguistic intellect. Feel free to bug me if this is a topic you’d like to chat about.

The Turing Test

One of the basic concepts and experiments in the AI world is the now defunct, but intellectually and philosophically interesting, Turing Test. In the simplest terms, the test attempts to prove intelligence by demonstrating human characteristics through dialogue and natural language: genuine human testers are blindly pitted against either another real human being or a test program, and must guess whether their conversational partner is human or not. Every year, challengers from around the world still compete in this test, producing complex computer programs that can converse with human beings and nearly fool them into believing they too are human. No one has yet created a program that behaves accurately (or, more often, randomly) enough to fool participants completely – which is why it remains an interesting, although essentially irrelevant, problem.

The reason this test is defunct as a gauge of intelligence is pretty obvious in hindsight. Being able to converse like a human being might show that whatever is doing the conversing can take apart the constituent parts of a sentence and cobble them back together with some new information to fool a human, but it doesn’t really show other markers of intelligence – specifically the ability to think. Nor does an entity’s inability to converse in this way preclude it from having intelligence – you only need to look around our own animal kingdom to see the wealth of intelligence shown by organisms that have no verbal language. The ‘Chinese Room’ is the original thought experiment describing this specific problem, which you should totally go and read about right now.

Now, I’m not for one moment suggesting that over two days (or two lifetimes) a person such as myself, with no training in linguistics or complex algorithms, could create a program that could have a go at passing the Turing Test and winning the Loebner Prize. But I got to thinking about how people interact with the internet, and whether the internet itself could be considered to have the capabilities, and the depth and range of knowledge, to show ‘intelligence’ as Turing would have defined it through this test.

Google as an intelligent conversationalist

Go to Google and ask it a question – even better, ask it a question and hit ‘I’m feeling lucky’. Most of the time it produces an ‘answer’ that’s pretty bloomin’ accurate to what you’re looking for. Take a sample of the page that possibly directly answers that question, cobble it into some pidgin English, and would that do as a conversational retort? Reckon it could have a stab at knowing the punchline to your rubbish ‘Knock knock…’ joke? I think it could.
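
To make that concrete, here’s the loop I’m imagining, as a few lines of JavaScript – fetchLuckyResult and extractSample are entirely hypothetical stand-ins for whatever scraping would really be involved, not real Google APIs:

// Purely a sketch - fetchLuckyResult() and extractSample() are
// hypothetical helpers, not real Google APIs.
function retort(question) {
	var page = fetchLuckyResult(question); // the "I'm feeling lucky" result
	var sample = extractSample(page, question); // the bit that answers the question
	// cobble the sample into some pidgin English
	return 'Well, as far as I know, ' + sample;
}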

In fact, the sample questions from the Loebner Prize rules are all easily answerable by Google – the only thing it would struggle with is the memory part, but with Google’s ever-growing logging of the kind of information you search for, it’s only a short step away from that.

I was googling about, trying to find other people who must have been thinking about using search engines for Turing tests, and came across John Ferrara in 2008 discussing the user-interaction benefits of using search in a way that would produce Turing-test-ready results (I particularly like his accurate prediction that ontologies are the way forward – more on that later). Google is clearly doing some really interesting, and without doubt highly complex, things around parsing search terms and working out which are the interesting parts of the query. They’re doing natural language parsing, but just one way – from the asker to the responder.

Natural Language Parsers

So, I started digging about on the web for a natural language parser, to see if I could maybe package up Google results into one-line retorts. In JavaScript. Mostly because I’m a client-side developer, but also because it seemed like a funny idea (one late night, after a couple of Amstels) and JS can be lightning fast in the right environment. Unsurprisingly, there wasn’t one. I found a nice little ‘parts of sentence’ tagger that someone had ported from another project into JS, which seemed like a good start, and there’s OpenNLP – the open-source hub for NLPs (mostly in Java, Perl and Python). Then Jake suggested I port one of the Python ones to JS. Ah hah hah, where’s that <sarcasm> element when you need it?

The highly complex NLP part is really only the dressing. It’s the bit that does the fakery – that reacts and responds and produces pretend empathy – and it’s essentially what people who are trying to win the Loebner Prize care about. To be honest, there are already more real people behind machines to talk to on the internets than we really need, let alone adding a bunch of equally inane computer ones – so I’m not really interested in that to any complex level. I just need something relatively simple.

I am interested in mining the ever-growing source of richly marked-up data on the web, and presenting it back to a human being in a friendly, natural way. Basically, I want one of those slightly-sinister robot voices talking to me from my computer, as featured in all good sci-fis (maybe less Hal and more Gerty), who can coolly and calmly, for example, present me with the probable likelihood of poisoning myself by eating out-of-date eggs, or what factor suncream it might be wise to wear to the park tomorrow so that I don’t burn to a crisp. An information supplier and sympathiser that’s smarter than me, knows about more sources of information than I ever could, and can save me a bit of time wading through google results.

Let’s talk

So, on to my fuzzy notion of how this might work, just as a thought experiment at first and maybe a slightly naff proof of concept.

Blindly searching google for sensible responses from any old web page seems foolish. An awful lot of sites continue to be badly formed and unintelligible to machines. The sensible thing to do is restrict searches to sites with well-formed data – microformats and RDF seem like the obvious things to look for. This clearly poses a slight problem, in that not all topics exist as well-formed data yet, but over time that’ll improve. To make this proof of concept easier – and one that I could feasibly think about building in a weekend – I’m therefore going to limit the topics of interest to data I know I can get at in a well-formed model.

Let’s have a chat about food. I’m going to propose a fictional conversation that I want to create the responses to automatically.

Maybe we want to ask our machine:

Do you know any good vegetarian recipes?

A good response might be:

Yes, I know 20582746 vegetarian recipes. Do you want to narrow it down a bit?

Yes, I’m looking for a good recipe for a feta and spinach tart.

I have a good recipe for that. Would you like me to give you a link to it, or just tell you the ingredients?

I want to stop there and illustrate a couple of interesting things about these sentences. Firstly, the word ‘good’. How could a machine know if a recipe is good? Well, hRecipe allows for a recipe to receive a rating – the machine could use this to determine whether to describe the recipe it’s found as ‘good’. Likewise, I could have asked it ‘What’s the worst meal you’ve eaten?’, and perhaps it trawls off for the lowest-rated recipe it can find and declares that its least favourite. It kind of makes me think that this machine person would need to be called Legion, because rather than having the opinion of an individual (or rather, the opinion of the programmer), it has the crowd-sourced opinion of all web participants.
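
As a rough sketch of how that ‘good’ judgement might work in JavaScript – assuming the recipe’s rating is exposed in the markup under a .rating element with a numeric .value, hReview-style (those class names are my assumption, not a guarantee of what any given page uses):

// Sketch only: assumes hReview-style rating markup, e.g.
// <span class="rating"><span class="value">4.5</span> out of 5</span>
function isGoodRecipe(recipeElement) {
	var ratingEl = recipeElement.querySelector('.rating .value');
	if (!ratingEl) return false; // no crowd-sourced opinion to borrow
	var rating = parseFloat(ratingEl.textContent);
	return rating >= 4; // 'good' = rated 4 out of 5 or better by the crowd
}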

Great. Does it have tomatoes in it? I don’t like tomatoes.

No. Would you like the recipe now?

Yes, what are the ingredients?

And so on… Having a program read back the parts of a well-formed recipe is really easy. Recipes marked up as hRecipe clearly define each of the parts. You could ask it to read you step one of the method, or repeat step three, or double-check what temperature the oven needs to be at. To be honest, you could obviously be reading all that directly yourself, but marking up information like this makes it really easy to programmatically extract useful, relevant information out of a webpage, strap it into some semblance of natural English, and read it out to a person in such a way that they might believe a human being was interpreting the page – which they could well find more accessible. And that’s the ticket, really. Google search results, or rather the elements derived from rich data snippets, become the lexicon element of the previously mentioned NLPs.
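
For example, pulling the ingredients out of an hRecipe page and strapping them into a sentence is only a few lines of DOM work. A sketch, assuming the standard hRecipe class names (hrecipe for the recipe itself, fn for its title, ingredient for each item):

// Sketch: read an hRecipe-marked page back as natural-ish English.
function describeRecipe(doc) {
	var recipe = doc.querySelector('.hrecipe');
	var title = recipe.querySelector('.fn').textContent;
	var items = recipe.querySelectorAll('.ingredient');
	var ingredients = [];
	for (var i = 0; i < items.length; i++) {
		ingredients.push(items[i].textContent.replace(/\s+/g, ' ').trim());
	}
	return 'For ' + title + ' you will need: ' + ingredients.join(', ') + '.';
}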

Limitations

What it probably couldn’t do is tell you how it’s feeling or where it lives – the sort of questions and topics that turn up in the logs of Turing tests – but really, does it matter? It would probably also get confused really easily by badly formed pages, and it would just as happily give you bad, irrelevant or plain gibberish responses sometimes – but all computers will do that – which is all the more reason to make pages as well-formed and parsable as possible.

Even if my notion of a simple friendly-faced Google bot couldn’t pass the Turing Test, I bet that if Alan Turing had still been alive at the advent of Google and Wolfram Alpha and the like, he’d be bloody impressed, and pleased to know that he probably instigated some of it.

Which reminds me – June 2012 will mark Turing’s 100th birthday. Pretty sure we’ll need an extra special Science Hack Day for that too, don’t you think?

Leaving the BBC, joining Nature Publishing Group

It’s true – I am leaving the BBC! As of June 2nd, I’ll be a front-end developer at Nature.

The last three years at the BBC have been good ones. I think the quality and the massive range of products that have come out of the development teams have been amazing. It feels like everyone I’ve had the pleasure of working with at the Beeb has been smart and engaged, really got the web, and wanted to make cool things.

I’m certainly sad to be leaving. I’ll of course be missing the Glow Super Friends a lot, in particular, but I feel that I’ve made brilliant friends and connections in various corners of the company and there are many I hope to continue seeing a lot of and will no doubt get to work with again in the future. I leave knowing I’m going to miss everyone to pieces, but London really isn’t that big – so they won’t get rid of me too easily, even if I do have to stalk Vesbar.

But ever onwards – the season called for a change of scenery and getting a look at a whole new ecosystem of challenges. I think working at Nature will be great and I can’t wait to get stuck in. I only hope they’re ready for my special brand of optimism.

London Web Standards – slides and further info

Sorry for the delay, but I finally got around to sticking my presentation from last month’s London Web Standards meet-up on slideshare. Slideshare is a bit naff to be honest, but it’ll do for now. If you click through to the talk on slideshare, you’ll be able to get my notes, which should hopefully make the pictures more useful. Jake’s busy syncing up both of our presentations to the videos so that we can show them on the BBC developer blog; as soon as they’re available I’ll link them up too, and you can view me in full hand-flapping, ranting form.

I think I speak for both of us when I say that we really enjoyed the evening – everyone was lovely and friendly and asked really excellent questions. Highly recommendable meet-up, and we’re both intending to try and make it to some of the future sessions.

Some useful links from my stuff:

London Web Standards Talks

Jake and I will be guests at March’s London Web Standards meetup. We’re giving a pair of JavaScript-themed talks that should give plenty of fodder for the latter half of the evening’s discussion. I’m doing “Pro bun-fighting”, covering how we manage working on a large-scale JavaScript project with a small team: our process, the performance and quality testing we do, and how to integrate group hugs. Jake will be doing “The events left behind”, talking about the horrors of keyboard events, how to work around them, and what their future holds.

Although it’s not a Glow specific talk, we will be using Glow in our examples, so feel free to come along and talk to us about the library too, if you’re interested.

Tickets are available now for the event on Wednesday 31st March at The Square Pig in London.

The Christmas Bunny book prop & illustration work

One of the things I enjoy doing that isn’t web related is illustration, and last week I was asked to create a set of illustrations and a book prop for Patrick’s short film, The Christmas Bunny. The film was shot this past weekend and is now in the editing stages, but I thought I’d share some photos of the prop and illustrations.

Tooth Fairy book illustration

Children sleeping book illustration

See the rest of the shots on flickr.

Making the book

For those interested, the illustrations were drawn on white bristol board and inked with fast-dry black pigment liner, and then scanned and printed on to light-weight (80gsm) cream paper and cut to size with a craft knife. I then had some trouble figuring out the best way to attach the pages to the ancient book we found on ebay, without permanently damaging it.

I ended up bracing the illustration and text pages with extra blank sheets on either side, binding the edge with masking tape. Then I used some partially dried glue stick (pritt-like) which I could pinch pieces off and roll into sausage shapes and press into the masking-tape spine, to create a malleable, but strong, join for the pages to move on. No super-glues I had seemed to work as well as this rather Blue Peter-esque technique. The best thing about the glue-stick solution is that it rubs off the paper anywhere that it shows, so the join is seamless.

It was a nice little project and I’m really glad to have been able to contribute to the film in some way. The first two illustrations and title are used as the introduction to the film, with a narrative voice-over and music, and the final illustration is used as the outro. Hopefully I’ll get to do some more illustration work in the future.

Computer engineer Barbie

Barbie has her 125th career – computer engineer! There have been a few comments around about how Mattel are pandering to further stereotypes – that sticking her in a pair of pink glasses is enough to insinuate she’s now “intellectual”. I don’t think that’s all that bad. On the glasses thing, sure, I’m a bit biased, but I don’t see anything wrong with putting Barbie in a pair of specs for her computer engineering job. It’s not an entirely false correlation – many people who work on computers need glasses because they stare into the pixel void for 12 hours a day. So what? I think it’s kind of cute – and why not portray a computer engineer as cutesy? The fact is, that’s the only wearable “accessory” they felt she needed to portray her new job. That’s right, isn’t it? What more do you want? Computer engineers should look however they like – there’s no uniform. The bluetooth headset is a bit daft, but that’s a small detail.

Rachel Andrew blogged today about a very sad incident yesterday, where she and her fellow female speakers were mocked by audience members of Boag World’s live podcast event. Essentially, viewers in the backchannel decided to concentrate on their physical attributes rather than their well-educated views, with suggestions that they were far too good-looking and well presented to be there for their abilities alone.

Rachel has rightfully pointed out that such behaviour shouldn’t be tolerated, but she also writes about how women in technology shouldn’t be encouraged to dress down or become more tom-boyish just to feel accepted or to avoid attention.

Barbie has a whole host of more fundamental reasons why she’s probably a poor role-model for little girls (her figure is the obvious one), but I don’t think having her careers be varied and non-traditional is one of them. I’m actually into the idea of a Barbie that helps to say that it’s okay to be as girly-a-girl as you want to be and work in traditionally male-dominated industries. And hey, I think glasses look cool.

Writer’s block and Project52

This year, I thought it might be fun to try taking part in something that would get me writing more. Anton Peck started Project52, with a simple aim: to produce a blog post a week for all of 2010.

It’s hard. Really hard.

It’s week 5, and although I generally suffer various rage-related incidents* over the course of a week, nothing has presented itself as particularly bloggable. Likewise, work has been fairly unspectacular and I’ve not been especially creative, so I’m lacking anything of true substance to talk about or teach. Next week should be better, as there’s an upcoming event I’ll be involved with, and I’ll have produced some extra-curricular illustration commissions I’d like to share.

I asked twitter – the natural home for people who don’t know what to talk about – and the suggestions came back that I just get this stupid meta-post over and done with and talk about writer’s block (cheers Olly and Craig).

I like writing. I don’t think I’m particularly good at it, but I can string a few words together into something that vaguely resembles prose. Finding topics that haven’t already been talked about excessively in the web world is just an especially difficult challenge.

Only today, a mailing list I frequent has been discussing how difficult it can be to stay motivated and interested in a field that’s coming out of its emergent phase. Finding a cause that doesn’t already have more than enough band-wagoners is rare, finding something unique to add is unusual, and perhaps it’s all feeling less ground-breaking. There’s less to do for the individual as more hands come on deck. Ultimately, this is super for the web but not so good for personal satisfaction, in my opinion. The word “jaded” was used, but I think (and hope) it’s a bit early for that.

Finally, a suggestion from Matt:

@phae Ask for suggestions of what to write about. :)

matthewpenell

So, dear readers (probably, mostly, I should just address this to “mum”), anything I’ve hinted at in the past that you’d like me to elaborate on in the future? I know it’s a cop-out, and it’s lame to ask, but hey… you never know, it might work.

* OK, here’s a little bonus list of things that have made me want to strangle people this week:

  • CSS3 being compared directly to Java Applets – please, people. Let’s at least let it out of the stalls before we condemn it.
  • More pro-homeopathy articles, the Pope, the Daily Mail, the usual.
  • Email responses to technical debates that consist of nothing more than off-topic quotations.
  • Latest version of Chrome reporting unexpected background-repeat values in JS-land.
  • iFrames.
  • A guy on the tube who complained about the placement of someone’s feet (they were a cm too close to his) and the crowded nature (it wasn’t very crowded) of the carriage.

A brief word on homeopathy

I’m generally completely non-plussed about petitions and marches and all that freedom of speech type gubbins that angry people get involved with all too easily. I think you should pick your battles and save up your bile and wit for when it really counts. But there’s something about the 1023 campaign that really strikes a chord with me. There were government reviews in 2009 as to whether the NHS should continue to fund homeopathy, so I think this could be the year we see it finally get cut, and I’m happy to help tip the balance by picking a side.

If you haven’t already stumbled across the many manic rants about homeopathy, and why it’s such a ludicrous load of rubbish, then here’s a selection of posts I could recommend (Update: Here’s an excellent one from New Scientist today that covers everything up to now). The videos on the 1023 site alone are good and will help explain things quickly and often hilariously.

This weekend, 1023 has organised an active protest aimed at Boots. Around the country, objectors to homeopathy will be necking a whole packet of homeopathic pillules (sugar pills) to show that there are no actual active ingredients in them, since they won’t be keeling over (I shall be amongst them). Homeopaths are already making defensive statements to suggest that this won’t prove anything (they know as well as we do that the pills won’t have any effect), because without a trained homeopath prescribing the correct pills for the correct illness, it won’t work (something to do with it being like having an allergy, or you have to have the right illness for the right pill for the magic to work… I don’t quite get it). In my mind, that weakens their argument even more, since Boots sell non-prescribed pillules, without advice, to anyone – so they shouldn’t work for those non-protesting people either.

I love the NHS. It’s one of the main reasons I’m walking around now having a generally jolly good time of it. I think as a nation we’re proud of it and what it provides for us, but as with most things, it’s under-funded. Something like four million pounds a year goes into funding homeopathy treatments and hospitals. If you take a look at the research on homeopathy, it’s just an elaborate placebo effect, and it seems a lot of homeopaths don’t even deny this – they say it’s the act of caring and talking, and the long appointments people get to have, that help make patients feel better. So I’m all for scrapping the lunacy and putting that money into therapies and counsellors. Should have a similar sort of result, no?

Anyway – all that, and all the obvious nonsense, aside – I’m still left with my biggest issue with the topic. If you want to take a “them” and “us” approach to the argument, my problem is with some of the people on “our” side. You’ll see comments on articles and posts all the time that go something along the lines of “Who cares? It’s charlatans selling pills to fools”. Sounds fair, right? I don’t agree. Charlatans: yes, generally. Fools: I don’t think so. I think consumers and patients are well within their rights to follow recommendations.

Take Boots, for example. Although clearly a commercial entity first, they still have a role in our world as a trusted pharmacy with a brand we recognise. Is it so wrong that people should trust a pharmacy to sell pills that have some efficacy? Could you honestly say that you understand how the paracetamol or aspirin you take works? What its chemical structure is? How it’s produced? What it does to your body? You take them regardless, because you trust that those tablets have been tested to be safe, effective and reliable. We’re not expected to be experts on medicine. We don’t have to be, because we rely on trained professionals to direct us. When the NHS provides money to a practice that is unproven, who are we, as consumers of the NHS, to question what appears on the surface to be a funding-based seal of approval? Call people fools if you like once they’ve been shown and had to confront the science, but you can’t pin that label on the general man-on-the-street consumer.

Someone once asked me whether, for all the same reasons, we should be up in arms over anti-aging face creams with false claims. My response: I’ll start caring if they falsely claim to cure your illnesses too. Buying a cream and still having a few wrinkles isn’t likely to be fatal, but taking a few sugar pills in place of prescribed and proven medication when you’re seriously ill just might be.

Cold-calls and Madison Maclean recruitment

Update: I did email Madison Maclean to complain, with a link to this post, on the 19th January, and as of a week later I still haven’t had a response.

Update 2: Today, 19th October 2010, I received an email from Andrew Holden’s manager. He requested that I remove this blog post. Since these events did happen, I won’t be removing it. I am, however, adding this comment to say that Andrew no longer works at the company (and hasn’t for some time), although I have no confirmation that he was the same person who cold-called me. Whether you choose to work with them in the future is entirely up to you. I was not offered an apology.

Original post

I think it’s fairly well agreed that recruitment agents aren’t particularly nice to deal with – especially when they’re cold-calling you at your place of work about jobs you’re not interested in. I will just say that I have worked with one nice agent, who got me the interview for my current job at the BBC, but she had the right knowledge at the right time, when I asked for it.

So, a little story about yesterday and what not to do if you’re contacting me.

Around 4.30pm my office phone rang (I don’t publish this number, and I’d have trouble reciting it myself – you can only get it from the outside by calling reception and asking to be put through). My phone never rings at work, except for cold-calls from recruitment agents, so I’d already anticipated answering, saying “no thanks” and hanging up. A 30-second call at best.

Instead, I answer and the chap on the end of the line does his usual spiel about who he is, where he works and whether I’m available to talk at the moment. I answer: “No, not really. As you can probably tell, I’m at work, and also, you’re a recruitment agent and I’m not looking for work at the moment, so I’ll save you the time and say ‘no thank you’.”

“But you haven’t heard what I want to offer yet…”

OK, that’s true, but I still wasn’t interested, thanked him for the call and, as a parting question, enquired as to where he’d got my number.

“From your linkedin profile”.

Now, I’d like to direct everyone to my linkedin profile. Click through to contact me and find the relevant section. What does it say? You have to be a member to take a look, so to save you the time, this is what it says:

Please email me. Calling my company and getting to my desk phone via the switchboard is unappreciated (and this keeps happening, so stop it). *NO* cold-calls from agencies.

To be fair, I added the final line about cold-calls yesterday evening, but the rest was there about not calling my switchboard and to email me.

So, having caught him out in a lie, I expected an apology, or at least some sort of sign that he’d become confused or disorientated, and I pointed out that I expressly say I do not want people to phone me. No – instead, he said: “Well, I might have got it from a colleague that you’d previously spoken to, but you’re clearly all over the internet. You’re inviting people to phone you, and you shouldn’t expect people not to. I’m perfectly within my right…”. I’m sorry, what? I *invite* recruitment agents who can’t be bothered to read my profiles properly to cold-call me about jobs I’m clearly not interested in? I corrected him, he continued to argue the point, and at that point I decided this wasn’t worth my time of day, thanked him again for his call and hung up on him mid-sentence. That might have been rude of me, but not half as rude as he was. I quickly vented on twitter, and some of the responses I received were interesting:

Some choice coloquialisms seem warranted. In the King’s English, you might remind them that a combative cold call accomplishes nothing

by erickolb

wow. does this mean you need to have a ‘no recruitment agencies pls’ signature appended to every online post you make?!

by gradualist

there’s something about that argument that strikes me as a bit—for want of a more appropriate adjective—“rapist-y”.

by fatbusinessman

Reading that again, it sounds like a horrible rape defence

by jaffathecake

From what I can tell, part of a recruitment agent’s job is building up a relationship and rapport with potential candidates. Cold-calling, lying to, and arguing with potential candidates does not seem like the fastest way to build a lasting bond.

Identifying the company

I had to google the company to remember who it was that had called me, as I’d seen red and forgotten exactly who it was. I’m so used to just saying “no thanks”, hanging up and that being the end of the story. I knew it was Madison-something, but couldn’t recall which. It turns out there’s a whole ton of Madison-something recruitment agencies in London alone, each apparently specialising in IT. There are too many companies named Madison-something, too many agencies, and too many that specialise in the same thing – so I can imagine the area is highly competitive.

He had hinted that he’d got my name from a colleague, so I searched my inbox for “Madison” and, bingo, I had an email from an Andrew Holden at Madison Maclean, who had emailed me in 2008 with job opportunities in the banking area. I can’t recall if that’s the name of the guy who called me, but he’s certainly the person I responded to with “I’m not currently seeking employment at this time, and probably not ever in the financial services area.” – so I’m happy to let him take the blame for not taking me off their books (books I never signed up to in the first place).

As a favour to me, could you avoid Madison Maclean if you’re job hunting? Thanks.

JavaScript speed testing tutorial with Woosh

Friend and colleague Jake Archibald has been developing Woosh, a JavaScript speed-testing framework. Essentially, it’s been developed for Glow, because we want to make sure that Glow 2 kicks Glow 1’s ass (and that of anyone else who fancies a piece), but he’s open-sourced the work to let everyone benefit from it.

I thought I’d run you through how to set up some basic tests and start benchmarking your own code with Woosh. Bear with me, as it’s still quite new to us too.

Setup

Firstly, go and grab the latest copy of Woosh from the Github repo and pop it somewhere to work with it. You’re just running scripts, so there’s nothing to install or configure. Bear in mind that at the time of writing, Woosh isn’t at its first version yet – so, not that I’m doubting Jake’s work, you may find the odd bug, and if you do, I’m sure logging it in the issues tracker would be marvellous.

If you’re a git user, feel free to include Woosh as a submodule of your own project.

Woosh is primarily designed for comparing libraries, but there’s no reason why you can’t use it to take a benchmark of your existing scripts and then work up optimised versions to compare. If your code can be unit tested well, it can be speed tested just as easily.

Now, the first thing to do is let Woosh know about the scripts you want to test. You can add references to each of your scripts using the woosh.libs property. Just make sure each script to test has a unique name so you can reference it later (have a sneaky look in woosh.js to see which libraries already exist and the formats used – in fact, if you’re taking your own copy of Woosh, you can add your scripts straight into this file and skip adding them in the test runner page).

Below is how your test runner HTML page should look (also available in the examples directory of the Woosh repo). Notice the reference to your script in the woosh include section, with links to your individual test files beneath. To make things more manageable, it’s probably best to have one test JS file for each script you’re comparing. Remember that Woosh looks for your scripts relative to where you’ve got woosh.js.


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8">
	<title>My Tests</title>

	<!-- include woosh -->
	<script src="/where/youve/got/it/lib/woosh/woosh.js" type="text/javascript">
		woosh.libs['myTestScript1'] = ['/path/to/scripts/myTestScript.js']
	</script>

	<!-- Add any CSS you need for the test, but restrict styles to #htmlForTest -->

	<!-- Add your tests. The first will be treated as the master -->
	<script src="MyTestScipt1-Tests.js" type="text/javascript"></script>

</head>
<body>
	<div id="wooshOutput"></div>
	<div id="htmlForTest">
		<!--
			Put elements you want to use in your tests here. The page will be
			refreshed for each set of tests, so don't worry about one framework
			messing with another.
		-->
	</div>
</body>
</html>

The final item in the example above is your initial test script to be benchmarked (MyTestScript1-Tests.js). This is just a JavaScript file which calls woosh.addTests (as shown further down).

Now you’ve got a choice: you can either make minor changes and incrementally watch the improvements, possibly using the save feature, or you can create a copy of your script. I’d recommend the latter, so create a copy of your script, add a reference to it with woosh.libs again (as below), and create a file to hold the actual tests for it.
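
Registering the copy is the same one-liner pattern as in the test runner page above – the name and path here are just examples:

woosh.libs['myTestScript2'] = ['/path/to/scripts/myTestScript2.js'];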

You can add and compare as many scripts as you like, so long as their methods are directly comparable.

Creating tests

Adding tests is easy and in a way, they become an extension of your unit tests, confirming that the return values or behaviours match across the board.

Test files look like this. You can either put all your tests in one file, with a block for each script, or put each block in its own file. Below are the contents of MyTestScript1-Tests.js; you’ll need a second file for MyTestScript2-Tests.js and so on (there’s a sketch of that second file after the example).


woosh.addTests('myTestScript1', {
	'Test name identifier 1': new woosh.Test(1000, function() {
		return myFunc();
	}),
	'Test name identifier 2': new woosh.Test(1000, function() {
		return myOtherFunc();
	})
});
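
For completeness, here’s a sketch of what that second file, MyTestScript2-Tests.js, might look like – the same test names and iteration counts, but registered under the copy’s name from the woosh.libs snippet above:

// MyTestScript2-Tests.js - same test names and iteration counts,
// but pointing at the optimised copy of the script.
woosh.addTests('myTestScript2', {
	'Test name identifier 1': new woosh.Test(1000, function() {
		return myFunc();
	}),
	'Test name identifier 2': new woosh.Test(1000, function() {
		return myOtherFunc();
	})
});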

Things that matter about these files:

  1. The name identifier needs to match for each of the tests. Woosh isn’t looking for them in order – it matches on the names to know which results should go into each row, i.e. “Test name identifier 1” should be the same in all test files for matching tests of that function.
  2. The first parameter of addTests should be the name you gave the script in woosh.libs, so Woosh can find your script.
  3. The first parameter of woosh.Test is the number of times a test is to be run. This should be the same for sets of tests for the same thing. If it’s not, Woosh will flag up the test as being unfair.

The value for the iteration count is important. It’s large because that helps shake out inaccuracies. Woosh will run the test the number of times specified, then divide the result by this number to give the average run time for that function. You may find that some browsers don’t cope so well with very large iteration counts (uh… IE, we’re looking at you), so don’t go mad with it and think that running a test a million times will help your accuracy. On Glow, we tend to run tests between 100 and 10,000 times.

Saving tests

You can save one previous set of tests by clicking the floppy-disk icon. It’s just stored in a cookie, and will be over-written if you choose to save another column of tests, but it’s useful if you’re making small changes and want to compare before and after.

The hard work

Now, of course, it’s down to your hard work. Writing the speed tests is really the easy bit, made all the more so by the simplicity of Woosh. Try your optimisations in the second script and use Woosh to benchmark the new version against the old one. All you need to do is load up the test runner page and hit start. The results will pop up as they complete, and become colour-coded as they’re compared. Keep an eye out for tests that error (they’ll go slate grey) or test titles that turn yellow (click a title to expand further test information). Either of these can indicate that a test isn’t fair: the iteration values don’t match, the return values aren’t the same, or a method has failed altogether. You should aim to have all your tests running without errors or warnings.

Another thing to note is that you’ll still need to run all of these tests in all of the browsers you want to optimise for. You’ll find massive variance in some cases, and it’ll be up to you to decide where to keep the speed. Jake’s Full Frontal presentation covers some of the things to look out for, so it’s definitely worth a look over (most importantly, make sure you’re not running developer tools like Firebug when running your tests, since they’ll skew your results quite heavily).

Further reading

If you want to have a look at some real tests, Glow 2 has a fair few now for some of the basic modules. They’re all up on github, so have a dig around or feel free to clone the repo and run the tests yourself.

The full API has been documented for Woosh, too, although I believe that might be a bit of an exclusive, as I can’t see a reference to it in the github docs at the moment. I recommend taking a look through to read about running async tests and preparing your code with setup functionality using $preTest, as well as a few other features you might find useful.

On another testing topic, Mat Hampson published an article on A-B testing on the new BBC Web Developer blog.