For Science Hack Day, I have been thinking about a topic that was of great interest to me whilst I was at university – artificial intelligence.
Science Hack Day hasn’t actually happened yet, by the way. It’s going on this weekend (19th & 20th June) at the Guardian offices, and there’s still time to sign up if you’re interested. This is an idea I was playing around with, but I probably won’t be doing it at the weekend unless it piques the interest of someone else with more linguistic intellect. Feel free to bug me if this is a topic you’d like to chat about.
The Turing Test
One of the basic concepts and experiments in the AI world is the now defunct, but intellectually and philosophically interesting, Turing Test. In the simplest terms, the test attempts to gauge intelligence by the display of human characteristics through dialogue and natural language: genuine human testers are blindly pitted against either another real human being or a test program, and have to guess whether their conversational partner is human or not. Every year challengers from around the world still compete in this test, producing complex computer programs that can converse with human beings and nearly fool them into believing they too are human. No one has yet created a program that behaves accurately – or, more often, randomly – enough to fool participants completely, which is why it remains an interesting, although essentially irrelevant, problem.
The reason this test is defunct as a gauge of intelligence is pretty obvious in hindsight. Being able to converse like a human being might show that whatever is doing the conversing can take apart the constituent parts of a sentence and cobble them back together with some new information to fool a human, but it doesn’t show other markers of intelligence – specifically the ability to think. Nor does an entity’s inability to converse in this way preclude it from having intelligence – you only need to look around our own animal kingdom to see the wealth of intelligence shown in organisms that have no verbal language. The ‘Chinese Room’ is the original thought experiment describing this specific problem, and you should totally go and read about it right now.
Now, I’m not for one moment suggesting that over 2 days (or 2 lifetimes) a person such as myself, with no training in linguistics or complex algorithms, could create a program that could have a go at passing the Turing Test and winning the Loebner Prize. But it got me thinking about how people interact with the internet, and whether the Internet itself could be considered to have the capabilities – and the depth and range of knowledge – to show ‘intelligence’ as Turing would have defined it through this test.
Google as an intelligent conversationalist
Go to Google and ask it a question – even better, ask it a question and hit ‘I’m feeling lucky’. Most of the time it produces an ‘answer’ that’s pretty bloomin’ accurate to what you’re looking for. Take a sample of the page that possibly directly answers that question and cobble it into some pidgin English – would that do as a conversational retort? Reckon it could have a stab at knowing the punchline to your rubbish ‘Knock knock…’ joke? I think it could.
In fact, the sample questions from the Loebner Prize rules are all easily answerable by Google – the only thing it would struggle with is the memory part, but with Google’s ever-growing logging of what kind of information you search for, it’s only a short step from that too.
I was googling about, trying to find other people who must have been thinking about using search engines for Turing tests, and came across John Ferrara in 2008 discussing the user-interaction benefits of using search in a way that would produce Turing-test-ready results (I particularly like his accurate prediction that ontologies are the way forward – more on that later). Google is clearly doing some really interesting, and without doubt highly complex, things around parsing search terms and working out which parts of the query are the interesting ones. They’re doing Natural Language Parsing, but only one way – from the asker to the responder.
Natural Language Parsers
The highly complex NLP part is really only the dressing. It’s the bit that does the fakery – that reacts, responds and produces pretend empathy – and it’s essentially what the people trying to win the Loebner Prize care about. To be honest, there are already more real people behind machines to talk to on the internets than we really need, let alone a bunch of equally inane computer ones, so I’m not really interested in that to any complex level – I just need something relatively simple.
I am interested in mining the ever-growing body of richly marked-up data and sources on the web, and presenting it back to a human being in a friendly, natural way. Basically, I want one of those slightly-sinister robot voices talking to me from my computer, as featured in all good sci-fis (maybe less Hal and more Gerty), who can coolly and calmly, for example, tell me the probable likelihood of poisoning myself by eating out-of-date eggs, or what factor suncream it might be wise to wear to the park tomorrow so that I don’t burn to a crisp. An information supplier and sympathiser that’s smarter than me, knows about more sources of information than I ever could, and can save me a bit of time wading through Google results.
So, on to my fuzzy notion of how this might work, just as a thought experiment at first and maybe a slightly naff proof of concept.
Blindly searching Google for sensible responses from any old web page seems foolish. An awful lot of sites continue to be badly formed and unintelligible to machines. The obvious thing to do is restrict searches to sites with well-formed data – microformats and RDF seem like the obvious things to look for. This poses a slight problem in that not all topics exist as well-formed data yet, but that will improve over time. To make this proof of concept easier – something I could feasibly think about building in a weekend – I’m therefore going to limit the topics of interest to data I know I can get at in a well-formed model.
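As a rough sketch of that restriction step, here’s how the bot might decide whether a page is worth talking about at all – by checking for a microformat root class before trying to parse anything. (This is a crude regex check for illustration only; a real version would use a proper microformats parser. The sample markup is invented, though `hrecipe` is the real hRecipe root class name.)

```python
import re

def has_hrecipe(html: str) -> bool:
    """Crude check for an hRecipe root element in a page's markup.

    Looks for the 'hrecipe' root class name on any element; pages
    without it get discarded rather than risked as gibberish replies.
    """
    return re.search(r'class="[^"]*\bhrecipe\b[^"]*"', html) is not None

# A well-formed page versus a bare one (markup invented for illustration):
good_page = '<div class="hrecipe"><h1 class="fn">Feta and spinach tart</h1></div>'
bad_page = '<div><h1>Feta and spinach tart</h1></div>'

print(has_hrecipe(good_page))  # True
print(has_hrecipe(bad_page))   # False
```

The same gatekeeping idea would extend to other vocabularies – hReview, hCalendar, RDFa – one check per topic the bot claims to know about.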
Let’s have a chat about food. I’m going to propose a fictional conversation that I want to create the responses to automatically.
Maybe we want to ask our machine:
Do you know any good vegetarian recipes?
A good response might be:
Yes, I know 20582746 vegetarian recipes. Do you want to narrow it down a bit?
Yes, I’m looking for a good recipe for a feta and spinach tart.
I have a good recipe for that. Would you like me to give you a link to it, or just tell you the ingredients?
I want to stop there and illustrate a couple of interesting things about these sentences. Firstly, the word ‘good’. How could a machine know if a recipe is good? Well, hRecipe allows a recipe to carry a rating – the machine could use this to decide whether to describe the recipe it has found as ‘good’. Likewise, I could have asked it ‘What’s the worst meal you’ve eaten?’ and perhaps it trawls off for the lowest-rated recipe it can find and declares it its least favourite. It kind of makes me think this machine person would need to be called Legion, because rather than having the opinion of an individual (or rather the opinion of the programmer), it has the crowd-sourced opinion of all web participants.
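A minimal sketch of that crowd-sourced opinion, assuming ratings have been scraped from marked-up pages onto a 1–5 scale (the scale and thresholds here are arbitrary choices of mine, not anything the markup mandates):

```python
def describe_quality(ratings):
    """Turn a list of crowd-sourced ratings (assumed 1-5) into an
    adjective the bot can use in conversation. The thresholds are
    arbitrary: this is Legion's opinion, averaged from everyone's."""
    if not ratings:
        return "untested"
    average = sum(ratings) / len(ratings)
    if average >= 4:
        return "good"
    if average <= 2:
        return "bad"
    return "average"

# e.g. ratings gathered from several pages embedding the same recipe
print(describe_quality([5, 4, 4]))  # "good"
print(describe_quality([1, 2, 2]))  # "bad"
print(describe_quality([]))         # "untested"
```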
Great. Does it have tomatoes in it? I don’t like tomatoes.
No. Would you like the recipe now?
Yes, what are the ingredients?
And so on… Having a program read back the parts of a well-formed recipe is really easy. Recipes marked up as hRecipe clearly define each of the parts. You could ask it to read you step one of the method, or repeat step 3, or double-check what temperature the oven needs to be at. Obviously you could be reading all of that directly yourself, but marking up information like this makes it really easy to programmatically extract useful, relevant information from a webpage, strap it into some semblance of natural English, and read it out in such a way that a person might believe a human being was interpreting the page – which they could find more accessible. And that’s the ticket, really. Google search results – or rather the elements derived from rich data snippets – become the lexicon of the previously mentioned NLPs.
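The extraction-and-read-back step might look something like this. The recipe fragment is invented, but `fn` and `ingredient` are real hRecipe property class names; again, a proper microformats parser would replace the regex in anything beyond a weekend hack.

```python
import re

# A minimal hRecipe fragment (recipe invented for illustration;
# 'fn' and 'ingredient' are genuine hRecipe property classes).
PAGE = """
<div class="hrecipe">
  <h1 class="fn">Feta and spinach tart</h1>
  <ul>
    <li class="ingredient">200g feta</li>
    <li class="ingredient">150g spinach</li>
    <li class="ingredient">1 sheet shortcrust pastry</li>
  </ul>
</div>
"""

def extract(prop, html):
    """Pull the text of every element carrying a given property class."""
    return re.findall(r'class="%s">([^<]+)<' % prop, html)

def answer_ingredients(html):
    """Strap the marked-up parts into a semblance of natural English."""
    name = extract("fn", html)[0]
    ingredients = extract("ingredient", html)
    return "For %s you will need: %s." % (name, ", ".join(ingredients))

def contains(html, food):
    """Answer 'does it have X in it?' from the ingredient list."""
    return any(food in item.lower() for item in extract("ingredient", html))

print(answer_ingredients(PAGE))
print("Yes, it does." if contains(PAGE, "tomato") else "No, it doesn't.")
```

The same `extract` call would serve the other questions in the dialogue – step 3 of the method would just be the third element extracted for the `instructions` property.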
What it probably couldn’t do is tell you how it’s feeling or where it lives – the sort of questions and topics that turn up in the logs of Turing tests – but really, does it matter? It would probably also get confused really easily by badly formed pages, and would just as happily give you bad, irrelevant or plain gibberish responses sometimes – but all computers do that, which is all the more reason to make pages as well-formed and parsable as possible.
Even if my notion of a simple friendly-faced Google bot couldn’t pass the Turing Test, I bet that if Alan Turing had still been alive at the advent of Google and Wolfram Alpha and the like, he’d be bloody impressed, and pleased to know that he probably instigated some of it.
Which reminds me – June 2012 will mark Turing’s 100th birthday. Pretty sure we’ll need to have an extra special Science Hack Day for that too, don’t you think?