On Wolfram Alpha and the Singularity
Although Wolfram Alpha (WA) entered the tech ideaspace as a revolutionary new tool, it is the latest iteration of a solution to an old problem – the process of question answering (think ask.com), a part of the larger field of information retrieval (think Google).
Other competitors in this field include Powerset, a startup recently acquired by Microsoft, which retrieves Wikipedia articles using algorithms based on fuzzy logic. Unfortunately, for now this is hardly better than searching Wikipedia for the keyword yourself. Another service, lexxe, discovers clusters of similar answers among the websites indexed by Google; a neat idea, but the results it produces are vague and still require human interpretation. Finally, there is MIT’s START, which uses a small corpus of websites it knows how to parse and delivers definite, accurate answers within a limited ontology.
WA takes a more ambitious approach. Its corpus is a database of millions (!) of human-curated entries in a pre-defined, computable format. These are likely supplemented with ontological metadata. That its corpus is human-curated turns out to be WA’s greatest strength as well as its greatest limitation.
Until WA’s launch, I had no idea what to expect. I’m too cynical and practical to expect that I could type in “universal physics theory” and have WA analyze its corpus and come up with an answer. I also understood that a human-curated corpus could not be unlimited. I was left not knowing what sorts of queries would be answerable. Wolfram’s explanatory screencast got my hype-machine running: it demonstrated a tool that could answer questions like “how many internet users are there in Europe?” and “weather in champaign when mathematica released”. Although limited in scope, this made WA appear to be a great asset in understanding the world around us in new contexts. And at times it works very well, even with the simplest of searches.
WA on One Trillion
With a minimum of human prodding, it can produce exciting connections and results.
Where WA falls short in practical use, however, is in its user interface and its limited corpus. The limitations of its corpus are no surprise – every bit of data from which it draws is human-curated. To its credit, the examples page lists all the topics on which it has information. The human touch is wonderful in that the data is well structured and more accurate than Wikipedia, but it’s clearly not scalable if the goal is to approach the limits of all available quantitative data.
The interface problems, however, are less excusable. In addition to having the same limitations we’re used to in natural language processing, it gives almost no useful feedback. Did it have problems with keywords? What were the problematic aspects of the query that it could not parse? Could it parse the query but lack relevant information in its databases?
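To make the complaint concrete, here is a minimal sketch of the kind of feedback I have in mind. It is not how WA works internally; the corpus, the query grammar, and every name below are assumptions of mine, and the numbers are placeholders rather than real data. The point is only that "could not parse" and "parsed, but no data" are different failures and can be reported differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ANSWERED = "answered"
    UNPARSED = "could not parse the query"
    NO_DATA = "parsed the query, but the corpus has no matching entry"


@dataclass
class Result:
    status: Status
    detail: str
    value: Optional[float] = None


# Hypothetical curated corpus: (property, entity) -> value.
# Entities and numbers are placeholders, not real statistics.
CORPUS = {
    ("population", "atlantis"): 1000.0,
    ("population", "lilliput"): 250.0,
}

# Keywords the toy parser recognizes, whether or not data exists for them.
KNOWN_PROPERTIES = {"population", "gdp"}


def answer(query: str) -> Result:
    """Answer queries of the form '<property> of <entity>', explaining any failure."""
    words = query.lower().replace("?", "").split()
    if len(words) != 3 or words[1] != "of":
        return Result(Status.UNPARSED, "expected '<property> of <entity>'")
    prop, _, entity = words
    if prop not in KNOWN_PROPERTIES:
        return Result(Status.UNPARSED, f"unknown keyword: {prop!r}")
    value = CORPUS.get((prop, entity))
    if value is None:
        return Result(Status.NO_DATA, f"no {prop} entry for {entity!r}")
    return Result(Status.ANSWERED, f"{prop} of {entity}", value)


if __name__ == "__main__":
    for q in ("population of atlantis", "gdp of atlantis", "mood of atlantis"):
        r = answer(q)
        print(f"{q!r}: {r.status.value} ({r.detail})")
```

Even a toy like this tells the user whether to rephrase the question or give up on the topic, which is exactly the distinction WA leaves you guessing about.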
Whereas Google assumes you have the brain and it has the knowledge, WA tries to cover much of the ground in between. It’s great when it answers “what is the gdp of france / italy” in one query, as opposed to two Google queries and some mental math, but it is ultimately frustrating to work with since its ‘brain’ is less capable than our own. This makes using WA like interacting with an autistic savant – it only understands you some of the time, can’t tell you why it can’t understand you, and doesn’t have any regard for things outside of its corpus.
The execution of this project reminds me of Wolfram Tones, Stephen Wolfram’s attempt to make a generative music algorithm in over a dozen musical styles. Wolfram Tones was ambitious, yet so over-simplified that its output was almost completely uninteresting. It defined genre by the most rudimentary properties (e.g., guitar+drums instrumentation on a 4/4 beat = rock) and music as any repetitive sequence of notes that followed basic music theory. The resulting sounds were as offensive to the concept of art and music as they were to the ears. Wolfram Alpha, despite its uses, is similarly offensive to the concept of knowledge.
This has interesting implications regarding the progress of AI. Granted, WA makes no claims of having any AI whatsoever; that developing an AI or using an existing AI system was deemed inappropriate is telling. What this does show, however, is the result of a team of hotshot programmers with the resources of a large bank account and a finite corpus under a defined ontology working towards computer understanding, or a simulation of it. And in that realm, WA is impressive only to people who appreciate how difficult it was to implement what is there. On a purely practical level, it’s a nice tool that can do interesting things with numbers, but it obviously does not know why it is doing anything that it does. WA is one of the more ambitious projects in computer understanding, and it makes incremental improvements. Wolfram acknowledges this, admitting that it “almost gets us to what people thought computers would be able to do 50 years ago”. What does that say about Kurzweil’s estimation that we’re only 10 years away from the Singularity?