Thursday, February 17, 2011

Oh, no no no no

I'm reading Final Jeopardy: Man vs. Machine and the Quest to Know Everything, the book about making Watson. In Chapter 1, the author is discussing the problem of simply parsing the question "What is Francis Scott Key best known for?" (a question that derailed Watson's precursor in 2005 tests). After explaining how the computer has to figure out what the question is even about - for instance, is "Key" part of a name or something that Francis Scott made or used? - he points out that a person will either know or not know that Key wrote the words to the US national anthem, but in neither case will spend time wondering if pies or locks figure into it.

All well and good, and even interesting. But then he says:
For the machine, things only got worse. The question lacked a verb, which could disorient the computer. If the question were, "What did Francis Scott Key write?" the machine could likely find a passage of text with Key writing something, and that something would point to the answer. The only pointer here -- "is known for" -- was maddeningly vague. Assuming the computer had access to the Internet (a luxury it wouldn't have on [Jeopardy]), it headed off with nothing but the name. In Wikipedia, it might learn that Key was "an American lawyer, author and amateur poet, from Georgetown, who wrote the words to the United States national anthem, 'The Star-Spangled Banner.'" For humans, the answer was right there. But the computer, with no verb to guide it, might answer that Key was known as an amateur poet or a lawyer from Georgetown. In the TRec competitions, IBM's Piquant botched two out of every three questions.
Again, this is fascinating. But.


Dammit, there is too a verb there. Even if you stubbornly refuse to accept "is known for" as a phrasal verb, even if you refuse to take "known" as a participle and insist it's an adjective, what is that "is" in there? Chopped liver?

I'm sure what he's trying to contrast here is the presence of a good, content-full lexical verb like "write" with the copula, whose function some people argue is merely to mark tense. And even if they taught the computer that "known" is an inflection of "know", it's true that "What is FSK best known for?" is probably very hard for a computer to figure out. But that doesn't mean he should say things like "lacked a verb" when the question actually has one.
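To make the distinction concrete, here's a toy sketch in Python. It is emphatically not how Piquant or Watson worked (the book doesn't say, and I don't know); the word lists are tiny, hand-made, and purely hypothetical, and the function name is mine. All it shows is that a program can see two verb forms in the "known for" question, a copula and a participle of "know", even though neither is as helpful as "write".

```python
import re

# Hypothetical, hand-made word lists for illustration only; a real system
# would rely on a proper part-of-speech tagger and a full lexicon.
COPULA_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
AUXILIARIES = {"do", "does", "did"}
LEXICAL_VERBS = {
    "write": "write", "wrote": "write", "written": "write",
    "know": "know", "knew": "know", "known": "know",
}

def find_verbs(question):
    """Return (token, kind) pairs for every verb form found in the question."""
    tokens = re.findall(r"[a-z']+", question.lower())
    verbs = []
    for tok in tokens:
        if tok in COPULA_FORMS:
            verbs.append((tok, "copula"))
        elif tok in AUXILIARIES:
            verbs.append((tok, "auxiliary"))
        elif tok in LEXICAL_VERBS:
            verbs.append((tok, "form of " + LEXICAL_VERBS[tok]))
    return verbs

print(find_verbs("What is Francis Scott Key best known for?"))
print(find_verbs("What did Francis Scott Key write?"))
```

Both questions come back with verbs in them. The difference is in what kind: the second has a content-full "write" to go looking for, while the first has only the copula plus "known", which is the vagueness the author was presumably getting at.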

At 8:09 AM, February 18, 2011, Mark P had this to say...

I saw that one coming.

Something else I find interesting in a meta kind of way is the different approach that you and I took to reading that passage. You came from a language perspective, which I enjoy as an amateur, but I came from a technical perspective, which is closer to what I do. My chief complaint is the use of words like "disorient" and "learn." Computer programs (not computers as such) don't learn or know or become disoriented any more than a light switch and an electrical circuit do. Using those words is a shorthand that I use myself, but it is misleading for people who don't actually know what's going on. The coverage of the Watson/Jeopardy! story that I have seen completely ignores the man behind the curtain. The wonder of Watson is not Watson, but the people who programmed Watson.

At 11:01 AM, February 18, 2011, The Ridger, FCD had this to say...

I hate to say "Well, sure" but I kind of feel like that. OTOH, I see what you mean: Watson isn't a natural wonder. It's amazing, but it's built, which makes the builders amazing.


