A debate that has hardly died, judging from the number and the fervor of the comments and responses generated by Matt Marshall’s first and second coverage of our approach to search. The posts and comments that popped up pretty much everywhere had a few common themes: “natural language has been tried before and failed”, “keyword search is enough for people’s needs”, and “it doesn’t matter anyway, because users won’t change their behavior”. I thought it was worth expressing my own view on these themes, and explaining why natural language is the inevitable destiny of search.
Can Natural Language really make a difference?
Powerset is a natural language search company. Yes, the road is paved with failed attempts at delivering on the long-sought grail of information retrieval. But whether or not others have succeeded at this task, it surprises me how many out there are really convinced that search is a solved problem. And even amongst those who realize that search is not as good as it could be, many are of the opinion that there’s nothing limiting in the expressiveness of keywordese.
I can summarize most of the arguments made in support of keyword search along three major axes:
* The vast majority of queries are only a few words long
* There is only so much semantic intent that one could extract from queries such as “Britney Spears”, “beach”, or “digital cameras”
* Whether or not one could build a better search engine is a moot point; delivering better results to queries formulated in natural language won’t work, because it would require users to change their behavior.
I’ll address these topics in order, starting with the last one.
Changing Users’ Behavior
I am probably a pretty good keyword searcher, and yet I have no good way to describe to somebody how I come up with successful sets of keywords that deliver me the information I seek; I have learned and refined my technique over time. And the fact of the matter is that all successful searchers have adapted to the limits of technology. We have trained ourselves – out of necessity – to translate our needs into keywords as successfully as we can. And yet, many recognize the limits of search today.
Human history is characterized by an interesting tension between innovation and adaptation. In the early days of human history the rate of innovation was much slower, and the rate at which we adapted to the environment and its constraints much higher. Over time, innovation caught up, and in this day and age we are much more likely to – say – devise a cure for a fatal pathogen than we are to adapt to its effect. Still, when no better solution is available (and especially when the adaptation is only behavioral), people are effective at devising efficient strategies to meet their needs, and would much prefer to do so than to give up on their goals altogether. As a consequence, many people have become masters in the art of keywordese searching. It is not so unlike developing a grunting pidgin language, as Barney puts it, to communicate with someone with whom you share no common language – you’d advance enough to say some things, sooner or later.
Yes, no one from Google sat users down and told them “two words only and no conjunctions”, but reinforcement training is much more effective than any instruction. People don’t bother giving search engines more (in terms of words, and context), because they’ve learned that in return they get less.
So how hard is it to untrain users? How hard is it to change a behavior? It depends. There are three major metrics that one could use to describe the ease of (or resistance to) users changing their behavior.
1. The cost of changing one’s behavior. For example, it seems clear that there is some actual cost in typing a number of extra words or characters. On the other hand, a fundamental dimension of language-based communication is the conversational aspect. This is probably a topic better left to a different post, but it’s reasonable to expect that an effective conversational interface may actually reduce the cognitive load associated with typing, as context is preserved from utterance to utterance.
2. The benefit associated with the change. One could reasonably argue that a significantly better search experience, with better and more precise results, more often, would constitute a benefit worth changing one’s behavior.
3. The pre and post energy state. This is probably what is fundamentally different about switching to searching in natural language: it’s not really a switch. How many people do you know who formulate thoughts in keywords? A change from a less intuitive practice, harder to understand and learn, to one that is more natural and easier to adopt is clearly a change with the flow, and not against it.
The central idea in bringing to consumers a natural language search experience that (actually) works is that the change is aligned with what’s natural to people. It is a change from an unnatural way of expressing intent (one that works as well as technology allows for today) to one that is more natural and easily converted into tangible, readable, typeable form.
But it’s not the end of the story. It’s also a change from an impoverished language, one that loses information and expressiveness in the conversion from intent to form (keywordese), to a highly expressive and powerful language – one which everyone is naturally prone to use.
The Query Universe: keywords, questions and all shades in between
I completely recognize that there are cases in which your two-word vanilla search will do just fine in expressing what you need. Maybe you don’t really know what you want. Maybe you type “Jane Austen” and you just want to be treated to a carousel of general interest documents that will teach you more about Jane Austen. And yet, what if what you wanted was to know about books that describe and review Jane Austen’s portrayal of the clergy? An encompassing search experience should satisfy users in both these cases, with as little effort as possible. Natural language search doesn’t mean impinging on what’s intuitive and natural to people by forcing some artificial constraint of semantic or syntactic well-formedness. It means using any and all linguistically relevant content that users do include in their queries, and rewarding them for doing so, thus encouraging experimentation and a return to a more natural way of phrasing intent than a bunch of keywords.
Note that I am purposefully talking about “natural language queries”, and not “questions”. Questions are just one type of natural language query. An optimized search engine that only answered questions, even if it did so really well, would address just a fraction of what people use search engines for. And many have made that mistake. AskJeeves opened the door to the market somewhat. Before they retired Jeeves, the folks at Ask figured out that users would like to come and ask questions on their website. The initial user response showed them to be quite right. The AskJeeves developers thought that an editorial approach, manually compiling the best answers to the big head of the questions that users asked, would be a winning strategy. It wasn’t. Although this was some time before the term “long tail” entered the vocabulary of search-savvy folks, it should be clear why. At the time, people didn’t realize that much of the value of search was not in the most common queries, but rather in the long tail of queries. People search for all sorts of things, and once they think they can use language for some things, they’ll want to do so across the board. As a matter of fact, others do realize the importance of questions today. Go to Google and ask “Who shot Lincoln?” and you get a nice “one-box answer” with Lincoln’s murderer. Google shies away from the editorial approach and mines a nice set of sources which can provide quick answers to questions. But in reality, both approaches are limited, hackish, and brittle. Ask Google “Who murdered Lincoln?” and the one-box disappears. Still, why doesn’t Google publicize this feature much? Probably because telling users “come and ask a question, and we’ll get you some answers, some of the time. But don’t use language in all of your other searches because it’ll get worse…” doesn’t seem like a very consistent marketing message.
The long tail of failed queries
Danny Sullivan cites Google Zeitgeist’s remarkable lack of long queries as evidence of what users will do with effective natural language search. Query logs are helpful, but the data can be misleading. The data so far about short queries and past failures of natural language attempts is no indication of what users will really do or not do, as users have never yet been presented with the possibilities of true natural language search. What we do know is that users are attracted to the idea of being able to search using language and that they do so occasionally. Why would Google et al. bother to include some language-like features in their one-box results otherwise?
Moreover, it might be that looking closely at the bottom of query logs (the long tail), you’d find many longer, language-rich queries, all of them more or less about “Britney Spears”, but different enough not to add up to a high total of identical strings. But what’s really interesting, continuing our speculation, is that each one of those long queries could very well be the first attempt of a user who was really interested in something very specific about Britney. The only catch is that the vast majority of those queries likely failed to return what users wanted – or returned nothing relevant at all – since keyword search engines often get confused by the additional “noise” that natural language introduces in their statistical models. So – who could blame them – sooner or later, some or many of those users probably just threw their hands up and searched again simply for “Britney Spears”. One can see how the inflated number of “short searches” at the top of the query logs might very well come from the total failure of keywordese search engines to return to users what they really wanted.
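To make the speculation above a little more concrete, here is a minimal Python sketch of how ranking a log by exact query string can hide a large population of long queries. Everything in it is an assumption made for illustration: the toy log is invented (it is not real query data), and the helper `is_long` with its four-word cutoff is an arbitrary choice, not a standard definition.

```python
from collections import Counter

# Hypothetical toy query log, invented for illustration only.
query_log = [
    "britney spears",
    "britney spears",
    "britney spears",
    "what was britney spears' first single",
    "where did britney spears grow up",
    "which albums did britney spears release in 2003",
    "who produced britney spears' latest album",
]


def is_long(query, min_words=4):
    """Treat a query as 'long' if it has at least min_words words (arbitrary cutoff)."""
    return len(query.split()) >= min_words


# Rank by exact string, the way a "top queries" list would.
exact_counts = Counter(query_log)
top_query, top_count = exact_counts.most_common(1)[0]

# Count long queries as a group, regardless of their exact wording.
long_queries = [q for q in query_log if is_long(q)]
long_total = len(long_queries)
distinct_long = len(set(long_queries))

print("Top exact-match query:", top_query, "seen", top_count, "times")
print("Long queries in total:", long_total, "of which distinct:", distinct_long)
```

In this toy log every long query occurs exactly once, so none of them would ever appear in a list of top queries, even though as a group they outnumber the single most frequent short query. A real log would of course need more careful normalization and a much larger sample before drawing any conclusion.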
Managing Users’ Expectations
And yet, the obstacles are significant. What’s challenging is not that users are satisfied with the status quo, or that they won’t change their behavior. Rather, the problem stems from the fact that people know language very well. The real risk is that bringing real language understanding capabilities to search might generate unsatisfiable expectations. Search is like air, and just as we need oxygen, we have an insatiable need for information. When this need is combined with the ease of use of natural language, the stakes become much higher. At Powerset, we realize this. We know that in order to be successful, we must always effectively communicate to users the power and the limits of our technology, controlling their expectations, and striving every day to amaze them with something they previously thought to be impossible.
The future of search
Finally, a teaser for a future post. A lot about innovating is also about staying ahead of the curve: as technology progresses and speech technologies mature further than they have to date, are we really going to be performing searches by uttering disconnected keywords? At that point, function words and word relationships will necessarily be omnipresent, and one would be foolish to ignore their importance. Language and natural language will likely become pervasive once speech technologies mature and we get used to accessing information in mobile and otherwise encumbered environments (e.g. cars).