by tj on February 17, 2004
Technology Review has another must-read article that sheds light on the development of the new generation of search engines.
"Whichever technology hooks tomorrow's Web surfers, its builder will earn enormous influence---and handsome profits. Some 550 million search requests are entered every day worldwide (245 million of them in the United States). By 2007, the paid-placement advertising revenue generated by all these searches will reach about $7 billion...."
"Mining the Deep Web is the mission of another fresh face in the search business---Chicago-based Dipsie. "Google and Teoma only index about 1 percent of the documents out there," says Jason Wiener, Dipsie's founder and chief technology officer. Wiener, a self-taught programmer who ran a San Francisco Web development company until the dot-com crash, has spent the last two years building a more nimble crawler, one that can get past forms and database interfaces. Say you're wondering about the standard equipment on a Mercedes 55SL convertible. At Cars.com, drilling down to the page with detailed product information will take about six steps. Dipsie, however, will have indexed the entire Cars.com database in advance, so it can send you to the same page with a single click. "We don't handle anything that requires authentication with a username and password, but we do almost everything else," Wiener says. He claims that by the time Dipsie's search site becomes publicly available this summer, its index will include 10 billion documents---triple the current size of Google's index."
"Take Microsoft Research's AskMSR program, which Brill and his colleagues have been testing on Microsoft's internal network for more than a year. At its core is a simple search box where users can enter questions such as "Who killed Abraham Lincoln?" and, instead of getting back a list of sites that may have the information they seek, receive a plain answer: "John Wilkes Booth." The software relies not on any advanced artificial-intelligence algorithm but rather on two surprisingly simple tricks. First, it uses language rules learned from a large database of sample sentences to rewrite the search phrase so that it resembles possible answers: for example, "___ killed Abraham Lincoln" or "Abraham Lincoln was killed by ___." Those text strings are then used as the queries in a sequence of standard Keyword-based Web searches. If the searches produce an exact match, the program is done, and it presents that answer to the user."
Permalink: the future of searching
Tags:
technology
searching
entrepreneurship
future
2003
future+searching
venture+capital