Search Research Institute
Search Industry News 20020502-20020531
Search Engine 9238, 2002.6
5-31
A New Beta From Amazon.Com: Search and Browse Restaurant Menus
Similar to Google Catalogs, this beta provides users with the ability to search
(using optical character recognition) and browse a selection of restaurant menus
in 6 cities, with more to come. The current cities available are: Washington
D.C., Chicago, New York, Los Angeles, San Francisco, and Seattle. Another useful
and potentially "tasty" use of search technology. Bon Appetit!
http://www.amazon.com/exec/obidos/tg/browse/-/913908/104-4617759-5307134
5-31
The results of the 2002 Google Programming Contest have been announced: Daniel Egnor of New York won the $10,000 grand prize for "Geographic Search".
+ Daniel's project adds the ability to search for web pages within a particular
geographic locale to traditional keyword searching. To accomplish this, Daniel
converted street addresses found within a large corpus of documents to latitude-longitude-based
coordinates using the freely available TIGER and FIPS data sources, and built
a two-dimensional index of these coordinates. Daniel's system provides an interface
that allows the user to augment a keyword search with the ability to restrict
matches to within a certain radius of a specified address (useful for queries
that are difficult to answer using just keyword searching, such as "find me
all bookstores near my house"). We selected Daniel's project because it combined
an interesting and useful idea with a clean and robust implementation.
Honorable mentions:
+ Zhenlei Cai, for his project, Discovery and Grouping of Semantic Concepts
from Web Pages with Applications. This effort processed a corpus of documents
and found words and phrases that tend to co-occur within the same document,
producing a list of pairs of terms that seem to be closely related (such as
"federal law" and "supreme court", or "Bay Area" and "San Francisco").
+ Laird Breyer, for his project, Markovian Page Ranking Distributions: Some
Theory and Simulations. This project examined various properties of the Markovian
process behind Google's PageRank algorithm, and suggested some modifications
to take into account the "age" of each link to reduce PageRank's tendency to
bias against newly-created pages.
+ Thomas Phelps and Robert Wilensky, for their project, Robust Hyperlinks. Traditional
hyperlinks are very brittle, in that they are useless if the page later moves
to a different URL. This project improves upon traditional hyperlinks by creating
a signature of the target page, selecting a set of very rare words that uniquely
identify the page, and relying on a search engine query for those rare words
to find the page in the future. For example, the Google programming contest
can be found using this link.
+ Aaron Peapell, for his project, Genetic Search Algorithm. This project used
a genetic search algorithm to bias a PageRank-like algorithm in a query-specific
manner, by giving higher weight to links from pages containing all of the query
terms.
+ Dan Blandford and Guy Blelloch, for their project, Index Compression Through
Document Reordering. This project aims to reduce the space requirements of an
inverted index by clustering together documents that are similar before assigning
numerical identifiers to the documents (leading to locality in the document
identifier sequences within the inverted posting lists for the words in the
index, thereby making the sequences more compressible with various types of
encoding techniques).
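The compression idea in the last item can be illustrated with a small sketch. It is not Blandford and Blelloch's implementation: the variable-byte code is just one assumed choice among the "various types of encoding techniques" mentioned, and the two posting lists are hypothetical. It only shows why document IDs that sit close together lead to smaller gaps and therefore fewer bytes.

```python
def varbyte_encode(numbers):
    """Encode non-negative integers with a variable-byte code (7 data bits per byte)."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)  # low 7 bits, more bytes follow
            n >>= 7
        out.append(n | 0x80)      # high bit marks the last byte of this number
    return bytes(out)


def gaps(posting_list):
    """Turn a sorted list of document IDs into the first ID followed by differences."""
    if not posting_list:
        return []
    return [posting_list[0]] + [b - a for a, b in zip(posting_list, posting_list[1:])]


def encoded_size(posting_list):
    """Number of bytes needed to store the gap-encoded posting list."""
    return len(varbyte_encode(gaps(sorted(posting_list))))


# Hypothetical posting lists for one word under two different document numberings.
scattered = [3, 70_000, 150_000, 220_000, 400_000, 950_000]          # similar documents far apart
clustered = [500_000, 500_001, 500_002, 500_003, 500_004, 500_005]   # similar documents adjacent

print(encoded_size(scattered), encoded_size(clustered))  # prints: 16 8
```

Both lists hold six document IDs, but the clustered numbering produces gaps of 1 that each fit in a single byte.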
http://www.google.com/programming-contest/winner.html
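The radius-restricted keyword search described for the winning project can be sketched roughly as follows. This is an illustration, not Daniel Egnor's code: it assumes the street addresses have already been converted to latitude-longitude pairs, it uses a simple fixed-size grid as the two-dimensional index, and the class name, cell size, and sample documents are all hypothetical. It also ignores the poles and the 180-degree meridian.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0
CELL_DEG = 0.1  # grid cell size in degrees; an arbitrary choice for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cell_of(lat, lon):
    """Map a coordinate to its (row, column) cell in the two-dimensional grid."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


class GeoIndex:
    """Hypothetical grid index: cell -> list of (doc_id, lat, lon, words)."""

    def __init__(self):
        self.cells = defaultdict(list)

    def add(self, doc_id, lat, lon, text):
        words = set(text.lower().split())
        self.cells[cell_of(lat, lon)].append((doc_id, lat, lon, words))

    def search(self, keyword, lat, lon, radius_km):
        """Return IDs of documents containing keyword within radius_km, nearest first."""
        # Approximate bounding box around the query point, padded by one cell.
        dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
        dlon = dlat / max(math.cos(math.radians(lat)), 1e-6)
        row0, col0 = cell_of(lat - dlat, lon - dlon)
        row1, col1 = cell_of(lat + dlat, lon + dlon)
        hits = []
        for row in range(row0 - 1, row1 + 2):
            for col in range(col0 - 1, col1 + 2):
                for doc_id, doc_lat, doc_lon, words in self.cells.get((row, col), ()):
                    if keyword.lower() not in words:
                        continue
                    dist = haversine_km(lat, lon, doc_lat, doc_lon)
                    if dist <= radius_km:
                        hits.append((dist, doc_id))
        return [doc_id for _, doc_id in sorted(hits)]


index = GeoIndex()
index.add("seattle-books", 47.6097, -122.3331, "Used bookstore downtown")
index.add("seattle-cafe", 47.6100, -122.3400, "Coffee and pastries")
index.add("portland-books", 45.5152, -122.6784, "Independent bookstore")

print(index.search("bookstore", 47.6062, -122.3321, radius_km=5))  # prints: ['seattle-books']
```

The grid only narrows the candidates; the exact distance test decides which documents fall inside the radius.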
5-30
Overture has announced the launch of its French pay-per-click (PPC) service for the third quarter
of 2002. In a press release issued only hours after Espotting's announcement
of its Spanish operation, Overture is offering a 50% discount on all accounts opened
by July 22.
http://fantomaster.com/fanews0.html
5-21
New features in the Google Toolbar:
- Combined Search button: providing quick and easy access to all Google search
services from one, compact button.
- Browser control: suppresses pop-up windows that are triggered when users leave
a web site.
- Navigation: enabling users to quickly navigate between websites listed in
a Google search results page, using intuitive Next and Previous buttons.
5-21
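Sina has launched paid listings for its Hong Kong, Taiwan, and North America editions.
Hong Kong (search.sina.com.hk): 5,800 yuan per keyword per year
Taiwan (search.sina.com.tw): 5,800 yuan per keyword per year
North America (search.sina.com): 8,800 yuan per keyword per year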
5-13
Several recent information retrieval patents
===
"Automatic user interest profile generation from structured document access
information" #6,385,619
Assignee: IBM
Issue Date: 5/7/02
From the abstract, "A system generates user interest profiles by monitoring
and analyzing a user's access to a variety of hierarchical levels within a set
of structured documents, e.g., documents available at a web site. Each information
document has parts associated with it and the documents are classified into
categories using a known taxonomy. The user interest profiles are automatically
generated based on the type of content viewed by the user."
-
"Seamless integration of internet resources" #6,381,599
Assignee: America Online
Issue Date: 4/30/02
From the abstract, "A mechanism for seamlessly searching and accessing information
available through the Internet and other resources is disclosed. The present
invention maintains a database of file objects available from numerous sources.
The present invention updates the database periodically to ensure the accuracy
and completeness of it. The present invention also may access and retrieve data
from numerous sources when prompted by a single and simple command initiated
by the user."
-
"Display of media previews" #6,370,543
Assignee: Magnifi, Inc.
Issue Date: 4/9/02
From the abstract, "A method and apparatus for searching for multimedia files
in a distributed database and for displaying results of the search based on
the context and content of the multimedia files."
-
"Method of clustering electronic documents in response to a search query" #6,363,379
Assignee: AT&T
Issue Date: 3/26/02
From the abstract, "A method of presenting clusters of documents in response
to a search query where the documents within a cluster are determined to be
related to one another."
-
"Method and apparatus for retrieving documents based on information other than
document content" #6,360,215
Assignee: Inktomi Corporation
Issue Date: 3/19/02
From the abstract, "A method and apparatus are provided for retrieving documents
from a collection of documents based on information other than the contents
of a desired document. The collection of documents, which may be a hypertext
system or documents available via the World Wide Web, is indexed."
"If we can search for information more effectively, we will truly have changed the world."