The Search Results Page - Page 25
June 8, 2001
The search results page should have a sorted list of hits with
the best hits at the top. Some search engines list search scores
next to the hits, but because users don't understand how these
scores are computed they are essentially meaningless. As long as
the best hits are at the top, users can easily start scanning the
list from the top and will automatically see the most important
hits first without wasting time trying to interpret search
scores.
The search results list should eliminate duplicate occurrences of
the same page. In particular, it is quite common to see the
default page in a directory listed multiple times with slightly
varying URLs. On many servers, the following three URLs will
point to the same page:
- http://www.foo.com/bar
- http://www.foo.com/bar/
- http://www.foo.com/bar/index.html
Even though these URLs are distinct in principle (that is, they
could point to different pages under certain conditions), they
should be unified and listed only once in the search results
listing. It is very confusing for users if they click on
different links and get the same result.
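One way to unify such URLs is to reduce each hit to a canonical form before the results list is built. The sketch below is only an illustration under stated assumptions: the function names and the set of default file names are hypothetical, and a real server may use different defaults or serve different content for these spellings.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are what the server returns for a directory.
DEFAULT_PAGES = {"index.html", "index.htm", "default.htm"}

def canonical_url(url):
    """Reduce the common spellings of a directory's default page to one form."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    # Drop a trailing default file name such as index.html.
    path = head if last.lower() in DEFAULT_PAGES else parts.path
    # Drop a trailing slash, but keep "/" for the site root.
    path = path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))

def dedupe_hits(urls):
    """Keep the first (best-ranked) hit for each canonical URL."""
    seen = set()
    unique = []
    for url in urls:
        key = canonical_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique

hits = [
    "http://www.foo.com/bar",
    "http://www.foo.com/bar/",
    "http://www.foo.com/bar/index.html",
]
unique_hits = dedupe_hits(hits)  # only the first spelling survives
```

Because the input is assumed to be sorted best-first, keeping the first occurrence preserves the ranking while collapsing the duplicates.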
Search systems should also explicitly recognize quality in
addition to relevance when prioritizing search hits. For example,
if the site has a FAQ about the user's query term, then the FAQ
page should be listed at the top of the results page even if
other pages have higher relevance scores. After all, it is likely
that a FAQ is of higher quality with respect to answering the
user's questions. It would also be possible to build up a
database of quality ratings for each of the pages on the site
relative to each of the more popular search terms. For example,
every time users follow a link from a results list to a page,
they are asked how well that page satisfies their search, and the
ratings are then saved and used to prioritize the results list
for future searches.
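A ranking that combines relevance with quality might look like the following sketch. Everything specific here is an assumption made for illustration: the `Hit` record, the 1-to-5 rating scale, the neutral value of 3 for unrated pages, and the `rating_weight` parameter are not taken from any particular search engine.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Hit:
    url: str
    relevance: float       # assumed to be the engine's score, 0.0 to 1.0
    is_faq: bool = False   # assumed to be flagged when the page is indexed

def record_rating(ratings, query, url, stars):
    """Store one user rating (assumed 1-5) for a page relative to a query."""
    ratings.setdefault((query.lower(), url), []).append(stars)

def rank_hits(hits, ratings, query, rating_weight=0.5):
    """Sort hits so that FAQ pages come first, then by adjusted relevance."""
    def sort_key(hit):
        stars = ratings.get((query.lower(), hit.url))
        # Map the mean rating to -1.0..+1.0; unrated pages stay neutral.
        quality = (mean(stars) - 3) / 2 if stars else 0.0
        return (hit.is_faq, hit.relevance + rating_weight * quality)
    return sorted(hits, key=sort_key, reverse=True)

ratings = {}
record_rating(ratings, "bird", "/b", 5)
record_rating(ratings, "bird", "/a", 1)
hits = [Hit("/a", 0.9), Hit("/faq", 0.4, is_faq=True), Hit("/b", 0.8)]
ranked = [hit.url for hit in rank_hits(hits, ratings, "bird")]
```

In this example the FAQ page is listed first despite its low relevance score, and the well-rated page moves ahead of the poorly rated one.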
Traditionally, the chunking unit for web search has been the
page. In other words, the search output is a list of pages that
match the user's query. Unfortunately, most of these lists of
pages have no indication of the relation between the pages that
were found. It would be better to structure the search results
relative to the structure of the site. For example, if many pages
were found within a single subsite, then it might be better to
cluster all these hits into a single entry on the search results
page. Sometimes, it may even make sense to chunk the search by
larger units than the pages. For example, an advanced search of a
site with many distinct subsites could initially use the subsites
as the chunks and list those subsites that, taken as a whole,
were good matches for the user's query. The user could then
search these subsites further.
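Clustering hits by subsite can be sketched as follows. The sketch assumes that all hits come from one site, that a subsite corresponds to the first directory in the URL path, and that three or more hits justify a cluster; the names `subsite_of`, `cluster_hits`, and `min_cluster` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def subsite_of(url):
    """Assumption: the first directory in the path identifies the subsite."""
    directories = urlsplit(url).path.split("/")[1:-1]
    return "/" + directories[0] + "/" if directories else "/"

def cluster_hits(urls, min_cluster=3):
    """Collapse subsites with many hits into one entry; urls are best-first."""
    groups = defaultdict(list)
    for url in urls:
        groups[subsite_of(url)].append(url)

    entries = []
    emitted = set()
    for url in urls:
        site = subsite_of(url)
        members = groups[site]
        if site != "/" and len(members) >= min_cluster:
            # One entry per subsite, placed where its best hit was ranked.
            if site not in emitted:
                emitted.add(site)
                entries.append((site, members))
        else:
            entries.append((url, [url]))
    return entries

results = cluster_hits([
    "http://www.foo.com/pets/a.html",
    "http://www.foo.com/about.html",
    "http://www.foo.com/pets/b.html",
    "http://www.foo.com/dino/t.html",
    "http://www.foo.com/pets/c.html",
])
```

Each entry pairs a label with the pages it covers, so the results page can show a cluster as a single line that expands into its member pages.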
MaMaMedia integrates a structural overview into its search
results pages. When doing a search on the term “bird,” the user
retrieves pages both about birds as animals and about birds as
pets (as well as pages about dinosaurs). The difference in
emphasis between the different pages is made clear while the
child is still on the search results list, thus eliminating the
need to spend time going to some pages only to read about the
wrong topic.
Page Descriptions and Keywords
Some of the Internet-wide search engines show the author's
abstract of the page instead of trying to generate their own
summary text. In general, I favor this approach because humans
are still better than computers at deciding what a page is really
about and at writing readable text. The page abstract is
contained in a META tag with the name "description" in the page
header. The format for these abstracts is:
<META NAME="description" CONTENT="abstract text goes here">
Deciding the best length of a page abstract for a search-results
listing is a trade-off between providing a good prospective view
of the possible destinations and providing an overview of the
full set of alternatives; long abstracts are better at allowing
users to assess each individual page but make it more difficult
for users to compare destinations without excessive scrolling. In
almost all cases, some form of abstract is necessary because the
page titles alone are not sufficient to allow users to guess what
the pages are truly about.
Page abstracts should be kept short. Most search engines display
only the first 150 or 200 characters of the description text, so
it is best to stay below this limit when writing pages for the
open Internet. Even if you are using your own search engine, it
is still best to have very short abstracts because users are more
likely to scan the abstracts than to read them in full.
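If your own search engine displays author-written abstracts, it can enforce a length limit itself. The helper below is a minimal sketch: the name `shorten_abstract` is hypothetical, and the default limit of 150 characters is simply the lower of the two figures mentioned above.

```python
def shorten_abstract(text, limit=150):
    """Trim an abstract to at most `limit` characters at a word boundary."""
    text = " ".join(text.split())          # collapse runs of whitespace
    if len(text) <= limit:
        return text
    # Leave room for the ellipsis, then back up to the last complete word.
    cut = text[:limit - 3].rsplit(" ", 1)[0]
    return cut.rstrip(",;: ") + "..."

long_abstract = "This page describes every bus route in the city. " * 10
short = shorten_abstract(long_abstract)
```

Cutting at a word boundary avoids showing half a word in the results list, which would make the abstract harder to scan.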
In addition to descriptions, it is also common to add a list of
keywords in a META tag in the page header. Typically, the
keywords are not displayed in search results lists but instead
are used only to determine the relative ranking of the retrieved
pages: A page is assumed to be mainly about the terms included in
its keyword list.
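A site's own search engine could read both tags while indexing a page. The following sketch uses Python's standard `html.parser` module; the class name `MetaReader`, the helper `read_meta`, and the assumption that keywords are separated by commas are illustrative choices rather than a description of any particular engine.

```python
from html.parser import HTMLParser

class MetaReader(HTMLParser):
    """Collect the description and keywords META tags from a page."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names, so <META NAME=...>
        # and <meta name=...> are handled alike.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name == "description":
            self.description = content.strip()
        elif name == "keywords":
            # Assumption: keywords are separated by commas.
            self.keywords = [k.strip().lower()
                             for k in content.split(",") if k.strip()]

def read_meta(html):
    reader = MetaReader()
    reader.feed(html)
    return reader.description, reader.keywords

page = (
    '<html><head>'
    '<META NAME="description" CONTENT="Routes and fares for London buses.">'
    '<META NAME="keywords" CONTENT="bus, double-decker bus">'
    '</head><body></body></html>'
)
description, keywords = read_meta(page)
```

The description can then be shown in the results list, while the keywords feed only into the ranking.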
The keywords list should include both simple terms (e.g., "bus")
and compound terms (e.g., "double-decker bus") because users
search surprisingly frequently for multi-word terms. In pre-Web
search studies, we used to find that users were overwhelmingly
most likely to enter single-word queries. For example, in a study
of a traditional online documentation system, Meghan Ede and I
found that 81 percent of the users' queries consisted of a single
word. Maybe the overwhelming amount of information on the Web has
forced users to be more precise in their queries. Whatever the
reason, single-word queries are not quite as common as they used
to be. In 1997, I analyzed 2,261 queries from WebCrawler and
24,743 queries from www.sun.com. In both cases, I found
substantially more two-word queries as well as longer queries.
Distribution of the number of terms used in search queries in a
traditional, pre-Web system and in two web search engines.

|         | Pre-Web Search | WebCrawler | www.sun.com |
| 1 word  | 81%            | 43%        | 46%         |
| 2 words | 14%            | 35%        | 32%         |
| 3 words | 4%             | 13%        | 15%         |
| 4 words | 1%             | 6%         | 5%          |
| 5 words | 0%             | 3%         | 2%          |
Even though the profusion of material on the Web has encouraged
longer queries, it is still true that the vast majority of
queries are one or two words. Such ultra-short queries made up
78 percent of the searches in each of my two web samples. The lesson
for web designers is the need to use focused and highly
descriptive keywords in your META tags, because keyword searches
are the way most users will find you. Also, you need to add
keywords for all the main synonyms for your topic. In particular,
add alternative keywords for any terms used by your competitors
to refer to the kind of product you are selling. For example, a
page about hard disks should have the acronym DASD as a keyword
because many traditional IBM customers will be used to calling
disks DASD (direct-access storage devices).
It is unfortunate that people tend to use short searches because
search engines are better at finding relevant pages the more
information they have about the user's needs. Typically, the way
to provide more information about your needs involves specifying
additional search terms, including synonyms or alternate phrases.
Doing so is hard, and people are notoriously bad at thinking of
synonyms. Also, of course, natural laziness encourages users to
type as little as possible. Because of these problems with
traditional keyword search, search engines need to take on more
of the responsibility for allowing users to enhance their
searches.
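One modest way for a search engine to take on this responsibility is to expand the user's query with known synonyms. The sketch below assumes a hand-maintained list of synonym groups; the function name `expand_query` and the sample group are hypothetical, and a real engine would need a far larger vocabulary.

```python
import re

def expand_query(query, synonym_groups):
    """Return the query plus variants with each known term swapped for its synonyms."""
    normalized = " ".join(query.lower().split())
    variants = [normalized]
    for group in synonym_groups:
        for term in group:
            # Match the term only as a whole word or phrase.
            pattern = r"\b" + re.escape(term) + r"\b"
            if not re.search(pattern, normalized):
                continue
            for other in group:
                variant = re.sub(pattern, lambda match: other, normalized)
                if variant not in variants:
                    variants.append(variant)
    return variants

# Assumption: an illustrative synonym group maintained by the site.
groups = [("hard disk", "dasd", "disk drive")]
variants = expand_query("cheap Hard Disk", groups)
```

The engine can then search for all variants and merge the hits, so a user who types only one wording still finds pages that use another.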
Designing Web Usability