Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions


The Search Results Page - Page 25

June 8, 2001

The search results page should have a sorted list of hits with the best hits at the top. Some search engines list search scores next to the hits, but because users don't understand how these scores are computed, the scores are essentially meaningless. As long as the best hits are at the top, users can start scanning the list from the top and will automatically see the most important hits first without wasting time trying to interpret search scores.

The search results list should eliminate duplicate occurrences of the same page. In particular, it is quite common to see the default page in a directory listed multiple times with slightly varying URLs. On many servers, the following three URLs will point to the same page:

  • http://www.foo.com/bar
  • http://www.foo.com/bar/
  • http://www.foo.com/bar/index.html

Even though these URLs are distinct in principle (that is, they could point to different pages under certain conditions), they should be unified and listed only once in the search results listing. It is very confusing for users if they click on different links and get the same result.
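The unification described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete solution: the function names are made up for this example, and it assumes that `index.html` and `index.htm` are the server's default pages and that a final path segment without a dot names a directory. Neither assumption holds on every server.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are served as the default page of a directory.
DEFAULT_PAGES = {"index.html", "index.htm"}


def canonical_url(url: str) -> str:
    """Map the common spellings of a directory's default page to one form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in DEFAULT_PAGES:
        path = head + "/"  # /bar/index.html -> /bar/
    elif "." not in last and not path.endswith("/"):
        path = path + "/"  # /bar -> /bar/ (assumes extensionless paths are directories)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def unique_hits(urls):
    """Keep the first (best-ranked) hit for each canonical URL."""
    seen = set()
    result = []
    for url in urls:
        key = canonical_url(url)
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result
```

Applied to the three example URLs, `unique_hits` keeps only the first one, because all three share the canonical form `http://www.foo.com/bar/`.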

Search systems should also explicitly recognize quality in addition to relevance when prioritizing search hits. For example, if the site has a FAQ about the user's query term, then the FAQ page should be listed at the top of the results page even if other pages have higher relevance scores. After all, it is likely that a FAQ is of higher quality with respect to answering the user's questions. It would also be possible to build up a database of quality ratings for each of the pages on the site relative to each of the more popular search terms. For example, every time users follow a link from a results list to a page, they could be asked how well that page satisfies their search, and the ratings could then be saved and used to prioritize the results list for future searches.
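One possible way to combine quality with relevance is sketched below. Everything here is an assumption made for illustration: the in-memory `ratings` store, the 1-to-5 rating scale, the neutral default of 3.0 for unrated pages, and the `is_faq` flag on each hit. A real system would persist the ratings and might weight the signals differently.

```python
from collections import defaultdict

# Hypothetical in-memory store: (query term, url) -> list of user ratings (1-5).
ratings = defaultdict(list)


def record_rating(term, url, rating):
    """Save how well a page satisfied a search for `term`."""
    ratings[(term, url)].append(rating)


def average_rating(term, url, default=3.0):
    """Mean rating for this term/page pair; unrated pages get a neutral default."""
    scores = ratings.get((term, url), [])
    return sum(scores) / len(scores) if scores else default


def rank_hits(term, hits):
    """Order hits: FAQ pages first, then by user rating, then by relevance.

    `hits` is a list of dicts with the keys "url", "relevance", and "is_faq".
    """
    return sorted(
        hits,
        key=lambda h: (h["is_faq"], average_rating(term, h["url"]), h["relevance"]),
        reverse=True,
    )
```

With this ordering, a FAQ page is listed first even when its relevance score is lower, and among the remaining pages a well-rated page outranks an unrated page with higher relevance.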

Traditionally, the chunking unit for web search has been the page. In other words, the search output is a list of pages that match the user's query. Unfortunately, most of these lists of pages have no indication of the relation between the pages that were found. It would be better to structure the search results relative to the structure of the site. For example, if many pages were found within a single subsite, then it might be better to cluster all these hits into a single entry on the search results page. Sometimes, it may even make sense to chunk the search by larger units than the pages. For example, an advanced search of a site with many distinct subsites could initially use the subsites as the chunks and list those subsites that, taken as a whole, were good matches for the user's query. The user could then search these subsites further.
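Clustering hits by subsite could look roughly like the following sketch. It assumes that the first path segment of a URL identifies the subsite and that three or more hits justify collapsing a subsite into one entry. Both are illustrative choices, and many sites define their subsites differently.

```python
from urllib.parse import urlsplit


def subsite_of(url: str) -> str:
    """Assumption: the first path segment names the subsite, e.g. /products/..."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # A URL with only one segment (or none) belongs to the top level of the site.
    return segments[0] if len(segments) > 1 else ""


def cluster_hits(urls, min_cluster=3):
    """Collapse subsites with many hits into one entry; keep other hits as-is.

    Returns a list of ("cluster", subsite, [urls]) and ("page", url) tuples,
    in the order each first appears in the ranked input.
    """
    groups = {}
    for url in urls:
        groups.setdefault(subsite_of(url), []).append(url)

    results, emitted = [], set()
    for url in urls:
        site = subsite_of(url)
        if site and len(groups[site]) >= min_cluster:
            if site not in emitted:
                emitted.add(site)
                results.append(("cluster", site, groups[site]))
        else:
            results.append(("page", url))
    return results
```

Because the clusters are emitted at the position of their best-ranked member, the overall ordering of the results page still reflects the original ranking.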

MaMaMedia integrates a structural overview into its search results pages. A search on the term “bird” retrieves pages both about birds as animals and about birds as pets (as well as pages about dinosaurs). The difference in emphasis between the different pages is made clear while the child is still on the search results list, thus eliminating the need to spend time going to some pages only to read about the wrong topic.

Page Descriptions and Keywords

Some of the Internet-wide search engines show the author's abstract of the page instead of trying to generate their own summary text. In general, I favor this approach because humans are still better than computers at deciding what a page is really about and at writing readable text. The page abstract is contained in a META tag with the name "description" in the page header. The format for these abstracts is as follows, with the abstract text placed in the CONTENT attribute:

  <META NAME="description" CONTENT="A sentence or two describing the page.">

Deciding the best length of a page abstract for a search-results listing is a trade-off between providing a good prospective view of the possible destinations and providing an overview of the full set of alternatives; long abstracts are better at allowing users to assess each individual page but make it more difficult for users to compare destinations without excessive scrolling. In almost all cases, some form of abstract is necessary because the page titles alone are not sufficient to allow users to guess what the pages are truly about.

Page abstracts should be kept short. Most search engines display only the first 150 or 200 characters of the description text, so it is best to stay below this limit when writing pages for the open Internet. Even if you are using your own search engine, it is still best to have very short abstracts because users are more likely to scan the abstracts than to read them in full.
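A site's own search engine can apply the same kind of limit when it displays abstracts. The sketch below uses Python's standard `textwrap.shorten`, which collapses whitespace and cuts at a word boundary. The function name and the default limit of 150 characters are assumptions taken from the lower figure mentioned above.

```python
import textwrap


def display_abstract(description: str, limit: int = 150) -> str:
    """Trim a page description to at most `limit` characters for a results list."""
    # shorten() returns the text unchanged if it fits; otherwise it drops whole
    # words from the end and appends the placeholder.
    return textwrap.shorten(description, width=limit, placeholder="...")
```

Descriptions that already fit are returned as written (apart from collapsed whitespace), so authors who stay below the limit keep full control over what users see.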

In addition to descriptions, it is also common to add a list of keywords in a META tag in the page header. Typically, the keywords are not displayed in search results lists but instead are used only to determine the relative ranking of the retrieved pages: A page is assumed to be mainly about the terms included in its keyword list.
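As a rough illustration of how an indexer might read these two META tags, the following sketch uses Python's standard `html.parser` module. The class and function names are invented for this example, and it assumes the keywords are separated by commas, which is the common convention.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the content of the description and keywords META tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> works.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.meta[name] = attributes.get("content") or ""


def page_metadata(html: str):
    """Return (description, list of keywords) for an HTML page."""
    parser = MetaExtractor()
    parser.feed(html)
    raw_keywords = parser.meta.get("keywords", "")
    keywords = [k.strip().lower() for k in raw_keywords.split(",") if k.strip()]
    return parser.meta.get("description", ""), keywords
```

A search engine could show the returned description in the results list and keep the keyword list out of sight for ranking purposes only.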

The keywords list should include both simple terms (e.g., "bus") and compound terms (e.g., "double-decker bus") because users search surprisingly frequently for multi-word terms. In pre-Web search studies, we used to find that users were overwhelmingly most likely to enter single-word queries. For example, in a study of a traditional online documentation system, Meghan Ede and I found that 81 percent of the users' queries consisted of a single word. Maybe the overwhelming amount of information on the Web has forced users to be more precise in their queries. Whatever the reason, single-word queries are not quite as common as they used to be. In 1997, I analyzed 2,261 queries from WebCrawler and 24,743 queries from www.sun.com. In both cases, I found substantially more two-word queries as well as longer queries.

Distribution of the number of terms used in search queries in a traditional, pre-Web system and in two web search engines.

Query length   Pre-Web Search   WebCrawler   www.sun.com
1 word              81%            43%           46%
2 words             14%            35%           32%
3 words              4%            13%           15%
4 words              1%             6%            5%
5 words              0%             3%            2%

Even though the profusion of material on the Web has encouraged longer queries, it is still true that the vast majority of queries are one or two words. Such ultra-short queries made up more than three quarters of the searches in my sample. The lesson for web designers is the need to use focused and highly descriptive keywords in your META tags, because keyword searches are the way most users will find you. Also, you need to add keywords for all the main synonyms for your topic. In particular, add alternative keywords for any terms used by your competitors to refer to the kind of product you are selling. For example, a page about hard disks should have the acronym DASD as a keyword because many traditional IBM customers will be used to calling disks DASD (direct-access storage devices).
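Site owners can compute the same kind of distribution from their own search logs. The sketch below is only an illustration: it assumes the log is available as a list of query strings and that words are separated by whitespace, and the function names are made up for this example.

```python
from collections import Counter


def term_count_distribution(queries):
    """Return {number of words: percentage of queries}, in whole percents."""
    counts = Counter(len(q.split()) for q in queries if q.strip())
    total = sum(counts.values())
    return {n: round(100 * c / total) for n, c in sorted(counts.items())}


def short_query_share(queries, max_words=2):
    """Percentage of non-empty queries that have at most `max_words` words."""
    lengths = [len(q.split()) for q in queries if q.strip()]
    if not lengths:
        return 0
    return round(100 * sum(1 for n in lengths if n <= max_words) / len(lengths))
```

Tracking the share of one- and two-word queries over time shows whether a site's users follow the pattern described here.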

It is unfortunate that people tend to use short searches because search engines are better at finding relevant pages the more information they have about the user's needs. Typically, the way to provide more information about your needs involves specifying additional search terms, including synonyms or alternate phrases. Doing so is hard, and people are notoriously bad at thinking of synonyms. Also, of course, natural laziness encourages users to type as little as possible. Because of these problems with traditional keyword search, search engines need to take on more of the responsibility for allowing users to enhance their searches.
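One simple way for a search engine to take on part of this work is to add known synonyms to the user's query automatically. The sketch below assumes a small hand-maintained synonym table. The table contents and the function name are hypothetical, and a real site would build its vocabulary from its own content and its users' terminology.

```python
# Hypothetical synonym table; a real site would maintain its own vocabulary.
SYNONYMS = {
    "hard disk": ["dasd", "hard drive"],
    "dasd": ["hard disk"],
}


def expand_query(query: str):
    """Return the user's query plus any known synonyms, original query first."""
    normalized = " ".join(query.lower().split())
    terms = [normalized]
    for alternative in SYNONYMS.get(normalized, []):
        if alternative not in terms:
            terms.append(alternative)
    return terms
```

The search would then be run for every term in the returned list, so that a user who types only "DASD" still finds pages that say "hard disk".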

Designing Web Usability