Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions


suPerlative: The Site Map

The site map is generated by cmap.pl. Its input is a list of the paths of all the index.html files, created with the UNIX 'find' command. It uses a neat algorithm (i.e. I'm quite proud of it) to lay out a table showing the hierarchy of the site directories. Because it lists only index files, the map isn't fully comprehensive, but at 1500 pages for the site, a full site map would be very large and hard to use. Since we structure the site very thoroughly, usually with only a handful of files per directory, the map actually gives quite a good overview.
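For readers without a UNIX 'find' to hand, the input list can be approximated in a few lines. This is only a sketch in Python, not part of cmap.pl (which is a Perl script). The function name list_index_dirs, the site_root argument, and the choice to report each directory in site-relative form such as /Authoring/CGI/ are assumptions made for illustration.

```python
import os

def list_index_dirs(site_root, index_name="index.html"):
    """Return site-relative directory paths that contain an index file.

    Hypothetical helper: the article builds this list with 'find',
    and the exact format cmap.pl expects is an assumption here.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(site_root):
        if index_name in filenames:
            rel = os.path.relpath(dirpath, site_root)
            if rel == ".":
                # The site root itself (the home page).
                found.append("/")
            else:
                # Normalise to forward slashes with a trailing '/'.
                found.append("/" + rel.replace(os.sep, "/") + "/")
    return sorted(found)
```

Calling list_index_dirs on the document root gives one line per directory that has an index page, which is the kind of list the rest of this page works from.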

To see how it works, consider this fragment of a 'dir' listing:

/Authoring/
/Authoring/CGI/
/Authoring/CGI/Input/
/Authoring/CGI/Output/
/Authoring/CGI/Process/
/Authoring/DB/

In the map table, 'Authoring' should span at least 5 rows, and 'CGI' should span 3: Input, Output, and Process. The program reads each line and steps through the pathname components, e.g. Authoring, CGI, Input. As it goes, it builds a name by joining the components back together without the '/', e.g. AuthoringCGI. Each time it sees a name, it counts it by incrementing an associative array indexed on the name. So AuthoringCGI will receive a count of 4; subtract one to get the number of rows to span. This number is then used for the 'rowspan' attribute of the corresponding table cell.

Here is an outline of the algorithm:
  1. Open the input and output files. The input file is a list of paths to index.html files.
  2. Output an HTML header and start the table.
  3. Read in the list of index.html paths, line by line.
    1. Skip over directories we don't want on the map.
    2. Add the path to a list array.
  4. For each item in the list (after sorting):
    1. Split the pathname into components.
    2. If there are not more parts than in the previous entry, then the previous row is finished.
      • For each component in the previous entry:
        • Append to the name and add to its count.
    3. Save some data for each item:
      • depth in hierarchy;
      • whether at end of row;
      • item name; etc.
  5. For each item except the first two (the first is a spurious 'previous' entry added by the previous loop; the second is the top-level home page, which isn't really needed on the map):
    1. Open the corresponding index.html file
    2. Get the Title from the file.
    3. Print the table cell.
    4. Print the row separator if it's the last row item.
  6. Print the end of the table; an empty cell allows the last row to be completed validly.
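
The counting and table-building steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the cmap.pl source (which is Perl and is not shown here). The names build_rows and render_table are hypothetical, cells are labelled with the directory name rather than the page title read from index.html, and the filter for unwanted directories is left out. In this sketch a name is counted only when a row finishes, as in step 4, so the count is used as the rowspan directly.

```python
from collections import defaultdict
from html import escape

def build_rows(paths):
    """Group directory paths into table rows and count rows per joined name."""
    # Split each path into components; the bare home page '/' is dropped.
    entries = sorted(p.strip("/").split("/") for p in paths if p.strip("/"))
    counts = defaultdict(int)  # joined name (e.g. 'AuthoringCGI') -> rows spanned
    rows = []                  # each row is a list of (joined name, label) cells
    shown = set()              # joined names that already have a cell
    for i, parts in enumerate(entries):
        nxt = entries[i + 1] if i + 1 < len(entries) else []
        # The row is still open if the next entry extends this one.
        # (The outline compares the number of parts; that is equivalent
        # when every parent directory is itself in the list.)
        if nxt[:len(parts)] == parts and len(nxt) > len(parts):
            continue
        row, name = [], ""
        for part in parts:
            name += part       # components joined without '/', as in the article
            counts[name] += 1
            if name not in shown:
                shown.add(name)
                row.append((name, part))
        rows.append(row)
    return rows, counts

def render_table(paths):
    """Second pass: print one cell per name, using its count as the rowspan."""
    rows, counts = build_rows(paths)
    out = ['<table border="1">']
    for row in rows:
        cells = "".join(
            '<td rowspan="%d">%s</td>' % (counts[name], escape(label))
            for name, label in row
        )
        out.append("<tr>" + cells + "</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Fed the six-line fragment shown earlier, render_table produces four table rows, and the CGI cell gets rowspan="3" to cover Input, Output, and Process. Two passes are needed because a cell's rowspan is not known until every row beneath it has been counted.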
