suPerlative: The Virtual Library
Historically, the first script written was for the
WWW Virtual Library (or 'Vlib').
The WDVL was originally a small collection of links to pages on other
sites. As it grew it became clear that keeping all the information in
HTML text was going to become a maintenance-intensive chore, and that
some kind of 'database' would be a much more efficient way to go.
So, a flat file format was designed, and the UNIX ndbm routines
were used to create and maintain the file, and to read from it when
regenerating the HTML pages. After a while, the use of ndbm was dropped
because the disk space usage was way out of proportion to the actual
space needed.
Each page in the Vlib (except the main index) has a form for user
submissions to that category. Originally there was just one single form
and the user was given a menu to select the category, but we found that
many submissions were very poorly categorized. Forcing the user to at
least visit something in the Library seemed to help.
Note that this form also insists that you answer that you have read the
instructions. We noticed as the web became more popular that the number
of inappropriate entries increased enormously - in fact, most entries
had nothing to do with web development. Forcing the user to work a
little more helped because the abusers were only interested in quick and
easy promotion.
The form entries are appended by the CGI program wdvl.cgi
to a file (Vlib.txt), which is processed every week or so by the
'new' program.
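The submission step can be sketched roughly as follows. This is not the code of wdvl.cgi; it is a minimal illustration that assumes hypothetical form field names (title, url, description, email, read_instructions) and shows two ideas: refusing a submission unless the user confirms having read the instructions, and appending under a lock so that simultaneous submissions do not interleave.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical form field names; the real form may use different ones.
my @REQUIRED = qw(title url description email);

# Return a list of problems with a submission (empty list means acceptable).
sub check_submission {
    my (%form) = @_;
    my @problems;
    for my $field (@REQUIRED) {
        push @problems, "missing $field"
            unless defined $form{$field} && $form{$field} =~ /\S/;
    }
    # The form insists that the user confirms having read the instructions.
    push @problems, 'instructions not confirmed'
        unless ($form{read_instructions} || '') eq 'yes';
    return @problems;
}

# Append one line to the submissions file, holding an exclusive lock so that
# two simultaneous submissions cannot interleave their output.
sub append_line {
    my ($path, $line) = @_;
    open(my $fh, '>>', $path) or die "$path: $!";
    flock($fh, LOCK_EX)       or die "lock $path: $!";
    print {$fh} $line;
    close($fh)                or die "close $path: $!";
}
```

A CGI wrapper would call check_submission first and only call append_line when the returned list is empty; otherwise it would show the problems to the user.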
The format of Vlib.txt is
| Group | Category | Title | URL | Description | Email |
all on one record, items tab-separated.
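To make the record layout concrete, here is a small sketch that builds and parses one such record. The field order comes from the description above; the helper names, the sample values, and the choice to collapse embedded tabs and newlines into spaces are assumptions made for illustration.

```perl
use strict;
use warnings;

# Field order as given in the text.
my @FIELDS = qw(group category title url description email);

# Build one tab-separated record from a hash of field values.
# Tabs and newlines inside a value are collapsed to spaces (an assumption)
# so that a value can never break the record layout.
sub make_record {
    my (%entry) = @_;
    my @values = map {
        my $v = defined $entry{$_} ? $entry{$_} : '';
        $v =~ s/[\t\r\n]+/ /g;
        $v;
    } @FIELDS;
    return join("\t", @values) . "\n";
}

# Split one record back into a hash keyed by field name.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my %entry;
    # The explicit limit keeps trailing empty fields (e.g. no email).
    @entry{@FIELDS} = split /\t/, $line, scalar @FIELDS;
    return %entry;
}

# Hypothetical sample entry, for illustration only.
my $record = make_record(
    group       => 'Authoring',
    category    => 'HTML',
    title       => 'Example Guide',
    url         => 'http://www.example.com/',
    description => "A made-up\tdescription",
    email       => 'someone@example.com',
);
my %entry = parse_record($record);
print "$entry{title} -> $entry{url}\n";
```

Keeping every entry on one line is what lets ordinary line-oriented tools such as sort work on the file directly.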
The 'new' program generates one page listing the
latest submissions; we manually visit every one and reject typically
half for not being relevant or for inadequate quality.
We edit some submissions, e.g. assigning them to a more appropriate
category, removing hype, etc.
The update.sh program appends the
remaining entries to the master file, Vlib.txt.
These are sorted and the dh program reads the sorted
file and generates the HTML pages in their appropriate subdirectories,
corresponding to the major groups.
It also creates index.html files in those directories. The Vlib structure
does not need them directly (except as 'uplinks' in the page-bottom
navigation menu), but the site map program does.
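The page-generation step might look something like the sketch below. It is not the dh program itself: the file naming (Group/Category.html plus Group/index.html), the HTML layout, and the function names are assumptions, and group and category names are assumed to be usable as file names. The function returns the pages as a hash instead of writing files, which keeps the sketch easy to try out.

```perl
use strict;
use warnings;

# Minimal HTML escaping for text placed into the generated pages.
sub esc {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Turn tab-separated records into a hash of { relative path => HTML text }.
# Assumed layout: <Group>/<Category>.html plus one <Group>/index.html.
sub build_pages {
    my (@records) = @_;
    my %entries;    # group => category => list of <li> items
    for my $rec (sort @records) {
        chomp(my $line = $rec);
        my ($group, $cat, $title, $url, $desc) = split /\t/, $line;
        next unless defined $url;    # skip malformed records
        push @{ $entries{$group}{$cat} },
            sprintf('<li><a href="%s">%s</a>: %s</li>',
                    esc($url), esc($title), esc(defined $desc ? $desc : ''));
    }

    my %pages;
    for my $group (sort keys %entries) {
        my @index;
        for my $cat (sort keys %{ $entries{$group} }) {
            $pages{"$group/$cat.html"} =
                "<h1>" . esc($cat) . "</h1>\n<ul>\n"
                . join("\n", @{ $entries{$group}{$cat} }) . "\n</ul>\n";
            push @index, sprintf('<li><a href="%s.html">%s</a></li>',
                                 esc($cat), esc($cat));
        }
        $pages{"$group/index.html"} =
            "<h1>" . esc($group) . "</h1>\n<ul>\n"
            . join("\n", @index) . "\n</ul>\n";
    }
    return %pages;
}
```

A driver script would read the master file, call build_pages on its lines, and write each returned value to the corresponding path under the Vlib directory.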
There is an extremely simple link checker for the Vlib entries.
It scans the Vlib.txt file and calls the
LWP::Simple
library routine head to check if each URL is OK or not.
Those entries that are OK are copied to Vlib.OK, the rest to Vlib.NOK
for more detailed manual investigation or another pass of vlinks
to see if the problem was transitory.
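#!/opt/bin/perl
#
# Link checker for the Vlib entries: copies each record of Vlib.txt to
# Vlib.OK or Vlib.NOK depending on whether its URL answers a HEAD request.
use strict;
use warnings;
use LWP::Simple;

open(my $vl,  '<', 'Vlib.txt') or die "Vlib.txt: $!";
open(my $ok,  '>', 'Vlib.OK')  or die "Vlib.OK: $!";
open(my $nok, '>', 'Vlib.NOK') or die "Vlib.NOK: $!";

while (my $rec = <$vl>) {
    # Records are tab-separated; only the URL is needed here.
    my ($group, $cat, $title, $url, $desc, $email) = split /\t/, $rec;

    # head() returns a true value only if the request succeeded.
    if (defined $url && head($url)) { print {$ok}  $rec; }
    else                            { print {$nok} $rec; }
}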
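The "another pass" over the failed entries could be sketched as below. This is an illustration, not part of the original scripts: the function name recheck and the idea of passing the checker in as a code reference are assumptions. When no checker is supplied, the sketch falls back to the head routine of LWP::Simple, as in the script above.

```perl
use strict;
use warnings;

# Re-check previously failed records. $check is a code reference that takes
# a URL and returns true if it is reachable; when omitted, LWP::Simple's
# head() is loaded and used.
sub recheck {
    my ($records, $check) = @_;
    $check ||= do {
        require LWP::Simple;
        sub { scalar LWP::Simple::head($_[0]) };
    };
    my (@recovered, @still_bad);
    for my $rec (@$records) {
        my $url = (split /\t/, $rec)[3];    # fourth field is the URL
        if (defined $url && $check->($url)) { push @recovered, $rec; }
        else                                { push @still_bad, $rec; }
    }
    return (\@recovered, \@still_bad);
}
```

Reading the lines of Vlib.NOK into an array and calling recheck on it a day or so later would separate transitory failures (now recovered) from entries that still need manual investigation.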
There is also a form for updating current entries; these are handled
manually with no script support.