Skipping the Learning Curve - Page 2
August 28, 2000
Generating PDF documents on the fly from within your Perl scripts does pose
several challenges. The first is acquiring familiarity with the
PDF specification. Several Perl modules are available which can create a
PDF binary stream, but each of them requires some
understanding of how PDF documents are structured. Martin Hosken's
Text::PDF
and Fabien Tassin's
PDF::Create
are two good examples -- both modules can be used flexibly by someone
with an understanding of PDF, but even past that learning curve they
have their limitations (such as PDF::Create's inability to include
images in the PDF document).
A more fully rounded solution can be found in
PDFlib, a commercial package for
generating PDF on the fly which includes a Perl interface. For commercial
use, though, this product comes with a price tag of at least $500,
and although sophisticated, the library still has a learning
curve beyond our current aspirations.
The simple fact is, we don't feel like learning the PDF specification. Sure,
it might be valuable, as would becoming an ASE certified auto mechanic for
the next time the car makes that funny noise -- but there's only so much
time to do what you need to do!
The Kindness of Strangers: HTMLDOC
Fortunately, the folks at
Easy Software Products
have already learned the PDF specification and written a tool that is nearly
perfect for our needs:
HTMLDOC, released freely under
the GPL (GNU General Public
License). In fact, HTMLDOC is a full-blooded HTML conversion tool with a
graphical interface that can convert web pages to either PostScript
or PDF format. Wisely, though, the Easy Software angels built in a
command-line interface with capabilities that let us integrate the
functionality of HTMLDOC into our Perl scripts, making for a nearly
seamless PDF-generating back-end.
You will need to obtain HTMLDOC to implement our simple solution for
generating PDF from Perl scripts. Fortunately, binaries of HTMLDOC are
available for a wide variety of platforms, including Linux, Solaris, and
Windows. For the most part, the installation instructions are simple and
straightforward, as explained on the Easy Software
documentation
page. However, we did find two points worth noting in our own
installation of the .tar.gz distribution of HTMLDOC on a RedHat 6.1 Linux
system:
- The htmldoc-1.version-platform.tar.gz archive does not
create its own subfolder, so you might want to create a subfolder for the
installation and move the archive there before decompressing it.
- When installing HTMLDOC on a remote host via a telnet connection, the
setup routine failed because it expects a graphical X
environment. However, we were able to install the binary by launching the
htmldoc.install script, also located among the installation files;
e.g.
prompt> cd htmldoc-install
prompt> ./htmldoc.install
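The first installation note above can be sketched as a small shell helper. This is only an illustration under stated assumptions: unpack_into is a hypothetical name of our own, the folder name htmldoc-install is borrowed from the example above, and the archive path is whatever file you actually downloaded. Rather than moving the archive, the sketch decompresses it directly into the new subfolder, which has the same effect.

```shell
#!/bin/sh
# Sketch: unpack a tarball that has no top-level folder into its own subfolder.
# unpack_into is a hypothetical helper name, not something shipped with HTMLDOC.
unpack_into() {
    archive=$1   # path to the .tar.gz file you downloaded
    dir=$2       # subfolder to create for the installation files
    mkdir -p "$dir" || return 1
    # gzip -dc writes the decompressed tar stream to standard output;
    # tar reads it from standard input ("-") inside the new subfolder.
    gzip -dc "$archive" | (cd "$dir" && tar xf -)
}

# Run only when an archive path is passed as the first script argument.
if [ -n "${1:-}" ]; then
    unpack_into "$1" htmldoc-install
fi
```

Saved as a script, this would be called with the archive path as its only argument; the installation files then land in htmldoc-install instead of being scattered through the current directory.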
You can test the HTMLDOC installation with a simple command-line call:
prompt> echo "test" | htmldoc --webpage -t pdf -
A successful test will yield several lines of garbled-looking output, which
itself is not important, except to prove that the program is running and
producing output.
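If you would rather not judge the garbled output by eye, the test can be made a little more concrete. PDF files begin with the signature %PDF-, so a rough check is to look at the first five bytes of what htmldoc writes. The sketch below assumes htmldoc is on your PATH and that your system's head accepts the -c option; looks_like_pdf is a hypothetical helper name of our own.

```shell
#!/bin/sh
# Sketch: check that a stream starts with the PDF signature "%PDF-".
# looks_like_pdf is a hypothetical helper name, not part of HTMLDOC.
looks_like_pdf() {
    # Read only the first five bytes of standard input and compare them.
    [ "$(head -c 5)" = "%PDF-" ]
}

# Run the check only when htmldoc is actually installed and on the PATH.
if command -v htmldoc >/dev/null 2>&1; then
    if echo "test" | htmldoc --webpage -t pdf - | looks_like_pdf; then
        echo "htmldoc produced PDF output"
    else
        echo "htmldoc ran, but the output does not look like PDF" >&2
    fi
fi
```

This only confirms that the output starts like a PDF file; it does not validate the rest of the document.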