- Compaq's Web Language and system are designed for rapid prototyping
of Web computations and are well suited to automating tasks on the
WWW.
- Compaq's Web Language's emphasis is on high flexibility and
high-level abstractions rather than raw computation speed. It is thus
better suited as a rapid prototyping tool than a high-volume production
tool.
- Compaq's Web Language is implemented as a stand-alone application
that fetches and processes web pages according to programmed scripts.
- Compaq's Web Language is a high-level, imperative, interpreted,
dynamically typed, multi-threaded expression language.
- Compaq's Web Language's standard data types include boolean,
character, integer (64-bit), double precision floats, Unicode strings,
lists, sets, associative arrays (objects), functions, and methods.
- Compaq's Web Language has prototype-like objects.
- Compaq's Web Language supports fast immutable sets and lists.
- Compaq's Web Language has special data types for processing HTML/XML
that include pages, pieces (for markup elements), piece sets, and tags.
- Compaq's Web Language uses conventional control structures like
if-then-else, while-do, repeat-until, try-catch, etc.
- Compaq's Web Language has a clean, easy-to-read syntax with C-like
expressions and Modula-like control structures.
- Compaq's Web Language supports exception handling mechanisms (based
on Cardelli & Davies' service combinators) like sequential
combination, parallel execution, timeout, and retry. Compaq's Web
Language can emulate arbitrarily complex page fetching behaviors by
combining services.
- Compaq's Web Language speaks whatever protocols Java supports, for
example HTTP and FTP.
- Compaq's Web Language can easily fill in web-based forms and
navigate between pages.
- Compaq's Web Language has HTTP cookie support.
- Programmers can define HTTP request headers and inspect response
headers.
- Programmers can explicitly override mimetypes and DTDs used when
parsing Web pages.
- Proxy support.
- Support for HTTP basic authentication (both client and proxy
authentication).
- Compaq's Web Language 'understands' HTML, XML and plain text
mime-types.
- Compaq's Web Language uses a DTD-based HTML parser for extensibility
(HTML 2.0, 3.2, and 4.0 DTDs included).
- Compaq's Web Language has relatively robust page parsing that
attempts to make a faithful representation of Web pages.
- Compaq's Web Language supports a markup algebra for extracting
elements and text from pages, and functions for manipulating the content
of a page. Extraction functions include extracting all elements of a
specific name, all occurrences of PERL5 regular expressions, and all
occurrences of simple element patterns.
- Elements and patterns are mapped onto piece objects in Compaq's Web
Language, which allow direct access to markup attributes.
- The markup algebra makes it easy to express complicated access
patterns (for example, "extract all the images in the third row of the
table that contains the word 'abc'").
- Compaq's Web Language can handle overlapping elements internally.
(Page manipulation is not based on an internal tree-like representation
of markup.)
- Page manipulation functions include modifying attributes, deleting
elements/tags, copying elements/text, and replacing elements/text.
- Compaq's Web Language allows programmers to look at both the markup
structure of a page and the raw text (without any tags).
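
The markup algebra bullets above can be illustrated with a small Python sketch. It is not Compaq's Web Language code and does not use its parser: the names `Piece`, `elem`, `inside`, `contain_text`, and `text` are hypothetical, and a regular expression stands in for the DTD-based parser. The point it illustrates is that pieces are character ranges over the page rather than nodes of a tree, so set-style operators can be combined to answer the example query about images in the third row of the table containing 'abc'.

```python
import re
from dataclasses import dataclass

# Simplified tag scanner (assumption: well-formed tags, double-quoted attributes).
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>")
ATTR = re.compile(r'([a-zA-Z-]+)\s*=\s*"([^"]*)"')
VOID = {"img", "br", "hr"}  # elements that have no closing tag


@dataclass(frozen=True)
class Piece:
    name: str
    begin: int    # offset of the first character of the opening tag
    end: int      # offset just past the closing tag
    attrs: tuple  # (name, value) pairs taken from the opening tag


def elem(page, name):
    """Return every element called `name` as a list of pieces in page order."""
    pieces, open_tags = [], []
    for m in TAG.finditer(page):
        closing, tag, rest = m.group(1), m.group(2).lower(), m.group(3)
        if tag != name:
            continue
        attrs = tuple(ATTR.findall(rest))
        if tag in VOID:
            pieces.append(Piece(tag, m.start(), m.end(), attrs))
        elif not closing:
            open_tags.append((m.start(), attrs))
        elif open_tags:
            begin, open_attrs = open_tags.pop()
            pieces.append(Piece(tag, begin, m.end(), open_attrs))
    return sorted(pieces, key=lambda p: p.begin)


def text(page, piece):
    """Raw text of a piece with all tags removed."""
    return TAG.sub("", page[piece.begin:piece.end])


def inside(xs, ys):
    """Pieces of xs whose range lies within the range of some piece of ys."""
    return [x for x in xs
            if any(y.begin <= x.begin and x.end <= y.end for y in ys)]


def contain_text(xs, page, word):
    """Pieces of xs whose tag-free text includes `word`."""
    return [x for x in xs if word in text(page, x)]


PAGE = (
    '<table><tr><td>xyz</td></tr></table>'
    '<table>'
    '<tr><td>abc</td></tr>'
    '<tr><td><img src="two.png"></td></tr>'
    '<tr><td><img src="three-a.png"><img src="three-b.png"></td></tr>'
    '</table>'
)

tables = contain_text(elem(PAGE, "table"), PAGE, "abc")
rows = inside(elem(PAGE, "tr"), tables)
third_row = rows[2:3]
images = inside(elem(PAGE, "img"), third_row)
sources = [dict(image.attrs)["src"] for image in images]
print(sources)
```

Because each element name is matched independently and compared only by offsets, overlapping elements need no special handling in this sketch, which mirrors the non-tree representation described above.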
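
The service combinators mentioned in the list above (sequential combination, parallel execution, timeout, and retry) can be sketched in Python as higher-order functions. This is an illustration of the idea only, not Compaq's Web Language syntax or implementation: the names `sequential`, `parallel`, `timeout`, `retry`, and `ServiceError` are hypothetical, the retry is bounded by an attempt count as a simplification, and the "fetch" functions are stand-ins that never touch the network.

```python
import concurrent.futures as cf
import time


class ServiceError(Exception):
    """Raised by a service (or a combinator) that could not produce a result."""


def sequential(first, second):
    """Run `first`; if it fails, fall back to `second`."""
    def run():
        try:
            return first()
        except Exception:
            return second()
    return run


def parallel(*services):
    """Start all services at once and return the first successful result."""
    def run():
        pool = cf.ThreadPoolExecutor(max_workers=len(services))
        try:
            futures = [pool.submit(service) for service in services]
            for done in cf.as_completed(futures):
                if done.exception() is None:
                    return done.result()
            raise ServiceError("all services failed")
        finally:
            pool.shutdown(wait=False)  # do not block on the slower services
    return run


def timeout(seconds, service):
    """Fail if `service` has not produced a result within `seconds`."""
    def run():
        pool = cf.ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(service).result(timeout=seconds)
        except cf.TimeoutError:
            raise ServiceError(f"no result within {seconds}s") from None
        finally:
            pool.shutdown(wait=False)
    return run


def retry(attempts, service):
    """Call `service` up to `attempts` times, re-raising the last failure."""
    def run():
        for remaining in range(attempts - 1, -1, -1):
            try:
                return service()
            except Exception:
                if remaining == 0:
                    raise
    return run


# Stand-in services (hypothetical; no network access involved).
calls = {"flaky": 0}


def flaky_fetch():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ServiceError("mirror A unavailable")
    return "page from mirror A"


def slow_fetch():
    time.sleep(0.3)
    return "page from mirror B"


def broken_fetch():
    raise ServiceError("mirror C is down")


retried = retry(3, flaky_fetch)()
fallback = sequential(broken_fetch, slow_fetch)()
timed_out = sequential(timeout(0.05, slow_fetch), lambda: "cached page")()
fastest = parallel(broken_fetch, slow_fetch)()
print(retried, fallback, timed_out, fastest, sep="\n")
```

Since every combinator returns another zero-argument service, the results nest freely, for example `retry(3, timeout(1.0, parallel(a, b)))`, which is how more complex fetching behavior is built up from simple parts.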
Standard modules supplied with Compaq's Web Language include:
- File manipulation for writing or downloading pages to disk.
- Displaying pages in your web browser, checking which pages are being
viewed in Netscape, and instructing Netscape to navigate to a specific
URL (Windows only).
- Multi-processing with workers, jobs, and job queues.
- General string manipulation including PERL5 regular expression
searches.
- Routines to split and glue together URLs.
- An easily customizable multi-threaded web crawler.
- A multi-threaded web server that allows the direct execution of
Compaq's Web Language functions with full access to HTTP state.
- Java servlet support.
- Examples to access information from public services like AltaVista,
Yahoo!, etc.
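
As a rough illustration of the worker, job, and job-queue model named in the module list above, here is a minimal Python sketch. It does not reproduce the API of the Compaq's Web Language modules: `run_jobs` and its parameters are hypothetical names, and the jobs only split URLs with Python's standard `urllib.parse.urlsplit` instead of fetching pages.

```python
import queue
import threading
from urllib.parse import urlsplit


def run_jobs(jobs, worker_count=4):
    """Run callables from a shared queue on worker threads.

    Results are returned in the order the jobs were submitted; a job that
    raises contributes its exception object instead of a value.
    """
    job_queue = queue.Queue()
    results = [None] * len(jobs)

    def worker():
        while True:
            item = job_queue.get()
            if item is None:  # sentinel: no more work for this worker
                job_queue.task_done()
                return
            index, job = item
            try:
                results[index] = job()
            except Exception as exc:  # record the failure, keep the worker alive
                results[index] = exc
            finally:
                job_queue.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for item in enumerate(jobs):
        job_queue.put(item)
    for _ in workers:
        job_queue.put(None)  # one sentinel per worker
    for thread in workers:
        thread.join()
    return results


urls = [
    "http://example.com/a",
    "http://example.com/bb",
    "http://example.com/ccc",
]
# Each job splits one URL and returns its path component.
jobs = [lambda url=url: urlsplit(url).path for url in urls]
paths = run_jobs(jobs, worker_count=2)
print(paths)
```

A crawler built on this pattern would have each job fetch a page and enqueue further jobs for the links it finds; that part is left out here to keep the sketch free of network access.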
Java Support and Integration
- Compaq's Web Language is written almost entirely in Java. (The
Browser access module needs access to a few Windows API calls; Compaq's
Web Language is completely portable on UNIX platforms.)
- It is possible (though not recommended) to code directly against
Compaq's Web Language's API (thus not writing Compaq's Web Language
scripts but still using its functionality).
- It is very easy to add bridges from Compaq's Web Language to Java
code. Java objects can be called directly from Compaq's Web Language
code without extending the Compaq's Web Language system (see module Java).
© 1998-1999 Compaq Computer Corporation