HTMLEXP

Macro Expansion Utility

htmlexp.py is a little program I wrote to help me create web pages. Essentially, it's just a macro processor ... but with various libraries it can be extended to make creating all kinds of textual documents much easier and more consistent.

A note on why:

I started out preparing my web pages with M4 macros. Workable, but painful.
Next I found a nice program written by Will Duquette called expand.tcl. I'd still be using this program except for the fact that it needs user functions/macros to be written in TCL. Nothing wrong with TCL, but every (very occasional) time I use it I need to relearn its syntax.
I do just about everything in Python these days, so I decided to hack my own program.

NOTE: By default the macro delimiters [ ] are used. However, in this document they have been replaced by << and >> so that examples can be shown.

Installation

To install the program just put it somewhere in your path. I'd suggest you take off the .py extension to make calling a bit easier. For example, "cp htmlexp.py /usr/local/bin/htmlexp"

Usage

htmlexp.py reads a source file, expands the macros in it, and writes the result to a new file. By default, the new file has a .html extension. An existing file with the same name is silently overwritten unless you use the -w command line flag.

The program treats any text contained in the macro delimiters (a matched pair of [] by default) as "something to be expanded".

Command Line Options

The behavior of htmlexp can be modified from the command line with the following options:

-r file Read an additional rules file. The rules file must consist solely of Python code. It is read and evaluated in the same manner as the PFILE macro. Any functions defined in the file will be available to the source file being processed.

Note: If you want to avoid a lot of rules files, there are a number of workarounds:

Include the function definitions in the source/text file using the [ PYTHON ] macro.
Include a file with definitions using [ PFILE ].
Have a master rules file (loaded with -r) and in that file have one or more lines which look like:

execfile('/home/bob/src/bv/htmlexp/MACROS.py')

execfile() is a builtin Python function which executes a file. The file must contain only valid Python code. Note that a complete and expanded path is needed for this to work. If you want to use "~" notation, insert the line:

import os

at the top of the file and use a line like:

execfile( os.path.expanduser( '~/src/bv/htmlexp/MACROS.py' ) )
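
One thing to keep in mind: execfile() only exists in Python 2. If you ever run a rules file under Python 3 (this page doesn't say whether htmlexp.py supports that), a small helper along these lines does the same job. It's only a sketch; the name run_file is made up for the example and is not part of htmlexp.py.

```python
import os

def run_file(path, namespace):
    """Execute the Python file at `path` so its definitions land in `namespace`."""
    full = os.path.expanduser(path)        # allows "~" in the path
    with open(full) as handle:
        code = compile(handle.read(), full, "exec")
    exec(code, namespace)
```

In a master rules file you would call it as run_file('~/src/bv/htmlexp/MACROS.py', globals()) so that the functions end up alongside the ones defined in the rules file itself.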

-w By default an existing output file is overwritten. This flag will cause an error if the new file already exists.

-o file By default the new file will be the input filename with any extension (eg .ehtml) removed and the extension .html added. This option sets an absolute filename to write to.
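
The naming rule, together with the -w check described above, could be sketched roughly like this. It's an illustration of the described behavior only, not the code in htmlexp.py; the function name output_name and its parameters are invented for the example.

```python
import os

def output_name(source, explicit=None, no_overwrite=False):
    """Pick the output filename the way the -o and -w options are described."""
    # -o: an explicit name wins; otherwise swap any extension for .html
    target = explicit or os.path.splitext(source)[0] + ".html"
    # -w: refuse to overwrite a file that is already there
    if no_overwrite and os.path.exists(target):
        raise IOError("output file already exists: " + target)
    return target
```

So output_name("index.ehtml") gives "index.html", and a source file with no extension at all simply gets ".html" added.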

-b "start end" By default the macro start and end values are '[[' and ']]'. You can modify this in your source file with the [[ BRACKETS ... ]] macro or use the command line option. You must quote the values (eg: -b '"[[ ]]"' ) so they are interpreted as a pair.

-d Prints documentation for the defined macros and variables. If you pass a "-r file" argument the macros in that file will be listed. Each macro and variable which is in UPPERCASE letters will be displayed. To make further processing with htmlexp.py simple, each is displayed enclosed inside << and >> macro brackets using the macros SEC, FUNC and VAR. These macro definitions will need to be written if you wish to use the output for anything useful. See the bottom of this file for an example.

A Brief "How It Works"

After doing its initial setup the program reads your source file line by line. Anything enclosed in [[ ]] brackets (or whatever you've set the start/end delimiters to) is considered to be a macro. The first word in a macro must correspond to a previously defined Python function or variable. Nested macros are fully supported. The innermost macro is expanded first and the line or lines are then rescanned until all the macros are processed. For example, consider the following lines:

Hi. You can send me email at [[MAILTO [[MYEMAIL]] "Bob's MailBox"]]
or send a postcard to my PO Box.

The program will first expand the macro [[MYEMAIL]] then, with the new value inserted, it will process [[MAILTO bob@mellowood.ca "Bob's MailBox"]]. The result will be:

Hi. You can send me email at Bob's MailBox or send a postcard to my PO Box.
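
If it helps to picture the innermost-first rescanning, here is a small sketch of the idea. It is not the code in htmlexp.py: the function expand, the regular expression and the stand-in MAILTO below are all invented for the example, and arguments are split on plain whitespace rather than with the quote-aware splitter described later.

```python
import re

# A macro whose body contains no further delimiters, i.e. an innermost one.
INNERMOST = re.compile(r"\[\[((?:(?!\[\[|\]\]).)*)\]\]", re.DOTALL)

def expand(text, table):
    """Expand [[NAME args]] macros, innermost first, until none are left."""
    while True:
        match = INNERMOST.search(text)
        if match is None:
            return text
        words = match.group(1).split()
        name, args = words[0], words[1:]
        value = table[name]
        # Functions get the argument list; variables are used as-is.
        result = value(args) if callable(value) else value
        text = text[:match.start()] + result + text[match.end():]

# Stand-in definitions, only for this demonstration.
table = {
    "MYEMAIL": "bob@mellowood.ca",
    "MAILTO": lambda a: '<a href="mailto:%s">%s</a>' % (a[0], " ".join(a[1:]) or a[0]),
}
print(expand("Mail me at [[MAILTO [[MYEMAIL]] Bob]].", table))
```

The loop always picks a macro with nothing nested inside it, replaces it, and then starts scanning again from the top, which is all "rescanning" really means.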

Not only is it much easier to create HTML code using a macro or templating method, it also makes it much easier to maintain a site with consistent formats, email addresses, etc.

When scanning the source file, ANY lines starting with "## " will be skipped. These are considered to be comment lines. Note: this is two "#"s followed by a space.
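
As a tiny illustration of the rule (a sketch, assuming the "## " has to sit at the very start of the line; the helper name drop_comments is made up):

```python
def drop_comments(lines):
    """Return the lines that do not start with '## ' (two #s and a space)."""
    return [line for line in lines if not line.startswith("## ")]
```

Note that a line starting with "##" but no space, or with a single "#", is kept.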

To see what is happening, have a look at the source file for this page. It's in the download as "htmlexp.ehtml".

Macro Arguments

When a macro is invoked, everything between the start/end delimiters is passed to a specially written "splitting" function. This function scans the data and returns a list of items back to the caller. In every case the very first argument or word in the macro must be a previously defined Python function or variable.

When scanning a macro, items are grouped or split at every white space character (spaces, tabs, newlines, etc.). You can override this default grouping by placing parts of the argument in single quotes (''), double quotes ("") or curly braces ({}).

In the case of a simple variable it is simply expanded as a string. For example, the macro "[[ROOTDIR]]" would be expanded as "../..".

In the case of a function, the parsed data is passed to the function as a Python list. So,

[[SEC This is a Header]]

would be passed to a function with the name "SEC" with the argument list ["This", "is", "a", "Header"]. Note that:

[[SEC "This is a Header" ]]

would result in the list ["This is a Header"] being passed. In the case of the SEC macro the results will be nearly the same. The difference is that the internal spacing of the quoted string is preserved, whereas in the unquoted form duplicate spaces are reduced to single spaces and other whitespace characters (linefeed, tab, etc.) are converted to single space characters.
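
To make the grouping rules concrete, here is a rough sketch of a splitter that behaves the way this section describes. It is not the function used in htmlexp.py; split_args is an invented name, nested braces are not handled, and an apostrophe in an unquoted word would be taken as the start of a quoted group.

```python
def split_args(text):
    """Split macro text on whitespace, keeping quoted or braced runs whole."""
    closers = {"'": "'", '"': '"', "{": "}"}
    items = []
    word = ""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in closers:
            if word:                      # finish any word in progress
                items.append(word)
                word = ""
            end = text.find(closers[ch], i + 1)
            if end < 0:
                raise ValueError("no closing match for " + ch)
            items.append(text[i + 1:end])  # the group becomes one item
            i = end + 1
        elif ch.isspace():
            if word:
                items.append(word)
                word = ""
            i += 1
        else:
            word += ch
            i += 1
    if word:
        items.append(word)
    return items
```

The first item returned is the macro name; the caller would strip it off and hand the rest to the function, which is how SEC ends up with ["This", "is", "a", "Header"].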

Special Macros

Sometimes you don't want the line splitting code to work its magic on the stuff being passed in a macro. In this case prefix a single '#' to the first word (usually the name of the function being called). This is useful when defining functions and when you have long arguments you really don't want split into words. Example:

[[ #SEC This is a Header]]

will result in the string "This is a Header" being returned.

Macro calls inside a #FUNC will still be expanded.
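
In rough terms the '#' prefix could be handled like this. Again, just a sketch: parse_macro is a made-up name, a plain split() stands in for the real splitting function, and whether the program hands the unsplit text over as a bare string or wrapped in a one-item list is an assumption on my part.

```python
def parse_macro(body):
    """Return (name, argument) for the text found between the delimiters."""
    body = body.strip()
    if body.startswith("#"):
        # '#' prefix: keep everything after the name as one unsplit string.
        parts = body[1:].split(None, 1)
        raw = parts[1] if len(parts) > 1 else ""
        return parts[0], raw
    # Normal case: the name followed by a list of words.
    words = body.split()
    return words[0], words[1:]
```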

Quoting

Many languages support quoting: ie. prefacing a special character with a '\' or other marker. htmlexp doesn't. So, the rule to remember is simple: no quoting is done.

If you find that certain characters are being swallowed, especially in Python code, you might want to redefine the [[ ]] brackets you are using; define your Python functions in a separate file; or send me mail :).

Writing Functions

If you know a bit of Python you'll find that writing new functions for htmlexp.py is pretty simple. Just be sure to follow the recommended guidelines. Rather than a detailed howto, I suggest you examine the source to see how it's done. But, a few guidelines and "gotchas":

Each function should define exactly one argument. This argument is passed to the function by the parser as a Python list.
A function must return a string, even if there is no data. Ending a function with "return '' " is fine.
If you call a previously defined function from a new function, be sure to concatenate the arguments into a list. Even if there is only one argument, it must be a list. For example, to call the function MYH3 you would use " MYH3(['stuff to print']) ". The []s create a list with a single item. In most cases you can use a tuple just as well. This makes defining functions with the PYTHON macro a bit easier. So, instead of calling with [a,b,c] use (a,b,c).
Be charitable in how you accept arguments. For example, the MYH1 function really just needs a single argument (the stuff to set in an H1 line). One could be picky and reject a call which doesn't have exactly one string. However, it's much easier (and kinder) to pack all the data passed into one string. Again, look at the source file MACROS.py for examples.
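
Put together, a function following these guidelines might look something like the sketch below. NOTE is a hypothetical macro invented for this example, and the MYH3 shown here is only a stand-in so the example runs on its own; it is not the definition from MACROS.py.

```python
def MYH3(args):
    # Stand-in for the MYH3 macro, not the real definition.
    return "<h3>%s</h3>" % " ".join(args)

def NOTE(args):
    """Hypothetical macro: a small heading followed by a paragraph."""
    if not args:
        return ""                     # always return a string
    body = " ".join(args)             # be charitable: pack all the words together
    # Call the other function with a list, even for a single item.
    return MYH3(["Note"]) + "\n<p>%s</p>" % body
```

Because the arguments are only ever joined, NOTE works equally well when it is handed a tuple instead of a list.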

When defining macros inside a source file you may need to change the macro delimiters. The most common problem is the simple fact that Python uses [] to build lists. If you get errors when compiling code which looks okay, check whether your Python definitions have any nested []s in them and, if so, change the macro delimiters from [[]] to something else.

Bugs

Plenty. If you find one let me know. If you have a fix, that's even better.

Where to get it

This program is hosted on my personal web site: http://www.mellowood.ca

Defined Macros

The following macros are defined in the main program and in the default MACROS.py file. This data was automatically extracted with the command "htmlexp -d". You should look at the file "htmlpy.ehtml" and the output of the "htmlexp -d" command. Very powerful combination!

Functions

BRACKETS()

Redefine the brackets used to define a macro.

       args - <start bracket> <end bracket>

COMMENT()

Ignores its argument.

DEFINE()

 Define a variable. If you just pass a single arg then the
        variable will be defined with an EMPTY value.

DLBOX()

Create a box for a download.

       args - <title> <filename> <location name> <text>

DLLINE()

Download line for files. Pretty specific for bvdp.

       args - <filename>

ECHO()

 Print a variable(s). Variables not defined are silently ignored. 
        All variables are expanded and the result is concatenated.

GETSYS()

 Execute system command and return result. 

        args -  system command to execute (ie. "ls -l")

HRULE()

Return a new paragraph and hrule line.

       args - none permitted.

IMAGE()

Format an image link.

       args - <image source> [Alt-text] [stuff]

IMGLINK()

Format an image as a url link. 

       args - "url" "name of image" [options]
              options are joined and can have details like "border=2", etc.

IMGLINKMAIL()

Format an image as a mailto link.

       args - "email address" "name of image" [options]
              options are joined and can have details like width="50%", etc.

INCLUDE()

Include a source file or files. 

       args - <filename> [filename] ...

LINK()

Return a basic html link. 

       args -  "http address" [printable text].
               if no printable text passed the address is
               used as the printable.

MAILTO()

A mailto link. 

       args -  "email address" [printable text].
               if no printable text passed the address is
               used as the printable.
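
The "address is used as the printable" fallback shared by LINK and MAILTO could be written along these lines. These are not the definitions from MACROS.py, just a sketch; the names LINK_SKETCH and MAILTO_SKETCH and the exact HTML they produce are my own for the example.

```python
def LINK_SKETCH(args):
    """Sketch of a LINK-style macro: <address> [printable text]."""
    if not args:
        return ""
    address = args[0]
    # No printable text given? Then show the address itself.
    text = " ".join(args[1:]) or address
    return '<a href="%s">%s</a>' % (address, text)

def MAILTO_SKETCH(args):
    """Sketch of a MAILTO-style macro built on top of LINK_SKETCH."""
    if not args:
        return ""
    text = " ".join(args[1:]) or args[0]
    return LINK_SKETCH(["mailto:" + args[0], text])
```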

MYBOX()

Create a box with MYTBSTART and MYTBEND.

       args - <location> [text]

       Note: unlike MYTBSTART the location is mandatory.

MYH1()

A H1 Header line with increased font size (deprecated, use css).

       args - any number permitted.

MYH2()

A H2 Header line with increased font size (deprecated, use css).

       args - any number permitted.

MYH3()

A H3 Header line with increased font size (deprecated, use css).

       args - any number permitted.

MYTBEND()

 End a block. Just a consistent way to say "/div".

MYTBSTART()

 Start a block. Just a short cut for "div class=box".

        args - [location]  creates a local link point

PAGEHEADER()

Create an HTML page header. This will set up the various html top-of-file
       data and call a function to create a tabbed menu. This is specific to
       my web pages. You should write your own!

       args - single arg with the title of the page.

PFILE()

Import a macro file. Must contain Python functions and variables.

       args - <filename>

PYTHON()

Call Python.  Can be used to define functions on-the-fly. 
       
       Note: all the python code should be passed in a single arg,
             so don't forget to bracket or use '#PYTHON'. Example:

             [ PYTHON '
             def FUNC(p):
                return "whatever" ' ] 
       
       Note: it is important to respect Python's indentation rules!!!

       args - <python code>

RLINK()

Return an html link relative to the root directory. 

       NOTE: the current value of ROOTDIR is inserted into the address.

       args -  "http address" [printable text].
               if no printable text passed the address is
               used as the printable.

SETROOT()

Set the root directory. Value accessed with the [ROOTDIR] command.

       args - <dirname>

Variables

MYDOMAIN = "mellowood.ca"

MYEMAIL = "bob@mellowood.ca"

ROOTDIR = "."

VALMAIL = "val@mellowood.ca"

This page "htmlexp.html" was last modified on Sat Jul 12 10:39:04 2014