I've been experimenting with woopra.com, a web analytics tool, which requires a bit of JavaScript code to be included on each page to operate. This is easy enough on more dynamic sites with universal headers or footers, but not for totally static HTML pages.

I attempted to work around it using a combination of Apache rewrites and SSIs to "wrap" the static HTML with the required code. For example...

I made the following changes to my Apache config:

    RewriteEngine On
    RewriteCond %{REQUEST_URI} !=test.shtml
    RewriteCond %{IS_SUBREQ}  false 
    RewriteRule (.*)\.html test.shtml?$1.html

The test.shtml file contains...

    <script type="text/javascript">
       var XXXXid = 'xxxxxxx';
    </script>
    <script src="http://xxxx.woopra.com/xx/xxx.js"></script>

    <!--#set var="page" value="$QUERY_STRING" -->
    <!--#include virtual= $page -->

The idea was that a request coming in for a static .html page would be rewritten to test.shtml, with the original path passed as the query string. The shtml would then include the original file in the response page.

Unfortunately it doesn't quite work as planned :) Can anybody see what I'm doing wrong, or suggest an alternative approach? Are there any Apache modules that do the same thing? Ideally something that can be configured on a per-site basis.



I think mod_ext_filter is the module you're looking for. You can write a short Perl script, for example, to insert the JS code into the pages, and register it to process HTML pages:

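    while (<>) {
        s{</head>}{<script src="http://xxxx.woopra.com/xx/xxx.js"></script></head>}i;  # add the JS before </head>
        print $_;
    }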

You could also use something like sed to perform the substitution.
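A minimal sketch of the sed route, assuming the snippet belongs just before `</head>` and that the tag appears in lowercase on a single line. The function name `inject_snippet` and the `SNIPPET` variable are made up for illustration, and the script URL is just the placeholder from the question. It reads HTML on standard input and writes the modified page to standard output, which is the interface an external filter program is expected to have.

```shell
#!/bin/sh
# Placeholder snippet taken from the question; replace with the real tracking code.
# Note: it must not contain '|' or '&', which are special in the sed expression below.
SNIPPET='<script src="http://xxxx.woopra.com/xx/xxx.js"></script>'

# Read HTML on stdin and write it to stdout with SNIPPET inserted before </head>.
# Only the first </head> on each line is touched; pages without one pass through unchanged.
inject_snippet() {
    sed "s|</head>|${SNIPPET}</head>|"
}

# Example use from a shell:
#   inject_snippet < page.html > page.with-tracking.html
```

Because the function is a plain stdin-to-stdout filter, the same command can be tried by hand on a single file before wiring it into the web server.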

If the pages are static, why change them on the fly instead of preprocessing all the pages on the site, adding the bit of required JavaScript to each of them? This is simple and probably more efficient (you probably have more pageviews than pages to change).

This can be done in plenty of ways. I suggest a little Perl to do the inline replacement.

@Pablo Alsina

why change them on the fly instead of preprocessing all the pages on the site

There are numerous reasons why you might want to leave the original static files unchanged.

  1. They might belong to someone else, e.g. you are administratively altering files uploaded by another user.
  2. They might be auto-generated by another system that you don't want to, or cannot, change.
  3. You might want to be able to enable/disable/customize the extra data on the fly. You don't want to have to re-parse an entire site each time (it could be hundreds of thousands of pages).
  4. You might be doing it for the technical challenge :-)


OK, the biggest problem with the approach above is that it might break your HTML validity by putting a script tag outside the <html> tags.

I'd agree with the others on doing a pre-processing pass over your HTML files with something like a sed/awk script.

Here's a quick example, assuming the script part can be added before the </head> and that the </head> is at the start of a new line.


    cd /var/webserver/whatever/ || exit 1

    # List every .html/.htm file that contains a </head> tag
    grep -rl --include='*.html' --include='*.htm' '</head>' . > /var/tmp/tempfile.txt
    if [ -s /var/tmp/tempfile.txt ]; then
        while IFS= read -r file; do
            # Insert the script tags immediately before </head>
            sed 's|</head>|<script type="text/javascript"> var XXXXid = "xxxxxxx"; </script><script src="http://xxxx.woopra.com/xx/xxx.js"></script></head>|' "$file" > /var/tmp/tempfile.htm
            mv /var/tmp/tempfile.htm "$file"
        done < /var/tmp/tempfile.txt
    fi
    exit 0
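
One thing to watch with a batch rewrite like this is that running it a second time inserts the snippet a second time. The following is a rough sketch of a re-runnable variant, not a drop-in replacement: the function name `add_snippet_to_tree` and the `SNIPPET` variable are hypothetical, the URL is the placeholder from the question, and it assumes file names contain no newlines.

```shell
#!/bin/sh
# Placeholder snippet taken from the question; replace with the real tracking code.
SNIPPET='<script src="http://xxxx.woopra.com/xx/xxx.js"></script>'

# Insert SNIPPET before </head> in every .html/.htm file under the given directory.
# Files that already contain SNIPPET are skipped, so the function is safe to re-run.
add_snippet_to_tree() {
    root=$1
    find "$root" -type f \( -name '*.html' -o -name '*.htm' \) |
    while IFS= read -r file; do
        # Skip files that were already processed on an earlier run.
        if grep -qF -- "$SNIPPET" "$file"; then
            continue
        fi
        tmp=$(mktemp) || continue
        # Write to a temp file first, then copy the contents back so the
        # original file keeps its owner and permissions.
        sed "s|</head>|${SNIPPET}</head>|" "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    done
}

# Example use:
#   add_snippet_to_tree /var/webserver/whatever
```

Checking for the snippet itself keeps the sketch short. A dedicated HTML comment marker would work just as well and would survive later changes to the snippet text.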