[Webchange-list] Status Report

Vince M Hudd vince at softrock.co.uk
Tue Apr 22 19:08:16 BST 2008


Hi all,

With the Wakefield show looming I thought it about time I posted a
quick status report.

Essentially, the software is just about finished enough to be useable
- provided the required usage doesn't include any of the things I
haven't yet done. (Hoho). Off the top of my head (because I'm 5 miles
from the sources) these things are:

Firstly, incomplete or missing items that were in previous versions

1) The indexing facility

   I've held off doing this because in previous versions, it just
   'indexed' things as it found them. This time I want to sort
   them alphabetically before writing the output file. This won't
   prove difficult, but it will have to wait until a few weeks
   (probably) after the show.

   An additional change is that I might allow a file of 'index terms'
   to be specified, so that as well as indexing everything in
   certain tags, it will additionally search for the terms in the
   file and index those.

2) The filename changer

   There's no specific reason I haven't yet done this. It just hasn't
   been implemented yet, and will now have to wait until after the
   show.

3) The 'file updates' feature, which inserts the dates of files,
   doesn't yet handle directories (wherein it returns the date of the 
   most recent file). This is a fairly simple bit of programming, and
   I only missed it out to save time; I just wanted to get the basic
   date inclusion sorted.

4) The sprite splicer. This was always very limited and, frankly, not
   very good, and I'm not even sure it's worth the effort - did anyone
   ever actually use it? I can keep it on the to-do list if anyone
   really wants it (the menu entry to call it is there) - and if so,
   I'll try to produce something a little better than before; but 
   doing stuff with graphics isn't really my strong point. (WebChange
   - and most of my library code - really being geared towards text   
   manipulation).

5) The JPEG extractor. Similarly, the menu entry is there, though I've
   written nothing for it as yet. Like some of the above items, it
   should be fairly quick and easy to do once the show is out of the
   way.

   However, this might very well disappear to be replaced by something
   more generic; when I wrote the new search functions and put them
   in my standard library, I did so with more than just searching text
   in mind - so it might be possible to use this (in fact when the 
   jpeg extractor appears, it *will* be using this!) and, with a 
   little thought as to how it's implemented, create 'definitions' of 
   how particular types of file appear and can therefore be extracted.

   Possibly.

6) The 'search only' button, which returned a list of files containing
   the search term. With Seek'n'Link available, I'm not sure if it's
   worth including this.

7) The site statistics, which returned the numbers and sizes of
   various filetypes, aren't yet done. I want to make this more useable
   (and customisable) than before, so some thought needs to go into
   it.
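
To illustrate the directory handling mentioned in item 3 above, here
is a rough sketch. It is in Python purely for illustration (it is not
WebChange's own code), the function name newest_mtime is made up, and
looking inside subdirectories is an assumption - the description only
says that a directory returns the date of its most recent file.

```python
import os


def newest_mtime(path):
    """Return the modification time to report for a file or a directory."""
    if not os.path.isdir(path):
        # An ordinary file simply reports its own date.
        return os.path.getmtime(path)

    newest = None
    # Assumption: files in subdirectories count too.
    for root, _dirs, files in os.walk(path):
        for name in files:
            stamp = os.path.getmtime(os.path.join(root, name))
            if newest is None or stamp > newest:
                newest = stamp
    return newest  # None when the directory holds no files at all
```

What should be inserted for an empty directory is left open here; the
sketch just returns None and leaves that decision to the caller.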

Secondly, new features planned but not yet added

1) A new script language. It currently uses (more or less) the same
   script language as before, which I've now decided is ugly and needs
   replacing. However, even if I didn't change it completely, there 
   are things I can do to improve it. Most notably, the facility to
   call subroutines - or even other scripts - but that's by no means
   all.

1.5) Connected to this, the choices available from the front end might
     change.

2) A customisable toolbar. At the moment, this is just a blank space 
   at the top of the window, but what I'm going to do is fill that
   with a series of blank icons. If a script file is dragged onto
   one of these icons, that script will be associated with the icon
   and clicking that button will run that script.

3) Some new back-end processes: A newline compactor, a 'breadcrumb'
   inserter (requested about a million years ago), a file renamer
   which /also/ alters any links to the file, and a (local) link
   case changer. There might be others in my notes that I can't
   remember ATM.
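
As a rough idea of what the link-altering half of the file renamer in
item 3 might involve, here is a small Python sketch. Everything in it
is an assumption made for illustration: the function name
retarget_links, the restriction to double-quoted href/src attributes,
and the rule that the link must name the old file exactly (optionally
followed by a #fragment or ?query). The real process may well work
differently.

```python
import re


def retarget_links(markup, old_name, new_name):
    """Point href/src attributes that name old_name at new_name instead."""
    pattern = re.compile(
        # Group 1 keeps the attribute name, the equals sign and the quote.
        r'(\b(?:href|src)\s*=\s*")'
        + re.escape(old_name)
        # Only match when the link ends here, or a fragment/query follows.
        + r'(?=["#?])',
        re.IGNORECASE,
    )
    # A function is used for the replacement so that backslashes in
    # new_name are never treated as regex escapes.
    return pattern.sub(lambda m: m.group(1) + new_name, markup)
```

The lookahead is there so that renaming old.html leaves a link to
old.html.png alone.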

Those things not implemented to one side, the following is a general
overview of some of the changes that have been made. Note: most of the
changes are internal, the result of a complete rewrite in C resulting
in much tidier source code which will be easier to build on in future.

A good example of an internal change that has no effect on the user is
in the file/size/date insertion: these were previously implemented as
three separate binaries, each using a broadly similar core of source
code. All three are now implemented via the same core function, so
improving that for one improves it for all three. Related to this,
they were also implemented using their own mechanism for finding the
FileUpdate/FileSize/FileInclude tags, but they now do this by simply
making a call to my standard wildcard search function - the date code,
for example, calls the search code for:

 <!-- FileUpdate:* -->*<!-- /FileUpdate -->

And uses a standard function call to read the file reference (the
first wildcard) and the currently included data (the second wildcard).
Previously, because my search code was only implemented in my search and
replace binary, it couldn't do that and had its own code to find the
tags and extract the information.
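
The two-wildcard search described above can be sketched as follows.
This is a Python illustration only: it uses a regular expression as a
stand-in for the standard wildcard search function (which is not shown
in this report), and the names wildcard_to_regex and find_file_updates
are invented for the example.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a pattern where * means 'any run of characters' into a regex."""
    # Escape the literal pieces and join them with a shortest-match group,
    # so each * in the pattern becomes one numbered capture.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile("(.*?)".join(pieces), re.DOTALL)


FILE_UPDATE = wildcard_to_regex("<!-- FileUpdate:* -->*<!-- /FileUpdate -->")


def find_file_updates(text):
    """Return (file reference, currently included data) for each tag pair."""
    return [(m.group(1), m.group(2)) for m in FILE_UPDATE.finditer(text)]
```

The first capture is the file reference and the second is whatever is
currently sitting between the opening and closing tags, which matches
the two wildcards in the search string.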

Anyway, that's an aside - to the general overview of changes:

1) It's now 32bit. Okay, that one was obvious. :)

2) Previously there was a 'front-end' !RunImage file, and a 
   subdirectory of "Binaries" - the script interpreter, and one binary
   for each process. There is now a subdirectory called "RunImage" 
   containing just two binaries: !FrontEnd and !BackEnd. !FrontEnd is 
   the equivalent of the old !RunImage file.

   !BackEnd is the equivalent of all the old binaries, rolled
   into one. There are a number of benefits from doing this:

   Firstly, because each individual binary used my standard library,
   as well as some broadly similar code to one another, the overall 
   size of the binaries folder was getting quite big. Now the
   similar code is actually the /same/ code, called appropriately,
   and the library code is only there the once. The file size has
   dropped to about 10% of what it was - from around 500KB to around 
   50KB. (The front-end is also around 50KB IIRC)

   Secondly, by combining the back-end into a single entity, the
   script processing is a touch more reliable. Previously, if a script

   called a particular process and something went wrong /in/ that
   process, the relevant binary would report an error and quit; but 
   because they /were/ all separate, the calling program (ie the
   script interpreter) would carry on - so it might then go on to
   call another process. Because the previous one hadn't been able to
   complete, the results of doing this could be undesireable.

   Now that the whole of the back-end is all rolled into one, however,
   a particular process deciding that it can't carry on causes the
   script to stop. This is much more sensible.

3) In the search and replace, the wildcard support is now enhanced
   over what was there before. For details of what wildcard options
   are now supported, I'll refer you to the Seek'n'Link manual,
   since that uses the same search functions:

   http://www.softrock.co.uk/resources/manuals/seeknlink/index.html

   However, Seek'n'Link is search only - so the wildcard support in
   the replace string works as follows:

   ?n; to insert the nth single character wildcard match (ie those
       matched with ? in the search string).

   *n; to insert the nth multi character wildcard match (those
       matched with * and ~n; in the search string).

   In both cases, n can be anything from 0 to 255 IIRC.

   A vertical bar | followed by an asterisk * inserts an asterisk, and
   followed by a question mark inserts a question mark. Followed by
   a letter, it will insert a control character (a = 1, b = 2, etc).

4) Instead of dragging a script to the main window to run it, you
   drag a script to the 'script' icon, where it's held until you 
   click "Run" - this script is remembered from one session to the
   next until it's replaced. This is a pre-toolbar idea, so could
   perhaps be removed once that's implemented.   

5) The script language. As I said above, this is largely as it was
   before, but not exactly.

   Firstly, the way the settings are assigned values has changed.
   Previously you'd use "ChangeSetting setting.name to 'value'" (or
   the shorter "Set" instead of "ChangeSetting"). Now you just use
   setting.name='value'

   Secondly, a similar change has been made to the RunProcess/Run
   command: You now just put the process name on its own.

   Thirdly, there was a bug that meant the main script actually had
   to be two scripts. That bug hasn't been implemented this
   time. :)

   Fourthly, the number of recognised conditions has been expanded.
   IIRC previously there were null(), set() and unset() (and possibly
   yes() and no(), which were the same as set() and unset() - can't
   remember offhand). Now there are:

   null() - check if specified string is "" (contains nothing)

   yes() - check if the specified string contains "Yes" (any case)
   set() - as above

   no() - check if the specified string contains "No" (any case)
   unset() - as above

   exists() - checks if the file specified exists

   open() - checks if the file specified is open

   closed() - checks if the file specified is closed

   running() - checks if the app specified is running

   For all of these, prefixing the condition with a pling (!) means
   checking the reverse; so !running() checks that the specified
   app is NOT running.

   Any similarity between any of these conditions and any of those
   in WaitUntil is entirely intentional. ;)

   Fifthly, there's a parser that is used everywhere that a piece of
   text is 'read' - so between the brackets in the conditions, the
   right hand side of the 'equals' sign when changing a setting,
   and the text presented to the user in a prompt. Any combination
   of actual text enclosed in "s, and setting names can be used.

   For example, if you set (say) replace.changefrom_1 to "Y" and
   replace.context to "s" and then use the following in a condition:

   yes(replace.changefrom_1 "e" replace.context)

   the condition will return as true - the three pieces are joined to
   give "Yes", so the specified text contains the word "Yes".

   However, note what I said above about the script language being
   likely to change.

6) There's a new process - one to strip leading and/or trailing spaces
   from every line in a file. (A few others planned, but this one 
   done).

7) The 'Alt text' process now works in a subtly different way. 
   Previously, you chose from a menu whether to insert the word
   'Picture', the word 'Image' or the leafname, and an option to
   enclose that in square brackets. Now you specify a string in a
   writeable icon, using %f to represent the filename. So, if you
   used "Image" and brackets before, specify "[Image]", or if you
   had the leafname and brackets, now specify "[%f]"

8) Previously, WebChange only knew about the existence of a small 
   number of filetypes. There are now three file /categories/, and
   what fits in those categories is under user control.

   There are "markup filetypes" - such as HTML, XML, etc.

   There are "plain text filetypes" - eg TXT.

   And there are all other filetypes.

   The user can add filetypes to the first two definitions, and 
   everything that isn't in one of them is deemed to be in the third.
   This is done because some processes apply to all text based files,
   which includes both plain text and markup types, whereas some
   apply to markup only.

   I may expand this to have a "graphical filetypes" list as well - 
   and it might be using these definitions that the site stats (see
   the not yet done section) is best implemented.

9) For the processes that add extensions and set the filetypes
   according to the extensions, WebChange now uses Mimemap - this
   automagically means it now maps extensions for *all* the mappings
   your computer knows about.

   However, a problem with Mimemap is that it doesn't provide the
   facility to return a 3 letter extension if there is a longer one
   (eg ask it for the extension for &FAF and it will tell you it's
   /HTML - it won't return /HTM). Simply stripping the last letter
   isn't a solution - that would make /JPEG /JPE rather than /JPG.
   So, to get around this, there is a long<->short mappings file
   that the user can modify.

10) Because with many of the processes it's impossible to know in 
    advance how much memory would be needed for the altered files, the
    processes now build the altered files in a temporary file on disc.
    Slower, but a more robust approach.

11) And saving the most important change until the last: It has a nice
    shiny new logo/icon.

    (Which has been on the WebChange website for some time if you want
    to see it: http://www.webchange.co.uk )
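
Going back to item 3 in the overview, the replace-string wildcards can
be sketched like this. It is a Python illustration rather than the
actual C code, and several details are assumptions: the function name
expand_replacement, treating n as a zero-based index into the matches
(the report only says n runs from 0 to 255), accepting upper case
letters after the vertical bar, and leaving any other character after
the bar untouched.

```python
import re

# Either a vertical bar followed by one character, or ?n; / *n;
_TOKEN = re.compile(r"\|(.)|([?*])(\d+);", re.DOTALL)


def expand_replacement(template, single_matches, multi_matches):
    """Expand ?n;, *n; and |x sequences in a replace string."""

    def substitute(m):
        escaped, kind, index = m.group(1), m.group(2), m.group(3)
        if escaped is not None:
            if escaped in "*?":
                return escaped  # |* and |? insert the literal character
            if escaped.isascii() and escaped.isalpha():
                # |a gives character 1, |b gives character 2, and so on.
                return chr(ord(escaped.lower()) - ord("a") + 1)
            return m.group(0)  # assumption: anything else is left alone
        # ?n; draws on single character matches, *n; on multi character ones.
        matches = single_matches if kind == "?" else multi_matches
        return matches[int(index)]

    return _TOKEN.sub(substitute, template)
```

For example, expand_replacement("<b>*0;</b>?0;", ["x"], ["bold"])
gives "<b>bold</b>x". An index with no corresponding match raises
IndexError here; what WebChange itself does in that case isn't covered
in this report.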


Cheers,

-- 
Vince M Hudd
Soft Rock Software
http://www.softrock.co.uk


