Saturday, November 04, 2006

Double line spacing in Dissertations

It is standard practice to require dissertations to be double-spaced. In the age of word processors, one wonders why this practice persists. It must surely be a legacy from the days when dissertations were typewritten and students were allowed to make corrections in situ rather than reprint; word processors have eliminated that problem. Double-spacing might also suit drafts which need room for annotation during editing, but final copies of dissertations are not used in this way, and modern techniques such as commenting do that job.

(Primo Levi's 'Chromium' in 'The Periodic Table' has several great examples of the way solutions to old problems hang around even when the problem no longer exists.)

I have seen readability given as another reason for double-spacing, but it is not easy to track down the evidence for this. Kruk and Muter (1984) report that "single spacing produced reading that was 10.9% slower than that produced by double spacing" on a 'video' screen. Weller (2004) reports on a previous study: "It was discovered that single spacing of text requires more eye fixations per line and therefore fewer words are read per fixation, which increases reading time (Kohler, Duchnicky & Ferguson, 1981)." But Mills and Weldon (1987) report that this same study showed only a 2% slowing of reading rate, hardly convincing evidence.

Other writers consider this convention outmoded, e.g. the author of the TeX manual: link

Princeton University now accepts single-spaced printed copies, although ProQuest (an online dissertation repository) still requires double-spacing for the electronic copy (and hence for all printed copies made from it). It is claimed that this improves on-line readability.

A PhD student in the US has calculated that 20,000 reams of paper would be saved if ProQuest accepted single-spaced copies.

Double-spacing is often mandated in legal practice but even here the convention is challenged.

Monday, October 23, 2006

Declarative Processes

I am getting back to the declarative process work I began a couple of years ago. Peter Marks and Ben Moseley's session at SPA2006 got me started again.

I now have a partial implementation of a DVD rental application written in a declarative process style using eXist and XQuery. The essence of the design is to retain all events and compute the current state by the evaluation of predicates over the event trace. The rules are hard-coded in eXist functions in this demo. Rules are mostly XPath expressions. For example
  • To find the current address for a person $p given SetAddress events
    • events($p)[SetAddress][last()]//address
  • To find the address at $date
    • events($p)[SetAddress][@date < $date][last()]//address
  • To check if a member is suspended given Suspend and Reinstate events
    • exists( events($p)[Suspend or Reinstate][last()][Suspend])
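In outline, the hard-coded functions look something like the sketch below (the collection path, the @member/@date attributes and the event wrapper structure are illustrative, not the actual demo code):

  xquery version "1.0";

  (: all events for a member $p in date order - path and attributes are illustrative :)
  declare function local:events($p as xs:string) as element(event)* {
    for $e in collection("/db/dvd/events")/event[@member = $p]
    order by xs:date($e/@date)
    return $e
  };

  (: current address: taken from the most recent SetAddress event :)
  declare function local:current-address($p as xs:string) as element(address)* {
    local:events($p)[SetAddress][last()]//address
  };

  (: address as it was at a given date :)
  declare function local:address-at($p as xs:string, $date as xs:date) as element(address)* {
    local:events($p)[SetAddress][xs:date(@date) le $date][last()]//address
  };

  (: suspended if the most recent Suspend/Reinstate event is a Suspend :)
  declare function local:is-suspended($p as xs:string) as xs:boolean {
    exists(local:events($p)[Suspend or Reinstate][last()][Suspend])
  };

  (: example call - the member id is made up :)
  local:is-suspended("m0042")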
Some of the advantages of this approach are clear:
  • the state of the system can be evaluated at any time in the past by filtering the event stream for events before the specified date
  • if the rules are held as data, the state can be computed synoptically (i.e. according to rules applicable at the time) - but how should they be encoded - as executable XQuery expressions?
  • changes in rules are supported, allowing re-interpretation of the past
  • subjective rules are supported although only within the event language created
  • 'what-if' analysis is supported, allowing questions to be asked about the impact of changes in the rules
  • rules are robust to insertion of other events
  • the event history is often required in order to understand what has happened and why, so there is no additional storage requirement
Some disadvantages are also clear:
  • The computational cost is higher than simply fetching a stored value
  • Writing declarative rules is not easy and will need a higher-level language
  • The approach requires a global model of the system, so it is applicable to centralised systems but not to distributed ones
  • It's not yet clear what range of behaviour is expressible

Much more work is needed to see how well this model will fit with other processes. Some applications include:

  • lightweight processes - e.g. academic processes such as setting and marking coursework
  • descriptive models of real processes and their subsequent analysis for process improvement
Prior work

The Alloy language (Daniel Jackson, MIT)


Peter Marks and Ben Moseley - Functional Relational Programming, SPA 2007, FRP site

Work on rule-based languages has had a rough ride - several projects in this area (e.g. BRML) are defunct. There is some activity in R2ML and RuleML.

Friday, August 04, 2006

America

Time to forget about University and the project. Tomorrow we're off to the USA for adventure and catching up with friends and relatives.

Here is a link to a GoogleEarth overlay containing the main places we are visiting:

Trip on Google Earth

Photos from the Minnesota trip are now up on Flickr: http://www.flickr.com/photos/paddlers9/

Sunday, June 04, 2006

eXist examples

I now have somewhere to host my XML database examples.

http://www.cems.uwe.ac.uk/chriswallace/index.xql

This supersedes the previous hosts for my teaching applications on a friend's server and on the exist-db demo server.

There are a few there, but further development will have to wait till the end of term. A few typos I've just spotted :-(

One problem I faced was how to allow a visitor to view the XQuery files themselves. I initially saw this as an eXist problem needing an eXist solution, and indeed Adam, one of the eXist developers, put in a feature to support this. However, I slowly realised that if I wanted to put up a number of examples (there are about 9 in various stages of development), I should develop a schema for an application index page and individual XML configuration files, instead of writing unique pages. I can then use XSLT to display the page. If I then wrote an XQuery to fetch a query and display it, I could use the application configuration XML to authorise access to the scripts.
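A minimal sketch of that fetch-and-display query, using the configuration file to authorise access (the configuration vocabulary, the collection paths and the choice of eXist util functions are assumptions for illustration, not the real schema):

  xquery version "1.0";
  declare namespace request = "http://exist-db.org/xquery/request";
  declare namespace util = "http://exist-db.org/xquery/util";

  (: hypothetical per-application configuration:
     <application name="dvd">
       <script name="index.xql" public="yes"/>
     </application> :)
  let $app    := request:get-parameter("app", ())
  let $script := request:get-parameter("script", ())
  let $config := doc(concat("/db/examples/", $app, "/config.xml"))/application
  return
    if ($config/script[@name = $script][@public = "yes"])
    then <pre>{ util:binary-to-string(util:binary-doc(concat("/db/examples/", $app, "/", $script))) }</pre>
    else <p>Sorry, the source of {$script} is not available</p>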

This experience prompted or reminded me of a couple of computing adages:

  • The only numbers of importance in computing are 1, 2 and many - with its meta-counting variant: 1, 2, Schema
  • Use of the configuration file both to generate the web page and to authorise access to the script is an example of the Shanley principle, in which one component serves multiple purposes [see Michael Jackson, Software Requirements and Specifications, Addison-Wesley, 1995, pp. 29-30].

In future, I want to enhance this configuration file to be a description of the application architecture and information flow, so I can also generate, via GraphViz, a clickable SVG image of the application.

That's the problem with eXist and XQuery: this technology greatly reduces the time needed to create an application, but it increases the range of ideas for applications which can be easily implemented even more!

Monday, March 13, 2006

RSS

Discovered BBC Backstage this morning whilst researching for my lecture on web services and REST. As an experiment, I thought I'd add a page of RSS-derived links to the FOLD. I shouldn't still be surprised by this, but it only took a few lines to add a primitive page.
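The few lines were roughly of this shape (the feed URL and page structure here are illustrative rather than the actual FOLD code):

  xquery version "1.0";

  (: fetch an RSS 2.0 feed and render its items as a simple list of links :)
  let $channel := doc("http://news.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml")/rss/channel
  return
    <html>
      <head><title>{ $channel/title/text() }</title></head>
      <body>
        <h1>{ $channel/title/text() }</h1>
        <ul>{
          for $item in $channel/item
          return <li><a href="{ $item/link }">{ $item/title/text() }</a></li>
        }</ul>
      </body>
    </html>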

I've put the demo up on the eXist demo site. There is also a version which generates Voice+XML to make a real newsreader. It needs Opera 8 with the speech extension, and there is only a choice of two American voices, but it handles the task very well.

Monday, March 06, 2006

Cellular automata and Processing

I've just re-discovered my Processing application which implements a generalised Conway Game of Life. The interface allows you to set any rule - for each count of live neighbours, whether a dead cell comes to life or a live cell dies. You can also run Conway's rule or generate a random rule.

Here it is

and here is my Processing page

What it doesn't have is any way to register interesting rules the viewer may discover. I discovered a nice one today: l2d=134&d2l=157, which shows constantly varying, interesting behaviour - perhaps for a screen saver. Also l2d=125&d2l=3568&id=0.1, which is a slow converger to nearly all white with a few blinkers.

So two jobs to do: one to allow finds to be registered and commented on, the other to take the automaton specification from the URL in a form like that above. Both seem very Web 2.0 - the first enabling participation, the second a unique URL for every resource.

Apart from being an exercise in writing Processing, I use it in a lecture on Processes and Emergence.

'Processing' is a poor brand name (try searching for it!) [like eXist in that regard] but a wonderful little language for animations. Developed by Ben Fry and Casey Reas, two design students from MIT, it is open-source software with a really nice little IDE, Java-based (it generates a Java applet), with some great examples of both animations and information display. A much better introduction to programming than full-on Java, I think, and a candidate for a first language, partly because it places emphasis on algorithms and change, not structure and stasis. It's also much simpler than Flash ActionScript, as well as non-proprietary.

Here are some of my favourite Processing examples

Zipcodes by Ben Fry - locates a US Zipcode
Flight Patterns by Aaron Koblin - visualising the flights of planes in the US on a single day
Dreamlines by Leonardo Solaas - a Flickr/Processing mash-up

Saturday, February 11, 2006

Family Relationships

The family album site really needs generalised deduction. The deductions about dates and ages are hand-coded:

deduce the date of the photo:

photo.deduceddate =
  (photo.date,
   { some subject s | s.birthday and photo.s.age | s.birthday + photo.s.age }
  )[1]

deduce the age of the subjects:

all s: subject |
s.deducedage = (photo.s.age, photo.deduceddate - s.birthday)[1]
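In XQuery the two rules come out roughly as the sketch below (the photo/subject/person vocabulary is a guess at the album format, ages are treated crudely as whole years, and the second rule follows my reading: the recorded age if present, otherwise deduced date minus birthday):

  declare function local:deduced-date($photo as element(photo), $people as element(person)*) as xs:date? {
    (
      xs:date($photo/@date),
      (: otherwise: some subject with a recorded age in the photo and a known birthday :)
      for $s in $photo/subject[@age]
      let $birthday := $people[@name = $s/@name]/@birthday
      where exists($birthday)
      return xs:date($birthday) + xs:yearMonthDuration(concat("P", $s/@age, "Y"))
    )[1]
  };

  declare function local:deduced-age($photo as element(photo), $s as element(subject),
                                     $people as element(person)*) as xs:integer? {
    (
      $s/@age cast as xs:integer?,
      year-from-date(local:deduced-date($photo, $people))
        - year-from-date(xs:date($people[@name = $s/@name]/@birthday))
    )[1]
  };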

One neat trick would be to calculate relative relationships: if the viewer is x and the subject is s, then what is s to x (great-great-grand-uncle, or even the more verbose: your father's father's father's brother)?

This calls for a Prolog engine, perhaps.

Possibilities include
  • PyLog, a Prolog translator for Python
  • A number of Prolog engines listed here, but the list looks very old.
Or RDF and Jena?
[sites which are undated are just terrible - I must remember that on my own site]
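Meanwhile, plain XQuery can at least manage the verbose form for the simple case where the subject is a direct paternal ancestor of the viewer - a sketch only, with an invented person/@id/@father structure; uncles, cousins and in-laws really do want a rule engine:

  (: chain of paternal ancestors of $x, nearest first - @father holds the father's @id :)
  declare function local:fathers($x as element(person), $people as element(person)*) as element(person)* {
    let $f := $people[@id = $x/@father][1]
    return if ($f) then ($f, local:fathers($f, $people)) else ()
  };

  (: verbose relationship of $s to $x, e.g. "your father's father's father" :)
  declare function local:relationship($x as element(person), $s as element(person),
                                      $people as element(person)*) as xs:string? {
    let $chain := local:fathers($x, $people)
    let $depth := index-of(for $a in $chain return string($a/@id), string($s/@id))[1]
    return
      if (exists($depth))
      then concat("your ", string-join(for $i in 1 to $depth return "father", "'s "))
      else ()
  };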

Sunday, February 05, 2006

eXist NDX web server

Paul has generously provided me with space on his server to install eXist in its Jetty configuration. This provides me with an external site which I can use with Google Maps. So far I've been putting up the family history site.

Developing on this site is much more immediate than with my typical FTP-enabled site. With the Java client, I can work directly on the database, editing code, uploading scripts and photos and executing test queries. There is also a rather safer web interface. Great for rapid development. I was working on improving the XSLT and looking at the problem of aliases (usually maiden names). I'm so used to using the oh-so-useful generalised = in XQuery that I forgot that Xalan is still on XSLT 1.0 with XPath 1.0, whereas XQuery uses XPath 2.0. This generalisation is the main difference to hit me, but it's so useful, and a pain to have to replace with the inferior contains(). In fact I'm still really struggling with aliases.
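For the record, the generalisation I mean is the XPath 2.0 '=' over sequences, so one test covers the surname and any number of aliases; a tiny illustration (the element names and collection path are only illustrative):

  (: '=' is existential over the sequence: true if $name equals the surname or any alias :)
  declare function local:matches($name as xs:string) as element(person)* {
    collection("/db/family")//person[(surname, alias) = $name]
  };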

One idea is to develop this family history with my brother Richard and my nephew Reddyn in New Zealand, whom I have sadly neglected. Here we could all work together in a very Web 2.0 collaboration. I hope they run with the idea.

Tag Clouds

Just been playing with generating tag clouds, based on the UCAS keywords which I obtained before Christmas. These are now integrated into the view of programmes, but perhaps all such relationships are worth reversing - here, keyword to programme - and the obvious way to do this is with a tag cloud. Ian reminds me of my play with PostScript, generating exam result lists in fonts scaled to the mark itself. The big plus there was that when students were sorted into ascending order of overall average, you could see at a glance how well each separate exam correlated with the overall result.

My tag cloud uses XQuery and XSLT: a first pass computes the tags and counts, then min and max counts and a scaling factor are computed from this, all in XQuery. XSLT then generates the cloud, computing a font size based on each count. All a bit slow, but it does the job.
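A sketch of the idea, done entirely in XQuery for brevity (the source collection and the 80%-280% size range are illustrative; the real version hands the tag counts on to XSLT):

  xquery version "1.0";

  (: first pass: distinct keywords and their counts - the source collection is illustrative :)
  let $keywords := collection("/db/ucas")//programme/keyword
  let $tags :=
    for $kw in distinct-values($keywords)
    let $count := count($keywords[. = $kw])
    order by $kw
    return <tag name="{$kw}" count="{$count}"/>

  (: second pass: min, max and a scaling factor mapping counts onto 80% - 280% font sizes :)
  let $min := min($tags/@count)
  let $max := max($tags/@count)
  let $scale := if ($max > $min) then 200 div ($max - $min) else 0
  return
    <div class="cloud">{
      for $tag in $tags
      return
        <span style="font-size:{ round(80 + ($tag/@count - $min) * $scale) }%">{ string($tag/@name) } </span>
    }</div>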

I've also just noticed a feature of most clouds I've seen - double description [Bateson]. That is, tags are not only scaled in size but increased in intensity as well. Newzingo uses a 7-point scale, not the continuous variation I'd assumed from my PostScript days, where both font size and red intensity increase together. Clouds differ in their link styles - the default isn't appropriate, since it's just a page of links, but just what style is best? I've set the title attribute on my tags, containing the course codes themselves for a quick check, but this seems to conflict with hover styling.

But I wonder how to use the comparative idea here - so that keywords in one faculty can be contrasted with those in another, in order to visualise the difference between the two faculties. Perhaps the trick would be a single map which can be dynamically switched between the two - perhaps even programmed to do so, like a flicker star-map comparator. I would need to get the two maps to register, so that there is a single merged map but words can be turned on or off - perhaps I'd have to drop the size scaling, since it would alter the position of the text, and just colour the text by coding for the datasets, varying the intensity in proportion to importance.

Since our new VC seems likely to be merging faculties, this might be a useful tool for him to see which ones to merge - quelle horreur - but it's just the way this visualisation might be used!

Friday, January 20, 2006

More Google mashups

More play with map-based mashups. This evening, using programmableweb's listing, I found Flash Earth, a Google Earth mashup. Here's the link to a narrow siq in the Wadi Rum area of Jordan, one of several with old inscriptions carved in the sandstone.

This memory game is a neat idea, using Yahoo images to populate a layout of images to play pairs (or memory).

But my favourite is Community Walk. Last night I started to create a walk around the university campus - just a single, rather bland view of our building; perhaps this is not such a good idea if we want to attract students. Later I started one for our trip to Jordan. Just three photos uploaded at present, and I'm uncertain about the location of one of these - I can't seem to find the location of Petra. I do wish I'd kept the GPS data, since I had the Garmin with me to check heights and distances, but at that time the advantages of geo-coding photographs just hadn't occurred to me. Maybe a student project to make a camera and GPS lash-up. What would be really nice would be to add a fluxgate compass too, so the photo is coded with the direction in which the camera was pointing as well.

Just found another UK mashup showing the location of all UK schools. It locates the nearest 20 to a specified location and highlights the marker from a list of school names and their Ofsted ratings (at least for independent schools).

Finally on this bit of research, I located Barry Hunter's work at nearby.org.uk, which offers a REST interface for conversion between a range of different coordinate systems, e.g. postcode to lat/long. This will be very handy for the FamilyHistory mashup, although the postcode is only resolved down to sector level - not enough accuracy for this application, and I suppose there's not much extra work in hand-coding this number of locations.

Tuesday, January 17, 2006

Beginner's Mashup

I've been playing with integrating maps into my family history photo album. The idea is to place each photo in time, place and (through the people in the photo) in the web of family relationships.

The first attempt uses an XML file of photo metadata, transformed with a simple XSLT stylesheet on the client side. The transformation produces a table of data with links to the raw JPEGs, map positions (courtesy of Multimap) and years (thanks, Wikipedia). No relationships yet. Hardly a mashup, since the context data is linked only with URLs.

The second attempt uses Google Maps. Now the map is embedded in the photo page, a much closer coupling afforded by the use of JavaScript. The step of including the Google JavaScript source is coupled with authorisation via URL parameters - a neat application of the Shanley principle. The XSLT caused me a bit of pain here when generating the JavaScript, and it's still not right: the transformed HTML works fine in IE6, and if generated client-side with Xalan, but it's no good in Firefox, which just spins.

I now have to think how the students will use and develop this example: their server is inside our firewall, so Google won't be able to issue a key. They need a server in the DMZ for this, I guess, but we need that for the new Internet Applications Development module in September this year.

Sunday, January 15, 2006

The title

The title is an echo of Alfred Russel Wallace's Line through the Malay Archipelago, which he postulated to explain the differences he observed in bird and animal species. The line runs between Borneo and Sulawesi, and between Bali and Lombok, with Asian species (such as tigers) to the west and Australasian species (such as marsupials) to the east. Nowadays the theory of tectonic plates, and of land bridges during ice ages, provides an explanation for this disjuncture.

My appropriation is hardly warranted: ARW is no relation of mine, and I'm a teacher of computing, not natural history. I have crossed the Wallace Line, however: a few years ago my wife and I circumnavigated in our yacht Perdika of Bristol. It was in fact quite a struggle. The current flow through all these straits is fierce, and we crabbed our way from Lombok to Bali pointing fully 40 degrees north of our track over the ground, and later, when motoring northwards up the strait, found the inshore counter-current frighteningly close to the Bali shore.

The title also reflects the nature of a blog, a time-line of writing. I'm interested in time-lines and the representation of complex event-spaces at present, partly as a domain to use in my second-year course on data structures, which this semester will be looking at XML and XML databases, eXist in particular. I've also started to do a bit of genealogical research into my personal 'Wallace Line' and discovered how addictive this detective work is. From census records available online at scotlandspeople.gov.uk, I find that my great-grandfather (or rather one of four, of course) Ebenezer Wallace was a wine merchant in Edinburgh in the 1880s, a fact which makes my own consumption of good wines now feel like a family duty.

The first word

I teach internet computing - therefore I must blog.