Thursday 26 March 2009

Drawing with Lines

This Chapter examines the nature of lines and how they can be used to create art using the line drawing facilities provided by Processing. Initially we look at the broadest possible description of a line and then get down to the nitty-gritty of making a program produce a required result by drawing lines.

Not all lines are straight. A geodesic, the shortest distance between two points on the surface of the Earth, is a curve. Add random noise to a line and it can take on an aspect of a lightning strike, flickering flames, a rough sea, rain falling on a lake and other representations of chaotic behaviour or liquid flow according to the frequencies present in the noise. Any kind of line, straight, curved or distorted, can suggest an edge, a boundary or a link between two locations.

Alternatively we may want a simplified view where the purity of straight lines or the intersections of many lines are used to represent underlying curvature or some more complex shape. A grid of lines can define a two or three dimensional space where equal spacing represents the world as we see it but other spacings can give the effect of non-linearity. We can show distances and dimensions that would otherwise be too small or too large to see. We can also employ the additional dimensions of light, colour, hardness, discontinuity (dots and dashes) and blur to achieve the effect we want.

There are few straight lines in nature and even fewer orthogonal straight lines. A straight line has its own special significance when used representationally. An icon of human-kind one might say.





















Satelight image of the nazca lines
from Wikimedia Commons NASA/GSFC/MITI/ERSDAC/JAROS, and U.S./Japan ASTER Science Team



The Line Drawing Functions

Having got to grips with coordinates, setting colours and suchlike you will probably find line drawing relatively simple. Unlike some less enlightened environments Processing has only a single line drawing function and three line attribute setters.

The line() function

Line(), is overloaded for drawing in 2D and in 3D. A line function with four arguments draws a line on the picture plane, the screen for current purposes.
   line(x1, y1, x2, y2)

draws a line from the point (x1, y1) to (x2, y2).

To draw in 3D we must first setup a 3D drawing environment. Two 3D environments are available and as detailed in XXXX they are invoked by passing an extra argument to the function that sets the window size at the start of the program. You can use either


  size(x, y, P3D);   or   size(x, y, OPENGL);
size() can also be called with a P2D argument to get faster 2D drawing than the default JAVA2D environment provides: size(x, y, P2D). At the time of writing P2D did not support all the facilities of the default mode for line drawing. Try its effect using the simple line drawing program we come to in a moment and check the line drawing and attribute setter functions on the Processing website for current details.

With a 3D environment in place use:
  line(x1, y1, z1, x2, y2, z2);
to draw a line between points (1) and (2) in three dimensional space. The extra coordinates z1 and z2 are the distances of points behind the picture plane. Their depth back from the front face of the screen is another way of thinking about it. If a line is located behind the picture plane and/or tilted in the depth dimension, it will be shown shorter than its length drawn flat on the screen.

The line() functions accept floating point values and can be called with floating point or integer coordinates. Fractional values are rounded to the nearest pixel but casting arguments to integer can be sometimes be useful to smooth jagged parallel line endings e.g:

   line(floatX1, floatY1, (int)floatX2, (int)floatY2); // smooth endings at X2, Y2


Line attribute setters

Three functions set the 'style' of the line that will be drawn by the next and subsequent calls to line(). The effect of calling any one of these functions remains in place until the same function is called again. To be able to revert to a previous value we need to make sure the old values are stored. Line style setters and many other functions that modify the drawing environment have a corresponding variable in the current instance of the Processing application interface, called 'g'. To store the current line colour, for example, we can write
  int oldColour =  g.strokeColor.
Accessing variables using 'dot' notation is discussed later, particularly in Chapter ZZ: Object Oriented Programming. To revert to the old colour write
  stroke(oldColour);
The facility to store and to revert to old settings allows a section of code or a function to draw using settings applicable to the task in hand and restore old settings afterwards - an aid to abstraction discussed later in this Chapter in connection with parametric graphics.

The stroke() line colour setter accepts the same colour and alpha opacity values as the other Processing colour setters detailed earlier in XXXXX. Two other setters affect line drawing: strokeWeight() sets line thickness, strokeCap() the shape or relative location of the end of a line. The program in Listing CC.1 demonstrates the effect of the line style setters.


/* set window size and drawing environment */

// declaring constants static final makes sure they
// cannot be altered by mistake later in the program
static final int WIDTH = 200;
static final int HEIGHT = 200;
static final int OS = 20; // lines offset from edges

void setup() {
// size(WIDTH, HEIGHT, P2D);
// PD2 mode not fully functional in Processing 1.0.2
// so use:
size(WIDTH, HEIGHT);
}

/* draw in continuous mode */
void draw() {
// fill the window with charcoal grey
background(0.25 * 255);

/* draw a 45 deg line top-right to bottom-left */
// set line colour to greyscale white
stroke(255);
// set line width to 12 pixels
strokeWeight(15);
// make the ends of the line squared off
strokeCap(SQUARE);
// draw the line
line(WIDTH - 2*OS, 2*OS, 2*OS, HEIGHT - 2*OS);

/* draw two lines same length, parallel to it */
// rounded ends
strokeCap(ROUND);

/* draw a red line 20 pixels wide above */
strokeWeight(23);
stroke(255, 0, 0); // colour red
line(WIDTH - 3*OS, OS, OS, HEIGHT - 3*OS);

/* draw a green line 20 pixels wide below */
stroke(0, 255, 0); // colour green
line(WIDTH - OS, 3*OS, 3*OS, HEIGHT - OS);

/* draw a blue line 50% opaque, width 20,
top-left to the mouse cursor */
stroke(25, 25, 255, 255 / 2);
strokeWeight(25);
// extend line past cursor point
strokeCap(PROJECT);
line(30, 30, mouseX, mouseY);
}

Listing CC.1











Figure CC.2 Listing CC.1 running.




Running the program note the effect of the attribute setter functions strokeWeight() and strokeCap() for changing line thickness and the shape or relative location of the end of a line as shown in Figure CC.1. Use the mouse to move the blue line over the other lines and note alpha transparency generally and where the lines intersect. Graphics behind an area of transparent colour may appear lighter or darker according to relative brightness values. Try changing the blue line to white, stroke(255, 255, 255, 255 / 2); colours behind are now lightened.

Parametric Grids


The effect of drawing a single line tends to be minimal and it is more likely that many lines will be needed to produce a required effect. In this section we look at a strategy for doing this that extends the repetition facility provided by loops.

Another name for arguments is parameters and parametric graphics is a common feature of vector graphics applications. Using this approach a drawing task is abstracted (taken out) from the main part of the program and done by passing arguments to a function that draws the graphic. Many similar objects can be drawn with different sizes, locations, colours, and other properties determined by values passed to arguments.

Initially we experiment with parametric grids and to do this with a minimum of work the initial step is to set up a basic grid drawing program for testing them out. Optionally the basic structure can be extended to produce actual drawings that include grids. Listing CC.2 shows the outline:


/* set window size and drawing environment */

static final int WIDTH = 900;
static final int HEIGHT = 600;

void setup() {
size(WIDTH, HEIGHT); // default environment
doDrawing();
}

/* draw the drawing */

void doDrawing() {
// charcole grey background, again!
background(0.25 * 255);
// white grid lines
stroke(255);
// one pixel wide
strokeWeight(1);

// draw a grid with number of cells xCells * yCells
int xCells = 13;
int yCells = 19;
drawGrid(0, 0, WIDTH, HEIGHT, xCells, yCells, true);

}

void drawGrid(int left, int top, int width, int height, int xCells,
int yCells, boolean drawOutline){

}

Listing CC.2


As it stands the outline will compile and run but does nothing except opening a window with a grey background. Sections of functionality will be factored off using calls to drawGrid()to draw as many different grids as we require. Such power should be used carefully so the first thing is to decide exactly what we want a drawGrid() function to do and make sure all drawGrids variations we write do just that.

Arguments left and top are where the top-left corner of the grid will be positioned. Width and height specify the size of the grid in pixels, xCells and yCells the number of areas the grid is divided into in the horizontal and vertical directions respectively. Cells are to be made to fit the grid exactly by distributing any odd pixels evenly across cell widths and heights. If draw outline is passed true a one pixel wide line is to be drawn around the grid within the size given. All this sounds complicated but is actually very easy when we divide up the work using functions.

Drawing grids in 2D


The code for a 2D drawGrid() can be something like this:

void drawGrid(int left, int top, int width, int height, int xCells,
int yCells, boolean drawOutline){

// precondition for drawing the grid
if (width <= 0 || height <= 0) return;

// compute the coordinates of the rightmost an bottom pixels
int right = left + width - 1;
int bottom = top + height - 1;

// draw the grid
drawVertGrid(left, top, right, bottom, width, xCells);
drawHorzGrid(left, top, right, bottom, height, yCells);

if (drawOutline){
// save current line width
float curWeight = g.strokeWeight;
// draw outline with a line 1 pixel wide
strokeWeight(1);
line(left, top, right, top);
line(right, top, right, bottom);
line(right, bottom, left, bottom);
line(left, bottom, left, top);
// restore line width
strokeWeight(curWeight);
}
}
Listing CC.3

Preconditions for execution

Writing functions it is wise to check that all possible values passed as arguments can be dealt with by the code in the function body. What happens, for example, if drawGrid() is called with zero or negative dimensions passed to width and/or height?

If we see there is a problem a number of options are available. In this case a fully justifiable option is to simply ignore it. We are not trying to create a flawless machine and an emergent property of a program may turn out to be a spectacular work of art. On the other hand, if the code simply does not work a lot of time can be wasted trying to track down the bug.

Another option is to pass the problem on to other functions. Perhaps the grid will be drawn with all or part of it off the screen. This eventuality can be passed on for line() to deal with.

The third option is to write code at the start of the function that checks input is within range. These checks are known as preconditions: execution of other code in the function body is conditional on the checks being passed. There are three ways of dealing with a failed precondition: the function can return immediately to the caller; likewise but it returns a Boolean or some other value indicating success or failure; alternatively an error condition can be created. Java error conditions are dealt with by throwing and catching exception objects, certainly beyond the scope of this Chapter. For drawing grids a simple conditional return statement is appropriate:
      if (width <= 0 || height <= 0) return;

Note that other problems such as zero or negative numbers of cells are passed on to the grid-line drawing functions.

Minimizing coding

Another useful strategy is to avoid writing code that does the same thing more than once. This makes the code run faster but more importantly can make it shorter and easier to read. We are going to draw numerous lines from the left to the right of the grid and from its top to the bottom so compute right and bottom before doing anything else:
      int right = left + width - 1;
int bottom = top + height - 1;


The next task is to draw the gridlines, done by two functions:
      drawVertGrid(left, top, width, bottom, xCells, constrain);
drawHorzGrid(left, top, right, height, yCells, constrain);
Finally a one pixel wide outline is drawn around the grid if this option is set having first saved the current line thickness, as shown in Listing CC.3. Drawing the outline, and in calls to the drawing functions, the values of right and bottom computed on entry save numerous calculations.

So, you may be thinking, if working out this stuff early is such a good thing why not do it in the doDrawing() function, pass these values to drawGrid()and maybe make some savings in other functions called from doDrawing()? There are a few reasons for not doing this but the most important relates to abstraction. Working in doDrawing() we should not have to consider the needs of drawGrid() or any other functions called, only what they can do to get the drawing done in terms of the size, shape and location of the objects drawn.

Drawing uniform cells

The tasks to be done by the grid-line drawing functions are: check preconditions, compute cell size, initialise the drawing position to the first grid line and finally, loop while the drawing position is less than the last pixel, drawing each grid line and incrementing position by cell size. Coding these tasks to draw vertical grid lines gets:


void drawVertGrid(int left, int top, int right, int bottom, int width,
int xCells){

// precondition: if negative or zero cells exit now
if (xCells <= 0) return;

// compute cell width
float cellWidth = (float)width / xCells;

if (cellWidth < 1)
cellWidth = 1:

for (float x = left + cellWidth; x < right; x += cellWidth){

// x is the x-coordinate of the vertical line drawn in the window
line(x, top, x, bottom);
}
}


There are two points to watch out for computing cellWidth. The first one is division by zero. Java and Processing allow division by zero for floating point calculations without throwing a fit but an INFINITY result is usually best avoided (do it with integers and a divide by zero exception is thrown). In this case a zero divisor has already been excluded by the precondition: xCells must be greater than zero. Negative and zero values for width were excluded by drawGrid() making certain that cellWidth is positive. If cellWidth was zero or negative adding it to x in the loop would never make x equal to or greater than right and the loop would not exit.

Writing mission critical software - perhaps designed to avoid crashing our lauch vehicle in a nearby swamp or bringing a spacecraft to a perfect landing two miles below the surface - it would be wise to do similar checks in both the caller and the called function. As things stand a single check will do nicely. Excluding a negative value for xCells in drawVertGrid() rather than drawGrid() includes the option that other versions of drawVertGrid() will process such values to get a particular visual effect.

The second point is that Java’s multipurpose division operator returns an integer result when both operands are integers. The drawGrid() specification requires that cells fit the grid. To make this work, cell dimensions need to be floating point values so that when added together fractions of a pixel periodically add up to a whole one. The integer argument width is typecast to(float) getting a floating point result complete with any fractional part. Lastly, if cellWidth is purely fractional, the entire grid will consist of minutely overlapping lines and may take a while to draw. In this case cell width is set to unity.


The drawing loop


Using a for loop keeps lines of code to a minimum. You will recollect that the format of a for loop is generally:

for (initialize loop variable;
condition for entering or re-entering the loop;
action to be done at the end of each pass through the loop){

// stuff to be done in the loop goes here

} // end of the loop

At the start of the loop a line at the left of the grid is not required so the loop variable x is declared and initialised to draw the first line at left + cellWidth . Declaring the loop variable in the loop declaration makes it accessible only within the loop, a safety advantage over a while loop in many cases because its value after exit can be somewhat obscure.

Java for loops are not the simple counting loops seen in some other languages. Basically a for loop provides a concise way of writing a while loop by filling in the items inside the round brackets. Ideally the condition for entry or re-entry should specify a range of loop variable values that allows access to the loop - not a single value that when reached by incrementing or decrementing the variable terminates looping: loop while x != right; x += cellWidth, for example. Using an inequality exit condition there is a danger that x will be initialized greater than right permitting unplanned entry or that, due to a fractional initial value and/or an increment greater than or smaller than unity, x steps over an assumed exit value. Result: the loop loops-for-ever. The condition x < right avoids these problems.

The last part of the for loop declaration is executed after each pass through the loop. Here it adds cellWidth to the loop variable using the shorthand += operator to take it to the coordinate of the next grid line (approximately). In this case 'stuff to be done in the loop' is simply to draw the current line going top to bottom. The function that draws horizontal grid lines is similar:

void drawHorzGrid(int left, int top, int right, int bottom, int height,
int yCells){

// precondition: if negative or zero cells exit now
if (yCells <= 0) return;

// compute cell height
float cellHeight = (float)height / yCells;
if (cellHeight < 1)
cellHeight = 1;

for (float y = top + cellHeight; y < bottom; y += cellHeight){

// y is the y-coordinate of the horizontal line drawn in the window
line(left, y, right, y);
}
}

Drawing some grids

All done! Make a separate copy of Listing CC.2 by starting a new sketch and pasting the code into the window. Replace the dummy drawGrid() stub with the one in Listing CC.3, type in the drawVertGrid() and drawHorzGrid() functions, save the file with a name of your choice and run the program.
Try changing the number of cells in either or both directions by altering the variables shown in Listing CC.2, make the grid larger or smaller in either or both directions, change its left, top origin and so on. If you only want grid lines running in one direction pass zero for cells in that direction. Optionally, put additional calls in the Listing CC.2 code to draw several grids in different locations, perhaps with different colours and opacities using loops to change line style for calls to drawGrid(). Figure CC.3 shows an example.


Figure CC.3 Effects produced by drawing multiple and overlapping grids.


Drawing unequal cell sizes


Grid line spacing can follow any rule: logarithmic, quadratic, cubic, Fibonacci, etc. or can be random. Another version of drawVertGrid() draws grid lines with logarithmic spacing:


void drawVertGrid(int left, int top, int right, int bottom, int width,
int xCells){

if ( xCells <= 0 ) return;

int m = xCells + 1;
float logCells = log(m);
for (int i = 2; i <= m; i++) {
int x = left + round(log(i)/logCells * width);
line(x, top, x, bottom);
}
}


In this case the loop variable is an integer and values from 2 to m are submitted to the computation to get offsets from the grid's left coordinate. Working out line intervals relies on logm(x) = log(x) / log (m). Log(1) is zero, the grid's left origin, so m is (numCells + 1). Alternatively you may wish to ignore these facts and just look at the nice dynamic that logarithmic spacing gets for any number of lines drawn across the face of a grid.


Figure CC.4
Logarithmic and Circular Grid parametrics.


Using a rule more complex than logarithmic the easiest way to make cell sizes follow the drawGrid() specification for a specified number of cells with an exact fit to the grid is to compute relative sizes and store them them in an array. If predetermined relative sizes are needed to give an effect these values can be assigned when the array is created. The drawing loop simply uses the ratio between the sum of the sizes in the array and the dimension of the grid to work out line positions. Doing this needs slightly more code that the grid-line drawing functions we have investigated at so far and we can return to this variation later.

Posting Computer art and graphics section

Having wasted hours pasting text and graphics into the Blogger editor only to have have all the formatting collapse into bold italic decided to post Chunk 36 in bits.
Will do this starting from the last section and blog title as headings.

Monday 16 March 2009

Sorting Blog

Time to try posting something more substantial. Here is one prepared earlier.

This article discusses the merits and otherwise of comparison sorting algorithms in general use based on timings and comparison count data. It was found that performance of a fully optimized quicksort degraded when faced with so called median of three killer data compared with random ordered data but not to the extent that might have been expected. Mergesort was able to sort data with this configuration substantially faster than random data out performing all other algorithms tested and was only marginally slower than quicksort on random data. As noted elsewhere mergesort has an advantage over quicksort when there is any relationship between the key used to obtain an existing ordering and the required key and it is suggested that its additional space requirement is unlikely to be a problem where memory is not at an absolute premium. Mergesort sorts a 1,024,000 item list with a median of three killer configuration in less than 1/3 the time taken by a library function apparently engineered to deal with this type of data and appears to be a median of three killer-Killer.

Comparison Sorting: The State of the Art

One of the problems I have set myself currently is to implement an algorithm to comparison sort a list of random ordered data faster than quicksort. That difficulty comes from a Delphi Magazine article [1] where I suggested that it could be done. This blog reports findings from the article and examines relative performance of quicksort and mergesort to prepare for a look at sorting algorithms written in Java. You never know, something might just drop out.

The main points of the Delphi article were that memory usage was no longer the critical factor it had once been for general sorting on PCs and workstations and that mergesort was the algorithm of choice for sorting certain large data sets. Naturally the scenario under consideration related to Delphi where heap management did not include compaction and the machine stack provided a low overhead location for a buffer, mostly irrelevant to Java but not so the other findings. A library class TList came with a Sort method that accepted a comparison function and used a concise but unoptimized version of quicksort. TList exposes its underlying dynamically allocated array as a read property so that any sorting method can be applied to it.

Mergesort beats quicksort even when carrying the full key . . .

The article quoted a case where a list in StaffNumber within Name order was extracted from a database and sorted - StaffNumber within Name within DepartmentNumber. Numbers were integers and Name was a single-word string. Actual surnames from the UK were used, occurrence weighted based on some token occurrence data. As a stable sort, mergesort could re-sort the list on a single key. Around an 80% sort time reduction was found: mergesort on DepartmentNumber only vs. quicksort on the full key.
An interesting point that came to light was that mergesort still showed better performance even when the full key was used by both algorithms. As conjectured at the time and since confirmed, mergesort's stable characteristic reduced the number of times nested keys were accessed to decide a comparison. Timings for these unoptimized sorting algorithms running on a current desktop machine, maximum department size 500, are:

Mergesort can beat quicksort hands down for many sorting tasks but for a data set with existing ordering unrelated to the required key quicksort has the edge.

Mergesort vs. Quicksort

Mergesort and quicksort are the current algorithms of choice for in-memory comparison sorting. They both divide the list to be sorted into sublists and this approach makes adaption for multiprocessing relatively simple. Quicksort, devised by C.A.R. (Tony) Hoare [2] uses relocation of elements above or below a partition value, initially for the complete list and subsequently for each partitioned sublist. Mergersort, sometimes said to be based on an idea proposed by John von Neumann, uses repeated binary division of the list into sublists, sorts small sublists and merges adjacent sorted sublist pairs.

Quicksort works in place and requires little additional storage but is not stable: relative locations of same key elements are modified. Stable variants of quicksort need storage areas outside the list and using a buffer smaller than the complete list will likely get worse performance than mergesort. Mergesort is inherently stable but an implementation that provides efficiency approximating to quicksort needs a buffer to store items during a merge. Merging with minimum element relocations and therefore best efficiency requires a buffer half the size of the complete list.

Quicksort Nemeses

Quicksort has an average time complexity of O(n log n): on average it will take 2n ln(n) comparisons to sort a list. But, if the heuristic for choosing a partition value or 'pivot' is to use the first element of the list, basic quicksort will use minimum (n2 – n) / 2 comparisons to sort a reverse order list or to check that an in-order list does not need any elements moving. This is O( n2 ) 'quadratic' complexity, characteristic of the most basic sorting algorithms. Sorting 10000 elements in reverse order takes around 49,995,000 comparisons as opposed to the 184,207 average. An obvious solution is not to use the first element. Why not the middle element for example? The problem with setting the partition value in this way is that there will always be some list arrangements that get quadratic complexity.

Various schemes have been proposed to select a good pivot, including random selection. The more complex a selection scheme the higher its overhead is likely to be and it is hard if not impossible to show that a particular scheme will get complete immunity for all possible input configurations. Median of three selection carries a low-overhead and appears to get superior partitioning for a wide range of list configurations [3]. Three elements are selected, generally the first middle and last element, compared and/or sorted. The median supplies the partition value. The quicksort that I need to beat in the Faster Than Quicksort game uses median of three engineered to get an average overhead of less than one comparison per sublist.

In 1997 David Musser presented an analysis [4] that showed there were a significant number of list configurations that would drive a median of three selection quicksort quadratic: 'Median of Three Killers'. His remedy was to make the sorting algorithm monitor its recursion depth introspectively and if a preset depth was exceeded revert to a guaranteed O(n log n) sort, heapsort or perhaps mergesort. The dual mode algorithm was to be called 'introsort'. More recently it has been suggested that engineered median of three killers could be used to bring down a web-server in a denial of service attack. The current C++ STL sort method is an introsort implementation that reverts to heapsort making it easy to include an example of this approach in a test application, maybe. Ralph Unden [5] has posted a Java class that outputs lists of median of three killer integer values and his paper provides further explanation of the problem with a minimum of mathematical analysis.

Relative Performance

Time taken to sort a list results from the data item comparisons and assignments used together with the amount of supporting processing carried out. Supporting processing includes computation of factors relating to the current list, updating indices, function calls and so on. Counting comparisons etc. gets a platform independent indication of performance but doing this can only be indicative. Suppose that swapping two items is done using a register as a temporary location for example, does this count as two assignments or three? Processing will take about 2/3 of the time needed to swap the items with one item stored temporarily in a buffer. Changes to the ratio of comparison cost to assignment cost and the effect of CPU cache misses are other factors.

The method preferred here is to show actual timings taken under the rules of the Faster Than Quicksort game, which I think give a realistic assessment of relative performance at least for native code on the PC. These rules are simple and designed to make tests relate to working with records and objects rather than lists of numbers. The dynamic is different because even basic comparisons take substantially longer than assignments.

The rules are: Sorting is to be done using a comparison function so that complex keys can substituted. The list holds references to data objects with at least a 32 byte memory footprint so that substantial portions of the in memory arrangement are not covered by a single CPU cache line. Object memory location must not have any sequential relationship with sorted list location. Armed with the timings we can then try to find out what went wrong using comparison counts and other statistics.

Algorithm Timings

Table 1 shows timings for sorting random order lists on a single integer key in milliseconds rounded to 3 decimal places. The rel. columns are performance relative to the fastest (tail call optimized) tcoQuicksort figures in the first column. The test machine is a fairly typical desktop: dual core Pentium® at 3.3GHz, 2GB RAM. For each list size, sortings for 50 different lists, 25 generated using a Random() function and reversing the order of the list, are timed. Each timing uses an initial pass which is discarded and between 5 and 10 timed passes, fewer passes being used for larger lists (no worries, the test application does all this not me). The figure shown is the average but the maximum and minimum times were also recorded, any marked discrepancy leading to retesting to confirm or reject the initial timing.

Results were pretty much as expected. On random data all the sorting methods do considerably better than O(n log2 n) would suggest. No doubt other list configurations would restore the balance. The penalty for using full recursion over tail call optimization is quite small maximizing at 9% for a 4000 item list.

Stable quicksort with its huge list-size buffer finally creeps ahead of its rivals for the largest list. It seems likely that this effect results from separate location of items equal to the partition element by this implementation, excluding them from the sublists passed to the next level. Random generation puts a proportion of records with equal keys in all the lists and for a large list their number may become significant. The maximum key value used is size – 1, hopefully getting a similar proportion of same keys for all list sizes. An in-place quicksort can be implemented to provide separate location and exclusion [6] but some additional overhead is inevitable.

The timings shown are probably a bit unfair on mergesort. Currently the problem is not lack of merge optimizations but may be the assembly language block-move routine used to deal with out of order sublists, still the same as for the Delphi Magazine article. It was always quite crude but changes to CPU architecture since P III have made it even less suitable. Java sort functions use mergesort with a list sized buffer and it will be interesting to see whether performance can be improved using current optimizations and a half-size buffer.

The data for unoptimized TList.Sort shows that optimization is always worthwhile for random data particularly for smaller list sizes. Processing an in-order or reverse order list, Sort fairs better because the partition element is not swapped out to reduce partitioned elements by one. The effect of leaving the partition element in place, a design decision made in [6], needs further investigation.

Timings for the C++ STL sort() function are not shown. It was clearly going to be impossible to get sort() from the current version of the STL to compile on the author's system without far more work than time available permitted, if at all. Data obtained using sort() from the template library supplied with Borland Developer Studio 2006 was so far out of kilter with other timings that it was felt that some effect other than the quality of the algorithm must be responsible. This is a pity because a comparison of timed performance on median of three killer lists would have been informative. As a substitute numbers of calls to the comparison function provide a stand in.

Median of Three Killers

Results are show in Table 2. The four left-hand columns show calls for lists of random data. Comparisons for median of three killer configurations are on the right.

Killer configurations have an adverse effect on tcoQuicksort and TList.Sort. The quicksort used by TList.Sort shows almost quadratic behaviour, a bit unfair seeing the implementation tested uses a middle element pivot not median of three. The effect on tcoQuicksort approximately doubles the number of comparisons needed to sort a list of the same size. The fully recursive version suffers the same fate: its implementation is identical except for the tail call optimization. Taking list size to the next level, sorting a list with 2,048,000 elements tcoQuicksort uses 47,299,055 comparisons on random data and 106,537,288 comparisons on a killer sequence. The ratio remains approximately constant and is not quadratic. Mergesort and the library sort() use fewer comparisons on a killer configuration than on random data, mergesort drastically so. Actual timings showed mergesort sorting the list on an integer key in better than 1/3 the time taken by the library function, tallying with the ratio of comparisons used, and this margin is of course even more substantial for a complex key.

Conclusions

Quicksort provides fast sorting of randomly ordered data but can be affected adversely by certain data configurations. Optimizations may contain this problem for certain configurations or aggravate it for others. For the reason given tcoQuicksort uses around 66 million comparisons to sort a two million item list in reverse order as opposed to about 40 million used by TList.Sort.

Mergesort is a better all-rounder and can sort data ordered with an existing relationship to the required key faster than quicksort. Apparently it is also a Median of Three Killer-Killer! It may be possible to improve performance of the implementation tested here by writing a better block move routine but, because items located at opposite ends of the list to that required are moved late, mergesort can never surpass quicksort for sorting random ordered data.

It seems likely that best proofing against adverse data combined with best sorting speed can be achieved using a tail call optimized quicksort that checks for out of balance partitioning after the initial call and subsequently. An immediate switch to mergesort would reap the benefit of its superior efficiency on killer and other adverse configurations. A mergesort buffer can be allocated on the machine stack at any time during processing: a buffer only relates to a single recursion level and space beyond the stack pointer can be used by the merge routine on the way back to the root of the merge tree. The allocation overhead is minimal, ideally only a matter of checking stack headroom. When there is insufficient headroom to to allocate an optimum size buffer an alternative merging routine that uses available space can be called. If a language does not provide intrinsics for efficient stack allocation perhaps it should do so.

The downside of using two algorithms is that code size is increased. It may be that double sorting time on some data configurations for large lists is acceptable using an optimized quicksort. Alternatively random selection of median of three elements might provide a simple alternative. A local randomization scheme need not be elaborate, say setting the initial seed using a value from the current machine clock and a linear congruential generator on a smallish prime. This approach needs further investigation.


References:

[1] Martin Humby. Buffered Sorting and Stack Buffers. Delphi Magazine Issue 113: January 2005.
[2] C. A. R. Hoare, Quicksort. Computer Journal Vol 5 (1): 10-15. 1962.
[3] Robert Sedgewick, Implementing Quicksort Programs, Communications of the ACM Vol 21(10) 847–857 October 1978.
[4] David R Musser, Introspective Sorting and Selection Algorithms, Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY 12180, U.S.A. 1997.
[5] Ralph Unden, > here < and Implementierung und Untersuchung des introspektiven Sortieralgorithmus Introsort, Berufsakademie Stuttgart - Fachrichtung Informationstechnik, student paper April 2007
[6] Jon L Bently, M Douglas McIlroy: Engineering a Sort Function. Software – Practice and Experience, Vol. 32(11), 1249-1265 November 1993.

No Chance!

No chance! - of it working that is. Still get the blank page. With the XML stuff in the code, looks like another 'impedance mismatch' to me. In this case between the XHTML and Blogger's blog engine rather than an application trying to talk to some geriatric DBMS. To paraphrase James May: "Microspeak - Impedance mismatch, English - Cobblers!".

The fact that it works in Preview but not otherwise takes this into the really annoying category but got more important things to do than mess around with it. Found a stretchy version of Minima with the offer of replacing the title bar with a graphic. Computer generated sun, chaotic trees - seems appropriate maybe? What do you think?

Saturday 14 March 2009

Changing Column Widths

Don't intend to access this blog from my mobile or would probably have signed up to Twitter so decided to change the column widths to something more reasonable. Clicked on 'Customise' (My dictionary spells it Customize but what can you expect) Edit HTML. Copy the code for Minima, change some numbers, copy it back, click on Preview and . . . Perfect! Just what I wanted! Nearly fell off chair, but then: View Blog: nothing, nix, nil, a blank page.

Go back do it again, scratch head, get a cup of tea. Some time later . . . eventually get an error page: 'This blog does not have any posts. Please add at least one post in order to preview.' Well, they asked for one so here it is. Will it work now? Wouldn't put any money on it!