Once I had my input board up and running I wanted to measure the latency of the signals being generated. The simplest way to go about this was to generate a set of probe results using different point ordering at different feed rates.
I ran height probes on the same set of points in a forward (from origin to edge of work area), reverse (edge of work area back to origin) and random sequence. The reason for using different point ordering is simple - it makes it easy to spot cumulative errors. If everything is working correctly the Z values recorded for each X/Y point should be very close to each other regardless of when that point was probed. If errors are accumulating (due to missing steps for example) the forward and reverse probes would show opposite slopes and the random probe would look, well, random.
The scripts and tools I use here don't really justify a complete repository by themselves so I've put them in this Gist. To test the Z axis I used a script called probearea.py to generate the g-code to do the probe and another script called levelcheck.py to generate the heightmap images.
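The real generator is probearea.py in the Gist. Purely as an illustration of the three orderings, here is a minimal sketch that builds one grid of points and returns it in forward, reverse and shuffled order. The function names, the grid layout and the fixed shuffle seed are assumptions made for this example, not taken from the actual script.

```python
import random


def probe_points(x_count, y_count, spacing):
    """Grid of (x, y) probe points, ordered from the origin outwards."""
    return [(ix * spacing, iy * spacing)
            for iy in range(y_count)
            for ix in range(x_count)]


def orderings(points, seed=0):
    """Return the same points in forward, reverse and shuffled order."""
    shuffled = list(points)
    # A fixed seed keeps the "random" run repeatable between sessions.
    random.Random(seed).shuffle(shuffled)
    return {
        "forward": list(points),
        "reverse": list(reversed(points)),
        "random": shuffled,
    }


if __name__ == "__main__":
    for name, pts in orderings(probe_points(3, 2, 10.0)).items():
        print(name, pts)
```

Because all three sequences contain exactly the same points, any difference between the resulting height maps comes from when a point was probed rather than where it is.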
So, assuming that all three probe files look the same I can combine them to average out differences in latency. If I run multiple sets at different feed rates I could then calculate the actual latency time by seeing what the difference in travel distance is across the different samples. It seemed fairly simple apart from a bit of number crunching at the end of the process, and everything started out fairly well. Here are the probe results I got with a probe feed rate of 64 mm/min:
As you can see the results are fairly consistent. In the image the black areas are the 0 level, blue levels are below 0 and green above - the display range is -1.6 mm (brightest blue) to 1.6 mm (brightest green) but the results for these probes only go to +/- 0.8mm. My bed seems to bow in the middle across the X axis which matches with how it looks.
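Going back to the latency calculation itself, the number crunching can be sketched as follows. This assumes a simple model in which the recorded contact height is the true height minus (feed rate x latency), so the difference between two feed rates isolates the latency. The function name and the sample heights are hypothetical, not measured values.

```python
def estimate_latency_seconds(feed_a, z_a, feed_b, z_b):
    """Estimate probe trigger latency from two averaged probe runs.

    feed_a, feed_b: probe feed rates in mm/min (must differ).
    z_a, z_b: averaged recorded contact heights in mm for each run.

    Assumed model: recorded_z = true_z - feed * latency, so the tool
    travels further past the surface at the higher feed rate.
    """
    if feed_a == feed_b:
        raise ValueError("feed rates must differ")
    latency_minutes = (z_a - z_b) / (feed_b - feed_a)
    return latency_minutes * 60.0


if __name__ == "__main__":
    # Hypothetical averages: 10 microns more overshoot at 64 than at 32 mm/min.
    print(estimate_latency_seconds(32.0, -0.100, 64.0, -0.110))
```

With more than two feed rates, a least-squares line through (feed, height) would be the more robust choice. The two-point form is shown only because it makes the model obvious.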
Unfortunately it didn't take very long to run into issues. I was unable to successfully complete a probe at 128 mm/min. I use the 'G38.2' instruction to probe and set the limit to -1.6mm; this instruction stops the program with an error if the limit is reached before the probe makes contact. I put this error down to the latency being too large to register the probe contact before the tip had passed the limit. I decided to skip ahead and run the set at a 32 mm/min feed rate. Here are the results:
Now this set of images matches up with what you would expect if steps were being missed on the Z axis - specifically if more steps are being missed during the retraction movement than on the probe movement.
This is a bit confusing because I didn't change the retraction feed rate across the probe sequences, only the probe feed rate. One possibility is that steps have been missed all along but at a relatively equal rate in both directions - this would result in a consistent (but inaccurate) set of probe values. Running the probes at the slower feed rate results in fewer missed steps in that direction and leads to the accumulation of errors that is very visible in these images.
The image above shows how this can happen. The probe starts 5mm above the surface and moves down until it touches. If the slow movement of the probe doesn't miss any steps it does actually move the full 5mm. When the probe retracts at the faster step rate and misses some steps it only physically moves 4mm but LinuxCNC thinks it has moved the full 5. When the slow probe occurs again it finds that it is only 4mm above the surface making it seem as if the surface is 1mm higher than it really is. Of course these numbers are exaggerated, in reality it is only a fraction of a millimeter that is being lost with each sequence but over a large number of operations these add up and make it look as if the board is sloping in one direction.
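The same mechanism can be reproduced with a few lines of simulation. This is a simplified sketch, not a model of the real machine: it assumes each move loses a fixed distance (in mm) rather than a number of steps, and the function and parameter names are made up for the example.

```python
def simulate_apparent_heights(cycles, clearance, retract_loss, probe_loss=0.0):
    """Return the surface height the controller records on each probe cycle.

    The real surface is at Z = 0. 'believed_z' is where the controller
    thinks the tip is, 'actual_z' is where it physically is.
    """
    believed_z = clearance
    actual_z = clearance
    recorded = []
    for _ in range(cycles):
        # Probe down: the tip physically travels actual_z to reach the
        # surface, but the controller counts any lost distance as travel too.
        believed_z -= actual_z + probe_loss
        actual_z = 0.0
        recorded.append(believed_z)
        # Retract to what the controller believes is the clearance height.
        commanded = clearance - believed_z
        actual_z += commanded - retract_loss
        believed_z = clearance
    return recorded


if __name__ == "__main__":
    # The exaggerated numbers from the text: 5 mm clearance, 1 mm lost per retract.
    print(simulate_apparent_heights(3, 5.0, 1.0))       # [0.0, 1.0, 2.0]
    # Equal loss in both directions: wrong, but consistently wrong.
    print(simulate_apparent_heights(3, 5.0, 1.0, 1.0))  # [-1.0, -1.0, -1.0]
```

The second call shows why equal losses in both directions can hide the problem: the readings are offset but do not drift, which is the situation suggested for the 64 mm/min run.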
The incident that caused me to replace the control board in the first place may have caused some damage to the Z axis mechanical components so I needed to run some more tests to isolate the problem.
To do this I set up a sequence that would probe from a center point to positive and negative limits. This would give me a similar scenario - a controlled probe move in one direction followed by a rapid return to the center point. The physical set up is a bit of a kludge as you can see in the photo below. The distance between the limits isn't of great importance though - all I need to do is to see if I can get a consistent measurement of that distance over multiple operations.
Rather than use a script to generate gcode for me I wrote this one by hand - you can see it below. It requires you to manually jog the probe to approximately the center of the probe target before running the program. You also have to adjust some variables at the start of the script to set the feed rate to probe at and the approximate distance between the probe targets - this value should be slightly larger than the actual distance.
G20 (Use Inches)
G91 (Set Relative Coordinates)
G17 (XY plane selection)
(Variables)
#500=1.2599 (Feed rate in inches/min, about 32 mm/min)
#501=3 (Distance between points)
#502=30 (Number of probes in a sequence)
(Subroutine for probes: #1 = X travel, #2 = Y travel, #3 = feed rate)
o100 sub
  #100=0
  o101 while [#100 LT #502]
    G38.2 X#1 Y#2 F#3
    G28
    #100 = [#100+1] (increment the test counter)
  o101 endwhile
o100 endsub
(Save the current position)
G28.1
(PROBEOPEN RawProbeLog.txt)
(Probe in positive Y direction)
o100 call [0] [#501 / 2] [#500]
(Probe in positive X direction)
o100 call [#501 / 2] [0] [#500]
(Probe in negative Y direction)
o100 call [0] [-#501 / 2] [#500]
(Probe in negative X direction)
o100 call [-#501 / 2] [0] [#500]
(PROBECLOSE)
M02
The G28.1 command saves the current position and all movements are done relative to this point. The subroutine o100 performs most of the work for the probe - it performs the number of cycles defined in #502 where each cycle probes in the X/Y direction specified and then uses a G28 command to return to the stored mid-point. For this test I did 30 cycles (this comes close to the travel distance in the Z axis test) in both directions of each axis - Y positive, X positive, Y negative and X negative.
To process this data I didn't use a script, I simply imported the log file into Google Charts and calculated the distance of each probe point from the first one done (it is only the difference we are interested in).
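If you would rather script this step than use a spreadsheet, something along these lines would do the same job. It is a sketch under stated assumptions: each line of the log is taken to be whitespace-separated numbers with X and Y as the first two fields, and the log is split into fixed-size sequences matching the number of probes per direction. Both function names are hypothetical.

```python
import math


def read_probe_log(text):
    """Parse probe log text into a list of (x, y) tuples.

    Assumes whitespace-separated numeric fields with X and Y first;
    lines with fewer than two fields are skipped.
    """
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.append((float(fields[0]), float(fields[1])))
    return points


def drift_from_first(points, probes_per_sequence):
    """Distance of every probe from the first probe of its sequence."""
    drifts = []
    for start in range(0, len(points), probes_per_sequence):
        sequence = points[start:start + probes_per_sequence]
        x0, y0 = sequence[0]
        drifts.append([math.hypot(x - x0, y - y0) for x, y in sequence])
    return drifts


if __name__ == "__main__":
    with open("RawProbeLog.txt") as log:
        points = read_probe_log(log.read())
    for series in drift_from_first(points, 30):
        print(series)
```

`math.hypot` gives an unsigned distance. To keep the direction of the drift, subtract the first value along the probed axis instead.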
The first set of results (at 64 mm/min) shows that the drift is still present even at this rate. The line graph provides a much clearer picture than the height map image does. The movement in the X negative direction is a little concerning and will require a bit more investigation. The wobbles in the data are most likely due to the probe trigger latency (which is what I was setting out to measure in the first place).
The next set of data (at 32 mm/min) is consistent with the values I saw on the Z axis - a very pronounced error accumulation over the series. The lines are not symmetrical around the X axis because the starting point was not exactly in the center between the probe targets.
All of this data points to a definite problem with the controller - the errors are much the same across all three axes, don't seem to be related to the travel direction and are correlated with the feed rate. The control board I am using is known to have issues with noisy stepping signals and timing (and thankfully there are a number of fixes that can be applied). It looks like my next step is to break out the oscilloscope and start measuring that aspect.
Collecting this data has taken a long time but at least now I have a set of procedures that I can use to replicate the process and a baseline set of data to measure improvements against.