Richard, thank you. My problem is that I suck at ASM and I don't understand half of what's going on there. It might very well work, in which case I'd be lucky, but if it doesn't, or if I need to modify it, I basically have no idea what I'm doing. I should use it as a learning exercise, though, as I do feel the need to better understand ASM.

tumblweed, wow, that was an easy yet effective modification (and it works perfectly fine) - just what I was hoping for.
It went down from 7.6ms to just above 3ms for an 81-matrix-wide display. More than double the speed, cool!

One thing threw me off on a tangent for a while...
Code:
OFFSET_INCR CON NumberOfDisplays + 2
Since NumberOfDisplays is already a CONstant (with the value 81 in this particular case), I replaced
Code:
Offset = Offset + OFFSET_INCR
with
Code:
Offset = Offset + NumberOfDisplays + 2
expecting the compiler to reduce both versions to
Code:
Offset = Offset + 83
It did not... Your version, with the constant defined, executes in 3.03ms; the other one takes 3.45ms.
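
A minimal side-by-side sketch of the two forms, for reference. The WORD size of Offset is an assumption, and so is the explanation in the comments: the timing difference suggests the in-line expression is evaluated left to right at run time rather than being folded into a single constant, but that has not been checked against the generated ASM.
Code:
NumberOfDisplays CON 81
OFFSET_INCR CON NumberOfDisplays + 2        ' worked out once, at compile time (83)
Offset VAR WORD                             ' assumed size, for illustration only
Offset = Offset + OFFSET_INCR               ' one addition at run time (measured 3.03ms)
Offset = Offset + NumberOfDisplays + 2      ' presumably (Offset + 81) + 2, two additions at run time (measured 3.45ms)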

It took 15 minutes for the penny to drop, and now I need to go through my code and see where else I've made that silly mistake.

Thanks again, both of you guys!

/Henrik.