Well, to get true linear accel, you either need a long lookup table or a complicated calculation--just as you've already worked out.

I've tried a calculation (look on embedded.com for a few acceleration formulas), and it takes up to 105uS to do it in Proton, for each pass through the calc (on a 40MHz 18F). PBP would be similar, or maybe a couple uS slower. Either way, you are limited to about 9khz for a single axis or 4.5 for two, but you'll have to calculate them out of the interrupt so they are prepped for use inside it. (You will also have to deal with context saving, which you short-cut on your posted code.) I've considered buffering the calculations in an array so when the motors are going slow you can calculate a lot of values so when it is going fast you have some done ahead of time. You'd be limited by RAM, though.

Since I was more concerned about simplicity and overall travel than smooth motion, I did it a little different. I used a delay scale, then by trial & error came up with a few Case statements that shifted it left (increasing the delay offset) or right (reducing the offset) depending on the current speed. I just tuned it by ear (literally), listening for the motor accel to sound smooth. The problem with adding/subtracting a constant to the time delay is that at higher speeds the change is drastic but at slower speeds the same change is hardly noticeable. That's why I did the select/case to shift it. You end up with stepped acceleration, but it sounded smooth and worked for me.

I wouldn't do it for cnc, though. For CNC you need one of the two previously mentioned accel methods.