Looking at the library code, if the clock is defined as faster than 16MHz (or 24MHz in some cases), it uses a longer internal delay before asserting some of the control pins.
Try this - lie to the compiler and tell it you're running faster than you really are:
DEFINE OSC 32 'tell the compiler 32MHz (actual clock stays at 16MHz)

Don't change your OSCCON setting... leave it so you're still actually running at 16MHz.
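
One side effect to keep in mind, assuming the software timing commands are calibrated from DEFINE OSC the way I'd expect: with the compiler told 32MHz and the chip really running at 16MHz, delays such as PAUSE should come out about twice as long as written. A rough sketch of what that looks like (the 500 is just an illustrative number, and I haven't checked this on your particular chip):

DEFINE OSC 32 'compiler thinks 32MHz, chip is really at 16MHz
'OSCCON stays at whatever you already use for 16MHz
PAUSE 500 'compiled for 32MHz, so expect roughly 1000ms in practice

If that holds, halve any PAUSE values you care about to get the timing you actually want.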