You say that the pulse duration should be as short as possible. Do you really require a 50% duty cycle? What are you doing with the pulses?

This is (as we're discussing in the other thread) a classic read-modify-write trap. Since you have LAT registers available on that device, you should use those instead of PORT (it makes no difference to the speed, though).
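
To illustrate why LAT avoids the trap, here is a rough host-side model in plain C. It is not real PIC code: port_model_t, settle_pins and the two set_bits_* helpers are names I made up for the sketch, and the "stuck low" pin stands in for something like a capacitive load that has not charged yet when the next instruction reads the port.

```c
#include <stdint.h>

/* Hypothetical model of one 8-bit I/O port; nothing here touches hardware. */
typedef struct {
    uint8_t lat;   /* output latch: what the firmware last wrote */
    uint8_t pins;  /* levels actually read back from the pins    */
} port_model_t;

/* Pins follow the latch, except those in stuck_low_mask, which still
 * read 0 (assumed here to model a slow or heavily loaded pin). */
static void settle_pins(port_model_t *p, uint8_t stuck_low_mask)
{
    p->pins = (uint8_t)(p->lat & (uint8_t)~stuck_low_mask);
}

/* Models "PORTx |= mask": the read comes from the pins, the write goes
 * to the latch, so a pin that reads low gets its latch bit cleared. */
static void set_bits_via_port(port_model_t *p, uint8_t mask)
{
    p->lat = (uint8_t)(p->pins | mask);
}

/* Models "LATx |= mask": the read comes from the latch itself, so the
 * other output bits keep the values that were written to them. */
static void set_bits_via_lat(port_model_t *p, uint8_t mask)
{
    p->lat = (uint8_t)(p->lat | mask);
}
```

In this model, setting bit 0 and then bit 1 through the PORT helper while pin 0 still reads low leaves the latch at 0x02 (bit 0 is silently lost), whereas the same sequence through the LAT helper leaves it at 0x03.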

You might be able to use one of the peripherals in the PIC to generate the pulses in hardware, feed them back into a counter, and use a CCP module to stop the output at the correct count.

/Henrik.