It seems that compared to the 25K20, the 887 executes fewer instructions...? Is it possible?
Apparently....
First of all, as pedja pointed out, the number of instructions will vary slightly depending on which commands the actual code uses. DT-Ints saves a number of system variables by default, plus some more if they are defined. DT-Ints 14 seems to allocate 41 bytes for storing context (including W, STATUS, PCLATH and FSR), while DT-Ints 18 allocates 68 bytes.
Of the 68 bytes allocated, 22 are never used if LONGs are NOT enabled. So if I'm not mistaken, the worst case for 14-bit devices is 41 bytes and for 16-bit devices (without LONGs) it's 68 - 22 = 46 bytes, which means the 887 simply has less context to save and restore. So yeah... I don't think there's anything wrong with your test.
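To make the byte count explicit, here is a minimal sketch of the arithmetic. The three figures are the ones quoted in this post. The constant and function names are made up for illustration and are not DT-Ints symbols.

```python
# Context-save sizes as quoted in this post (bytes). Names are hypothetical.
DT_INTS_14_BYTES = 41   # DT-Ints 14, 14-bit devices such as the 887
DT_INTS_18_BYTES = 68   # DT-Ints 18, total allocated on 16-bit (18F) devices
LONG_ONLY_BYTES = 22    # part of the 68 that is unused when LONGs are disabled


def bytes_in_use(allocated, unused=0):
    """Return how many of the allocated context bytes are actually used."""
    return allocated - unused


pic18_no_longs = bytes_in_use(DT_INTS_18_BYTES, LONG_ONLY_BYTES)
difference = pic18_no_longs - DT_INTS_14_BYTES

print("18F without LONGs:", pic18_no_longs, "bytes")
print("Extra bytes compared to the 14-bit version:", difference)
```

Keep in mind that this only counts bytes. The number of instructions needed to move each byte can still differ between the two cores.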

What surprised me a bit, though, is that on the 887 the entry is slower than the exit, while on the 18F it's the other way around. Also, to be truly accurate you really need to count instructions in the listing, since a measurement like this can be off by a couple of instructions. But visualizing it like this is easier :-)
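As a rough illustration of why a measurement can be off by a couple of instructions, here is a small sketch that converts a measured time into instruction cycles. It assumes the measurement is a time interval (for example from a scope), and it relies on the fact that these PICs run one instruction cycle per four oscillator periods. The oscillator frequency, the measured time and the resolution below are made-up example values, not figures from the test.

```python
def instruction_cycles(elapsed_us, fosc_mhz):
    """Convert a time in microseconds to instruction cycles (Fosc/4 per cycle)."""
    return elapsed_us * fosc_mhz / 4


def count_uncertainty(resolution_us, fosc_mhz):
    """Instruction cycles that fit inside one step of measurement resolution."""
    return resolution_us * fosc_mhz / 4


# Hypothetical example values, NOT taken from the actual test.
FOSC_MHZ = 8.0        # assumed oscillator frequency
MEASURED_US = 20.0    # assumed measured entry time
RESOLUTION_US = 0.5   # assumed resolution of the measurement

cycles = instruction_cycles(MEASURED_US, FOSC_MHZ)
uncertainty = count_uncertainty(RESOLUTION_US, FOSC_MHZ)

print("Measured cycles:", cycles, "give or take", uncertainty)
```

Counting the instructions in the listing avoids this uncertainty entirely, at the cost of more work.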

/Henrik.