What's the interrupt latency and/or overhead when using DT-Ints?



HenrikOlsson
- 18th August 2016, 20:38
With this post I'd like to visually demonstrate what kind of latency and overhead to expect when using DT-Ints. I'd like to point out that I am in no way criticising DT-Ints. They are wonderful to work with and I could not imagine PBP without them, but when you need to interrupt at high speed they are not that great, and I wanted to put some numbers on it.

To measure this I wrote a very simple piece of code, all it does is toggle an output as fast as it can. Then a periodic timer interrupt was introduced and in the actual interrupt service routine a short pulse on another output was created. Here's what that looks like on a scope:

[Attachment 8292: scope screenshot]

The top (blue) trace shows the output normally toggling at a fast rate. The bottom (yellow) trace shows the pulse generated by the code in the interrupt service routine. If we zoom in a little bit we can put some cursors on the screen and measure the latency:

[Attachment 8291: scope screenshot]

As soon as the interrupt trips, the main code stops executing (the output stops toggling), but OUR code in the interrupt service routine (which is what generates the pulse on the yellow trace) doesn't start to execute until DT-Ints has had time to store a copy of the system variables, which in this case takes around 7us. This PIC is running at 64MHz (16 million instruction cycles per second), which means the latency on entry is roughly 112 instruction cycles.

As soon as the falling edge of the pulse occurs we're done in the ISR, and DT-Ints now has to restore all the system variables, clear the interrupt flag and so on, which takes slightly longer than on entry, around 8us or 128 instruction cycles:

[Attachment 8290: scope screenshot]

So, even though we're only spending 5us in the actual interrupt service routine, the main routine is suspended for almost 21us - at 64MHz clock frequency. Even if no code at all was actually executed in the ISR, the save and restore process that DT-Ints needs to perform takes, at the very least, somewhere around 240 instruction cycles. Now, 240 instruction cycles at 64MHz is "only" 15us, which may not seem like much, but if you're trying to interrupt at, let's say, 25kHz you only have 40us between each interrupt. Spending 15 of those traveling to and from the interrupt plus, let's say, 5us in the ISR leaves only half of the time for the main code, which makes the 64MHz PIC "feel" like a 32MHz PIC. Any software timed commands in the main code (like PAUSE, SERIN, PULSIN etc) will be off by a factor of 2.
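
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes the usual PIC relationship of one instruction cycle per four oscillator clocks.

```python
# Sanity check of the figures quoted above.
# Function names are hypothetical, chosen for this illustration only.

def instruction_cycles(microseconds, fosc_mhz):
    """Instruction cycles elapsed in the given time (one cycle = 4 clocks)."""
    return microseconds * fosc_mhz / 4


def busy_fraction(overhead_us, isr_us, interrupt_hz):
    """Fraction of each interrupt period NOT available to the main loop."""
    period_us = 1_000_000 / interrupt_hz
    return (overhead_us + isr_us) / period_us


entry_cycles = instruction_cycles(7, 64)       # 7us on entry
exit_cycles = instruction_cycles(8, 64)        # 8us on exit
overhead_us = 240 * 4 / 64                     # 240 cycles back to microseconds
lost = busy_fraction(overhead_us, 5, 25_000)   # 25kHz interrupt, 5us in the ISR

print(entry_cycles, exit_cycles, overhead_us, lost)
```

With 15us of overhead and 5us in the ISR out of every 40us, half of the processor time is gone, which is where the "feels like a 32MHz PIC" remark comes from.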

The above tests were done on an 18F25K20 running at 64MHz, using PBP 3.0.8.0 with support for LONGs disabled. With LONGs enabled the 15us overhead increases to roughly 18us or around 290 instruction cycles.

Here's the code I used:

DEFINE OSC 64
DEFINE LOADER_USED 1 ' We're using a bootloader.

INCLUDE "DT_INTS-18.bas"
INCLUDE "ReEnterPBP-18.bas"

ASM
INT_LIST  macro    ; IntSource,  Label,     Type, ResetFlag?
        INT_Handler   TMR0_INT,  _TimezUp,  PBP,  yes
    endm
    INT_CREATE                 ; Creates the interrupt processor
ENDASM

ANSEL = 0
ANSELH = 0
TRISB = %11111100 ' PortB.0 and PortB.1 are outputs.

' Configure TMR0 to roll over at a rate of 31250Hz (every 32us)
T0CON=%11000000 ' TMR0 on, 8bit mode, 1:2 prescaler (T0PS<2:0> = 000; %010 would select 1:8)

@ INT_ENABLE TMR0_INT ; Enable Timer0 interrupts

Main:
LATB.1 = !LATB.1 ' Toggle PortB.1 as fast as we can
Goto Main

TimezUp:
' Create a short pulse on PortB.0
LATB.0 = 1
PAUSEUS 5
LATB.0 = 0
@ INT_RETURN
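
As a rough cross-check of the timer comment in the listing, the rollover rate can be worked out as below. This is a sketch under stated assumptions: Timer0 in 8-bit mode, clocked from Fosc/4, free-running with no reload, and the 1:2 prescaler that the comment names. The function name is made up for illustration, and the 21us figure is the suspension time measured earlier in the post.

```python
# Timer0 rollover rate for the test program (assumptions listed above).

def tmr0_rollover_hz(fosc_hz, prescaler, counts=256):
    """Rollover rate of a free-running timer clocked from Fosc/4."""
    return fosc_hz / 4 / prescaler / counts


rate_hz = tmr0_rollover_hz(64_000_000, 2)
period_us = 1_000_000 / rate_hz

# Share of each period left for the main loop when it is
# suspended for about 21us per interrupt (measured value from the post).
main_share = 1 - 21 / period_us

print(rate_hz, period_us, main_share)
```

At this interrupt rate the main loop gets only about a third of the processor time, which is why the toggling output on the scope spends so much of its time frozen.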

/Henrik.

tumbleweed
- 18th August 2016, 22:48
That's good info Henrik. Saving the PBP system vars takes a lot of time!

Do you have the same measurement done with an ASM-type interrupt handler?

pedja089
- 18th August 2016, 23:59
As far as I know, if you have more than one INT_Handler, it takes more time to work out which label to jump to, even if only one interrupt occurred. The further down the list a handler is, the longer it takes to jump to its label.
An ASM-type handler is faster "only" by the save and restore time, which is not small at all.
From the ReEnterPBP-18 file we can see that it needs to save and restore a maximum of 34 words (it can be fewer, depending on conditional assembly). The variables are grouped, so only one banksel instruction is needed.
To save or restore 34 words (68 bytes) we need 136 instructions (one to load to W and one to load to F). I didn't check whether PBP uses MOVFF, so maybe there is room for improvement.
So worst case, just saving and restoring the PBP variables takes about 280 instructions (272 movX instructions, +2 calls, +2 returns, +2 banksel).
Henrik's test program didn't have any IF...THEN, so none of the T variables were used, which is why it was faster.
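
The worst-case count above can be tallied step by step. This is just the arithmetic from the post written out in Python. The constant names are made up for illustration, and the two-instructions-per-byte figure is pedja's stated assumption, not something checked against the generated listing.

```python
# Worst-case save/restore estimate from the post, written out.

WORDS_SAVED = 34           # maximum from ReEnterPBP-18, per the post
BYTES_SAVED = WORDS_SAVED * 2
INSTR_PER_BYTE = 2         # one move into W, one move out to the file register

one_way = BYTES_SAVED * INSTR_PER_BYTE   # save OR restore
both_ways = one_way * 2                  # save AND restore
housekeeping = 2 + 2 + 2                 # calls, returns, banksel
total = both_ways + housekeeping

# Time at 64MHz, assuming one instruction cycle per instruction.
total_us = total * 4 / 64

print(one_way, both_ways, total, total_us)
```

The tally comes to 278 instructions, which the post rounds to 280. That is roughly 17.4us at 64MHz, the same order as the 18us measured with LONGs enabled.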

HenrikOlsson
- 19th August 2016, 22:20
I was going to say that the actual overhead can and will be different depending on what commands your program is using but compared to what's already there we're not talking about much. If it's really critical then my suggestion is to either measure it yourself or use the worst case number, which Pedja says is 280 instructions when not using LONGs.

As for declaring the interrupt type as ASM while still using DT-Ints, the total (measured) overhead is around 7.6us, a difference of roughly 8us or 128 instruction cycles. But if you're going to write the interrupt handler in assembly then you don't really need DT-Ints anyway, and the overhead will become very small.

Finally, the measurements shown here are all approximate. Counting instruction cycles is always more accurate; this was just meant to visually show what's going on and put some sort of number on it for people who may be wondering.

/Henrik.

Art
- 20th August 2016, 12:06
Nice :) Mostly, knowing the time you lost is important, rather than the fact you lost the instruction time itself,
and all you really need to do is measure it. I think everyone should trim down anything like that.
If you were only using DT Elapsed timer, and needed the second tick without counting Hours, Mins, Secs,
you might as well remove the checks and get the instruction time back as any given project develops.

Ioannis
- 22nd August 2016, 12:33
Nice job Henrik.

It would be very interesting to see the timing on a modest PIC like F886 or similar. Will try this when in lab.

Ioannis

HenrikOlsson
- 22nd August 2016, 13:13
Please do and post the results.
It's going to be slightly worse on an old 14bit device since they don't have automatic context save/restore like the 18F devices do for high priority interrupts.

/Henrik.

Ioannis
- 22nd August 2016, 14:32
Exactly because of that I am sure it will be much worse. Soon...

Ioannis

Ioannis
- 28th August 2016, 00:25
Finally I did the test on a 16F887 at 8MHz and came up with about 49us to enter the ISR and around 45us to exit.

[Attachment 8296: scope screenshot]

The code I used is this:



wsave var byte $70 system
wsave1 var byte $A0 system
wsave2 var byte $120 system
wsave3 var byte $1A0 system


INCLUDE "DT_INTS-14.bas"
INCLUDE "ReEnterPBP.bas"

ASM
INT_LIST  macro    ; IntSource,  Label,     Type, ResetFlag?
        INT_Handler   TMR0_INT,  _TimezUp,  PBP,  yes
    endm
    INT_CREATE                 ; Creates the interrupt processor
ENDASM

TRISB = %00000000 ' All PortB pins are outputs.

@ INT_ENABLE TMR0_INT ; Enable Timer0 interrupts

Main:
portB.4 = !portb.4 ' Toggle PortB.4 as fast as we can
Goto Main

TimezUp:
' Create a short pulse on PortB.5
portb.5 = 1
PAUSEUS 15
portB.5 = 0
@ INT_RETURN



It seems that, compared to the 25K20, the 887 executes fewer instructions...? Is that possible?

Ioannis

HenrikOlsson
- 30th August 2016, 16:31
Quote (Ioannis): "It seems that, compared to the 25K20, the 887 executes fewer instructions...? Is that possible?"
Apparently....
First of all, as pedja pointed out, the number of instructions will vary slightly depending on what commands are being used by the actual code. DT-Ints saves many variables by default and some more if they're being defined. DT-Ints 14 seems to allocate 41 bytes for storing context (including W, STATUS, PCLATH and FSR) while DT-Ints 18 allocates 68 bytes.
Out of the 68 bytes allocated, 22 are never used if LONGs are NOT enabled. So if I'm not mistaken, worst case for 14bit devices is 41 bytes and for 16bit devices (without LONGs) it's 46 bytes, so yeah... I don't think there's anything wrong with your test.
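
To put the two measurements side by side, the scope times can be converted to instruction cycles. This is a sketch using only figures quoted in this thread. The function name is made up for illustration, and the cycles-per-byte ratio is only a crude indicator, since the overhead covers more than copying bytes.

```python
# Compare the 16F887 and 18F25K20 measurements in instruction cycles.

def instruction_cycles(microseconds, fosc_mhz):
    """Instruction cycles elapsed in the given time (one cycle = 4 clocks)."""
    return microseconds * fosc_mhz / 4


# 16F887 at 8MHz: about 49us in, 45us out (Ioannis' measurement)
pic16_cycles = instruction_cycles(49, 8) + instruction_cycles(45, 8)

# 18F25K20 at 64MHz: about 7us in, 8us out (Henrik's measurement)
pic18_cycles = instruction_cycles(7, 64) + instruction_cycles(8, 64)

# Context bytes quoted above: 41 for DT-Ints 14, 68 - 22 for DT-Ints 18 without LONGs
pic16_bytes = 41
pic18_bytes = 68 - 22

print(pic16_cycles, pic18_cycles)
print(round(pic16_cycles / pic16_bytes, 2), round(pic18_cycles / pic18_bytes, 2))
```

So the 887 does get through the save and restore in fewer instruction cycles (about 188 against 240), in line with the smaller context it has to store. It is only slower in wall-clock time because of the 8MHz clock.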

What surprised me a bit though is that the entry is slower than the exit, while on the 18F it's the other way around. Also, in order to be truly accurate you really need to count instructions in the listing, since measuring it like this can be off by a couple of instructions. But visualizing it like this is easier :-)

/Henrik.

Ioannis
- 30th August 2016, 16:43
I tested and measured it many times and it was consistent exactly as you observed. Entry always takes longer.

Ioannis