PIC Math - the need for speed


  1. #1
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604

    Hi,
    First post here for me, though I've been on the pBasic-l list for quite some time. Anyway, I'd love some input on the following.

    Basically, what I'm doing is a 'realtime pulse stretcher'. As soon as an input goes low I set an output high and start TMR1. When the input goes high I stop TMR1, get the count, add 30% to it and then delay that amount of time before I turn the output off. So I tried this:

    NewPulseTime = OldPulseTime / 100
    NewPulseTime = NewPulseTime * 130
    Delay = NewPulseTime - OldPulseTime

    That works but it takes too long (I measured it at about 600uS). So I tried this instead:

    NewValue = Value >> 6 'Divide by 64.
    NewValue = NewValue * 83 'Multiply by 83. 83/64 = 1.297, close enough
    Delay = NewValue - Value 'Calculate the delay.

    That works too and is good enough, but it seems to take almost 320uS to run.
    Does anyone have any ideas on how to speed this up, other than running the PIC faster?
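
To see how the two integer approaches above behave, here is a minimal sketch in Python that models the 16-bit arithmetic on a PC (it is not PIC code). The function names are made up for illustration, and the sample inputs assume TMR1 counts 1uS ticks, which the post does not state.

```python
WORD_MASK = 0xFFFF  # PBP WORD variables are 16 bits wide

def delay_div100(old_pulse_time):
    """First attempt: divide by 100, then multiply by 130."""
    new_pulse_time = (old_pulse_time // 100) & WORD_MASK
    new_pulse_time = (new_pulse_time * 130) & WORD_MASK
    return (new_pulse_time - old_pulse_time) & WORD_MASK

def delay_shift6(value):
    """Second attempt: shift right by 6 (divide by 64), then multiply by 83."""
    new_value = (value >> 6) & WORD_MASK
    new_value = (new_value * 83) & WORD_MASK
    return (new_value - value) & WORD_MASK

# Print the delay each method yields for a few tick counts
for ticks in (500, 1000, 20000, 50000):
    print(ticks, delay_div100(ticks), delay_shift6(ticks))
```

Because the division happens before the multiplication, the low bits of the count are discarded first. The result therefore moves in steps of 130 (or 83) ticks, and under these assumptions a 500-tick pulse gets a delay of only 81 ticks from the shift version, well short of 30%.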

    Thanks,
    /Henrik Olsson.

  2. #2
    Join Date
    Sep 2003
    Location
    Vermont
    Posts
    373


    The most important thing you omitted is how long the pulse is that you are measuring. 320 uS may be inconsequential for a 10 second pulse.
    If timing is critical, then your best bet is to use an assembly interrupt that constantly looks at the input pin and calculates the stretch amount accordingly. If your processor is dedicated to the pulse stretching function, then only the number crunching needs to be in assembly. If you have a limited scope of pulse width, you could create a lookup table to return the delay, but the accuracy diminishes as the number of possible input values increases. How close to 30% of the value does the output pulse need to stretch?
    A lookup table might take 12 cycles to implement. This would be your fastest route without bumping the clock up.
    It might be easier to do it in hardware. With the appropriate RC value on an unused pin, when the pulse comes in, you turn on your output pin and RC pin to start charging the capacitor. When the pulse stops, turn the pin attached to the RC network into an input, and test it to see when it becomes a low to the processor. The longer time it has to charge, the longer it takes to discharge to a "LOW" value. As long as you have the right values for your RC charge and discharge components, this should be quite accurate.

    Hope this helps,
    Ron

  3. #3
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Thanks Ron,

    The incoming pulses are between 0.5 and 50ms and I need 0.1ms resolution. The frequency varies from 4 to 50Hz; the pulse length won't be more than 20ms at 50Hz, of course.

    Right now I'm just looping tight, waiting for the pulse to come, and then I set my output. That short delay is no problem. I then start TMR1 and go back to looping, waiting for the pulse to end, then grab the TMR1 value and do the math. It's the math that takes a little longer than I'd like. Right now it won't work correctly for pulses shorter than ~1.1ms.

    The 30% is not critical; it can be 28% or 32%, but it needs to be consistent over the whole range, i.e. not 28% for a 2ms pulse and 32% for a 5ms pulse.

    If I used a lookup table, wouldn't that need 495 entries ((50 - 0.5) * 10)? And wouldn't it take a different amount of time to find the answer depending on where in the table it's located? Would you mind filling me in on how it would work, if it's still a viable solution with the span and resolution I need?
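
For reference, a table indexed directly by the measured value takes the same time for every entry, because the value itself is the index and nothing is searched. A rough Python sketch of that idea follows. It assumes the pulse is measured in 0.1ms units (5 to 500) and uses made-up names; how the table would be stored on a real PIC is not shown.

```python
MIN_UNITS = 5     # 0.5 ms expressed in 0.1 ms units
MAX_UNITS = 500   # 50 ms expressed in 0.1 ms units

# One precomputed delay (30% of the pulse, rounded, in 0.1 ms units) per possible input.
DELAY_TABLE = [(units * 30 + 50) // 100 for units in range(MIN_UNITS, MAX_UNITS + 1)]

def lookup_delay(units):
    """Constant-time lookup: the measured value, offset by the minimum, is the table index."""
    return DELAY_TABLE[units - MIN_UNITS]

print(len(DELAY_TABLE))   # number of entries
print(lookup_delay(5), lookup_delay(100), lookup_delay(500))
```

Counting both end points, this gives 496 entries rather than 495.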

    I thought perhaps someone had an idea on how I could speed up the math even further than I managed with some secret tricks ;-)

    Thanks again!
    /Henrik Olsson.

  4. #4
    Join Date
    Sep 2003
    Location
    Vermont
    Posts
    373


    OK. Since you are using timer 1 anyway, how about this... It will need a little ML, but not much. When your input goes high, set your output high. Set timer1 with no prescaler, to interrupt on overflow, and put overflow - 50 in its register. When it overflows, the ML interrupt routine adds 15 to your pauseus variable, and starts it from overflow - 50 again. This goes on until your input goes low. Use pauseus with the variable you have been adding to, and when it runs out, flip your output low too. If you are running at 4 MHz, this will give you 50 uS resolution. The upper limit will be dictated by the 16 bit size of timer1. You should be able to do this up to about a 110 mS pulse length.
    You might be able to write it all in PBP, but at the lower pulse widths, it might not have great resolution.
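
The accumulate-per-tick idea can be sketched as a small Python simulation (not PIC code). It assumes a 50uS tick with 15uS added per tick, as described above; the function name is made up.

```python
TICK_US = 50          # Timer1 overflow period in microseconds
ADD_PER_TICK_US = 15  # 30% of one tick

def stretch_delay_us(pulse_us):
    """Simulate the interrupt adding 15 us for every full 50 us tick of the pulse."""
    delay_us = 0
    elapsed_us = TICK_US
    while elapsed_us <= pulse_us:   # one overflow interrupt per completed tick
        delay_us += ADD_PER_TICK_US
        elapsed_us += TICK_US
    return delay_us

for pulse in (500, 1020, 50000):
    print(pulse, stretch_delay_us(pulse))
```

No multiplication or division is needed when the pulse ends, since the delay has already been accumulated. The cost is that the delay moves in 15uS steps.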

  5. #5
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    Henrik,

    Try this ...

    Delay = OldPulseTime */ $004D

    That should take 48 instruction cycles, or 48uS @ 4MHz
    DT

  6. #6
    Join Date
    Sep 2003
    Location
    Vermont
    Posts
    373


    Quote Originally Posted by Darrel Taylor
    Henrik,

    Try this ...

    Delay = OldPulseTime */ $004D

    That should take 48 instruction cycles, or 48uS @ 4MHz

    OK Darrel,
    I give up. What are you doing here? It sure beats my solution in simplicity, but I am not getting anywhere near 1.3 * OldPulseTime. This equation returns the middle 16 bits of the product, correct?
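
As a sanity check on the arithmetic, `*/` can be modeled in Python as a 16x16 multiply that keeps the middle 16 bits of the 32-bit product, i.e. the product divided by 256. This is a PC-side model with a made-up function name, not PBP code.

```python
def star_slash(a, b):
    """Model of PBP's */ operator: middle 16 bits of the 32-bit product of two words."""
    product = (a & 0xFFFF) * (b & 0xFFFF)
    return (product >> 8) & 0xFFFF

old_pulse_time = 20000
delay = star_slash(old_pulse_time, 0x004D)   # 0x4D = 77, and 77/256 is about 0.30
print(delay)                    # 6015
print(delay / old_pulse_time)   # about 0.30
```

So with $004D the result is the 30% delay itself, not 1.3 times the pulse.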

  7. #7
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Ron and Darrel,
    Thank you for helping me with this!

    Darrel, let's see if I understand this correctly. The */ is like multiplying by fractions of 256, right? So */ $004D is like multiplying by 77/256, which is about 0.3. Yeah... I like it. I was hoping for something like this, but to be honest I hadn't (or still haven't, perhaps) grasped the concept of the */ operator.

    Ron, I like your idea too. I think I'll go that way in case Darrel's idea doesn't work. But after really trying to understand what the */ does, I think it will work.

    Thanks again guys!
    /Henrik Olsson.

  8. #8
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    You Got It!

    */ $004D - Just like you said, it multiplies by 77/256, or 0.30078125. It's not perfect, but it's close.

    And, it can work with bigger numbers too. For instance, to calculate NewPulseTime from your previous example, you would multiply by 1.30078125

    NewPulseTime = OldPulseTime */ $014D ; * 1.30078125

    Now you can see why it's easier to express the numbers in hex. Separate the high and low bytes: 01 4D. The 01 represents the whole-number part and 4D is the fractional part (in 256ths).

    */ $0280 = * 2.5
    */ $10C0 = * 16.75
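
A small Python helper (hypothetical, meant to be run on a PC) shows how such constants can be derived: multiply the desired factor by 256 and round to the nearest integer.

```python
def star_slash_constant(factor):
    """Return the 16-bit */ constant: high byte = whole part, low byte = fraction in 256ths."""
    constant = round(factor * 256)
    if not 0 <= constant <= 0xFFFF:
        raise ValueError("factor must be between 0 and just under 256")
    return constant

for factor in (0.3, 1.3, 2.5, 16.75):
    c = star_slash_constant(factor)
    print(f"{factor} -> ${c:04X} (exactly {c / 256})")
```

The rounding step is why 0.3 becomes 77/256 = 0.30078125 rather than exactly 0.3.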
    DT

  9. #9
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Darrel,
    Thanks again! How did you come to the conclusion it would take 48 cycles? By examining the .asm file? I'm asking because I just made a quick'n'dirty test like this:

    Value = 20000
    Start:
    GPIO.0 = 1 'Set output
    Delay = Value */ $004D
    GPIO.0 = 0 'Reset output
    PauseUs 700
    Goto Start

    But when scoping the output I get about 250uS. The pause is dead on 700uS, so I know the oscillator is working correctly. Now, 250uS is a lot better than my previous 'all time low' of 320uS, so at least I'm heading in the right direction. ;-)
    Any ideas?

    I really appreciate the help!
    /Henrik Olsson

  10. #10
    Join Date
    Jul 2003
    Location
    Sweden
    Posts
    237


    You could get 31.25% with just one addition and a couple of shift operations.
    Code:
    Dummy1          VAR WORD
    Delay           VAR WORD
    OldPulseTime    VAR WORD
    
    Dummy1 = OldPulseTime >> 2  'Divide by 4 (=0.25)
    Delay = Dummy1 >> 2         'Divide by 4 again(=0.0625)
    Delay = Delay + Dummy1      'Add them to get 0.3125
    I don't know how fast it will be, but it feels like it should be faster than 48 cycles. I also have a gut feeling that the following code will be even faster.
    Code:
    Dummy1          VAR WORD
    Delay           VAR WORD
    OldPulseTime    VAR WORD
    
    Dummy1 = OldPulseTime >> 1  'Divide by 2 (=0.5)
    Dummy1 = Dummy1 >> 1        'Divide by 2 (=0.25)
    Delay = Dummy1 >> 1         'Divide by 2 (=0.125)
    Delay = Delay >> 1          'Divide by 2 (=0.0625)
    Delay = Delay + Dummy1      'Add them to get 0.3125
    I can't run any of the code right now, but perhaps Darrel could measure these with one of his ingenious macros?
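
The shift-and-add arithmetic can be checked on a PC with a short Python model (the function name is made up, and 16-bit words are assumed):

```python
def delay_shift_add(old_pulse_time):
    """x/4 + x/16 = 0.3125 * x, using only shifts and one addition."""
    dummy1 = (old_pulse_time & 0xFFFF) >> 2   # divide by 4
    delay = dummy1 >> 2                        # divide by 4 again: x/16
    return (delay + dummy1) & 0xFFFF

for ticks in (500, 20000, 50000):
    print(ticks, delay_shift_add(ticks))
```

Both code variants above compute the same values, since four single-bit shifts give the same result as two 2-bit shifts. Only the generated instructions differ.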

  11. #11
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    I determined it was 48 cycles by using this procedure ...

    instruction execution time
    http://www.picbasic.co.uk/forum/showthread.php?t=365

    I've double-checked, and it's definitely 48. Although my hardware is different (18F452 @ 20MHz), it shouldn't make that much difference.

    For your test routine I get these results:
    Code:
    Value = 20000
    Start:                     ' Cycles   uS@20mhz
        PORTD.0 = 1            '    1         .2
        Delay = Value */ $004D '   48        9.6
        PORTD.0 = 0            '    1         .2
        PauseUs 700            ' 3501      700.2   
    Goto Start                 '    2         .4
                                 -----     -----
                  ' Loop Total   3553      710.6
    And, this I didn't expect. Here's the results from Ingvar's first example
    Code:
                                ' Cycles   uS@20mhz
    Dummy1 = OldPulseTime >> 2  '  31        6.2
    Delay = Dummy1 >> 2         '  31        6.2
    Delay = Delay + Dummy1      '   4         .8
                                 -----     -----
                  '      Total     66       13.2
    But, his second one looks pretty quick. As long as .3125 is close enough.
    Code:
                                ' Cycles   uS@20mhz
    Dummy1 = OldPulseTime >> 1  '   5        1.0
    Dummy1 = Dummy1 >> 1        '   3         .6
    Delay = Dummy1 >> 1         '   5        1.0
    Delay = Delay >> 1          '   3         .6
    Delay = Delay + Dummy1      '   4         .8
                                 -----     -----
                  '      Total     20        4.0
    Last edited by Darrel Taylor; 10th October 2005 at 21:46.
    DT

  12. #12
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Darrel, Ingvar,
    Hmm, this is interesting.
    I tried Ingvar's first code and measured it at 82uS. Then I tried the second one and measured it at 22uS, which is great! 31.25% works fine for me.

    I'm still wondering, though, why the numbers don't match. Ingvar's first code should take 66uS but I measure it at ~80uS. His second one should take 20 and it does, close enough. Then we have Darrel's 48-cycle code that takes ~250uS on my hardware. I just verified that again to make sure.

    I would think my scope was out of calibration if it weren't for the fact that the 700uS pause I have measures correctly on the scope.

    The application hardware is 12F629 but the testing above is on a 16F877 @ 4MHz.

    Thanks guys!
    /Henrik Olsson.

  13. #13
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    Apparently, your Scope works Really Well.

    I ran the numbers again for a 16F877 this time, and the results are rather surprising.
    Code:
    Start:                     ' Cycles   uS@20mhz
        PORTD.0 = 1            '    1         .2
        Delay = Value */ $004D '  243       48.6
        PORTD.0 = 0            '    1         .2
        PauseUs 700            ' 3502      700.4
    Goto Start                 '    4         .8
                                 -----     -----
               '    Loop Total   3751      750.2
               ' Without Pause    249
    Code:
                                ' Cycles   uS@20mhz
    Dummy1 = OldPulseTime >> 2  '  37        7.4
    Delay = Dummy1 >> 2         '  37        7.4
    Delay = Delay + Dummy1      '   6        1.2
                                 -----     -----
                  '      Total     80       16.0
    Code:
                                ' Cycles   uS@20mhz
    Dummy1 = OldPulseTime >> 1  '   5        1.0
    Dummy1 = Dummy1 >> 1        '   3         .6
    Delay = Dummy1 >> 1         '   5        1.0 
    Delay = Delay >> 1          '   3         .6
    Delay = Delay + Dummy1      '   6        1.2
                                 -----     -----
                  '      Total     22        4.4
    I really didn't expect that much difference between the same code compiled for the 2 different chips.
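
The cycle counts line up with the scope readings once the clock is taken into account: a PIC instruction cycle is four oscillator periods, so at 4MHz one cycle takes 1uS and at 20MHz it takes 0.2uS. A short Python sketch of the conversion (the function name is made up):

```python
def cycles_to_us(cycles, fosc_mhz):
    """Convert PIC instruction cycles to microseconds (one cycle = 4 oscillator periods)."""
    return cycles * 4 / fosc_mhz

# Cycle counts reported above for the 16F877
measurements = (
    ("*/ test, without pause", 249),
    ("first shift example", 80),
    ("second shift example", 22),
)
for label, cycles in measurements:
    print(label, cycles_to_us(cycles, 4), "uS at 4MHz,", cycles_to_us(cycles, 20), "uS at 20MHz")
```

At 4MHz this gives 249uS, 80uS and 22uS, which agrees with the roughly 250uS, 80uS and 22uS seen on the scope.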

    Way to go Ingvar.
    Last edited by Darrel Taylor; 10th October 2005 at 23:57.
    DT

  14. #14
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Darrel,
    Wow, thanks!
    I agree, that's a bit strange isn't it? It's gonna be interesting to see how fast it will run on the 12F629 where it actually will reside and operate when it's finished. That timing routine of yours seems to come in handy from time to time!

    Thanks again!
    /Henrik Olsson.

  15. #15
    Join Date
    Jul 2003
    Location
    Sweden
    Posts
    237


    Darrel,
    You forget that your 18F uses its hardware multiplier. The 16 series does not have one and needs to do a software multiply; that's a HUGE difference. I'm a little surprised that the ">>2" takes that much longer than ">>1". I expected around 10-20 cycles difference, not 60. It should take a little longer since it uses looped code rather than the straight code ">>1" produces. Strange, perhaps I'll look into it sometime ............ or not.
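
To illustrate why a software multiply costs so many cycles, here is a generic shift-and-add 16x16 multiply modeled in Python. It is a textbook algorithm, shown on the assumption that a software routine works roughly along these lines; it is not PBP's actual library code.

```python
def soft_multiply_16x16(a, b):
    """Shift-and-add multiply: one loop pass per multiplier bit, 16 passes in total."""
    a &= 0xFFFF
    b &= 0xFFFF
    product = 0
    passes = 0
    for bit in range(16):
        if b & (1 << bit):           # add the shifted multiplicand when this bit is set
            product += a << bit
        passes += 1
    return product & 0xFFFFFFFF, passes

product, passes = soft_multiply_16x16(20000, 0x004D)
print(product, passes)   # 1540000 16
```

Each pass needs several instructions on a chip without a hardware multiplier, whereas the 18F's 8x8 hardware multiplier (the MULWF instruction) produces a byte product in a single instruction cycle.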

    Henrik,
    You should make sure that the variables reside in the same memory bank. You do that by using the "BANK" modifier. If you don't, it will be slower, since the PIC needs to switch banks between each statement. Declare your variables like this:
    Code:
    Dummy1          VAR WORD BANK0
    Delay           VAR WORD BANK0
    OldPulseTime    VAR WORD BANK0
    /Ingvar

    P.S. Henrik, where should I send the invoice?

  16. #16
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    Right Ingvar,

    That's what it is, the Hardware Multiplier.

    The funny thing is that a couple of years ago, I did a comparison between 16F and 18F multiplication, and found that PBP didn't use the hardware MUL for 18F's. So, I made my own MUL routine for 18F's and have been using it ever since (when I need the speed). I hadn't looked since then, but, meLabs must have snuck it in there when I wasn't paying attention. Nothing listed in the Version History.

    That's good to know!

    P.S. Ingvar, Add some to that bill for me too.
    Last edited by Darrel Taylor; 13th October 2005 at 00:40.
    DT

  17. #17
    Join Date
    Jul 2003
    Location
    Sweden
    Posts
    237


    Sure, as soon as Henrik tells me where to send it.

  18. #18
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,604


    Ingvar,
    I thought you already had my billing address, since you helped me with that asm ISR a couple of years ago.....
    You're in Stockholm, right? Me and two friends are going there on Friday-Saturday to have some fun and visit the famous jazz-club Stampen. Come there and I'll buy you a beer or two! You too Darrel, but I guess it's a little too far off for you...

    Other things have come between me and this project this week, but I'll let you know how it all works out.
    I appreciate the help guys, I really do!

    /Henrik Olsson.

  19. #19
    Join Date
    Jul 2003
    Location
    Sweden
    Posts
    237


    Sorry Henrik, I live out my days on the west coast, Kungsbacka to be precise. Good luck with the rest of the project... and don't smoke anything in a jazz club, I hear it's not always tobacco. Not that I'd really know, since I'm not a jazz fan.
