Optimizing DIV



  1. #1
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959



    That's all stuff that can be done later if things work out.

    I think right now, to prove that it does "optimize" the divide routine, you need a "Base Line".
    How long does it take PBPL to do a single signed 32-bit divide?
    And how long does it take yours?

    Just a few tests at various bit sizes will tell if there's any benefit or not.
    DT

  2. #2
    skimask Guest

    Originally posted by Darrel Taylor:
        That's all stuff that can be done later if things work out.
        I think right now, to prove that it does "optimize" the divide routine, you need a "Base Line".
        How long does it take PBP to do a single signed 32-bit divide.
        And how long does it take yours.
        Just a few tests at various bit sizes will tell if there's any benefit or not.

    That's the thing...It depends on the bits in the divisor and the dividend.
    I'm sure you know how binary division works: set a result bit, shift everything over one, subtract; if there's a carry, add it back in, reset the result bit, and try again.
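
    For anyone following along, here is a minimal sketch of that shift-and-subtract loop (restoring division) in Python rather than PBP or PIC assembly. It is not the routine under test in this thread: the function name and the `bits` parameter are made up for illustration, it handles unsigned values only, and it simply refuses a zero divisor instead of reproducing whatever PBP does for x/0.

```python
def restoring_divide(dividend, divisor, bits=32):
    """Unsigned restoring division. Returns (quotient, remainder).

    Hypothetical helper for illustration only; not PBP's DIV routine.
    """
    mask = (1 << bits) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        # Assumption: reject x/0 (including 0/0) instead of guessing a result.
        raise ZeroDivisionError("divisor is zero")

    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Shift everything over one, bringing in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtract.
        remainder -= divisor
        if remainder < 0:
            # Borrow: add the divisor back in and leave the result bit clear.
            remainder += divisor
            quotient <<= 1
        else:
            # No borrow: keep the subtraction and set the result bit.
            quotient = (quotient << 1) | 1
    return quotient, remainder
```

    The loop always runs `bits` times no matter how small the operands are, which is the cost the rest of this thread is trying to trim.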
    Like I said, I used your little error-checker, and the only error I can get out of the last code posted was 0/0. Everything else checks out good (unless my implementation is wrong!).
    As it is, I changed up the code just a bit to only send out to the LCD once every 256 loops (if bl.byte0=0 then -do the lcd-). So, obviously, the LCD was taking up loads of cycles, which it does anyways.
    The optimizations as they stand right now (the right-shift version) do speed things up a little overall, similar to how a right shift saves time over a divide by 2. In some cases, though, they use a few more cycles.
    I know the 'left shift loop reduction' method on R0 and R1 has decent grounding. Just can't figure out how to implement it without wrecking the results!
    It might be easier to just use 4 different versions, where the version chosen is based on the larger of R0 and R1 (32, 24, 16 and 8 bit versions). Will use a bit more memory though...
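
    A rough sketch of the "pick a version by operand size" idea, again in Python and only as an illustration. The names `bits_needed` and `divide_by_width` are invented here, the operands are assumed to be unsigned and no wider than 32 bits, and the third return value counts loop passes, not PIC instruction cycles.

```python
def bits_needed(value):
    """Smallest of 8, 16, 24 or 32 bits that can hold value (hypothetical helper)."""
    for width in (8, 16, 24):
        if value < (1 << width):
            return width
    return 32


def divide_by_width(dividend, divisor):
    """Unsigned shift-and-subtract divide with a width-dependent loop count.

    Returns (quotient, remainder, loop_passes). Illustration only.
    """
    if divisor == 0:
        # Assumption: x/0 is rejected rather than given a made-up result.
        raise ZeroDivisionError("divisor is zero")

    # Choose the loop length from the larger of the two operands.
    passes = bits_needed(max(dividend, divisor))

    quotient = 0
    remainder = 0
    for i in range(passes - 1, -1, -1):
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder, passes
```

    Skipping the upper passes is safe because every dividend bit above the chosen width is zero, so those passes would only shift zeros into the remainder. What this sketch cannot show is the overhead of deciding which version to run, which on the PIC is paid on every divide.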

  3. #3

    What, are you going to make me do it?

    I will have exact numbers, good or bad.
    Do you want to be the first to know ... (a.k.a. you do it)

    Or do you give me a free Jab? Without a glove.
    DT

  4. #4

    Results are in ....

    I'm taping up my hand.

    Anything you want to change?
    DT

  5. #5
    skimask Guest

    Arg! I fell asleep in my chair thinking about it.
    I'm tellin' ya... I wrote a complement of 32-bit math routines for my 6809E back in the day on my CoCo2. There's probably one friggin thing I'm forgetting. And no, I haven't googled anything because I don't want to.
    I'll remember it eventually... Make that jab to the back of the head. Maybe that shot will knock it forward.

  6. #6
    skimask Guest

    Ok, getting somewhere now...
    Had a thought earlier of adding a few extra loops to break the '32-bit only' divide down into sections dealing with only 32-, 24-, 16- and 8-bit divides.
    It's working well in the short tests, sometimes knocking cycle counts down by more than 3/4, usually by about 1/2.
    Fairly sure I fixed the 0/0 bug also...that could've been a bit more obvious...but I don't see how.

  7. #7

    [Attached image: kapow.jpg]

    It appears that along with some new errors, ...
    you've added an average of 3 instruction cycles to the PBP DIV time.

    According to the theory, the most optimization would be gained when using small numbers. So 8-bit/8-bit should see the greatest effect.

    --Testing 8/8, Skip 0/0 this time--
    Test- A=1-255, B=0-255
    No ERRORs:
    SkiMIN=762 SkiMAX=1018
    PBPMIN=759 PBPMAX=1015
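
    The shape of a test like this (check every result, track the best and worst cost) can be sketched in Python. This is not SkiDiv5.pbp: all names are invented, A is assumed to be the divisor and B the dividend so that x/0 never occurs, and "cost" here is loop passes in a simplified model, not measured instruction cycles.

```python
def shift_subtract(dividend, divisor, passes):
    """Unsigned shift-and-subtract divide over the low `passes` bits."""
    quotient = remainder = 0
    for i in range(passes - 1, -1, -1):
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


def fixed_32(dividend, divisor):
    """Baseline stand-in: always 32 passes. Returns (result, cost)."""
    return shift_subtract(dividend, divisor, 32), 32


def by_width(dividend, divisor):
    """Candidate stand-in: 8, 16, 24 or 32 passes, chosen from the larger operand."""
    larger = max(dividend, divisor)
    passes = next((w for w in (8, 16, 24) if larger < (1 << w)), 32)
    return shift_subtract(dividend, divisor, passes), passes


def run_test(divide, divisors, dividends):
    """Run every divisor/dividend pair. Returns (errors, min_cost, max_cost)."""
    errors, lo, hi = 0, None, None
    for a in divisors:
        for b in dividends:
            result, cost = divide(b, a)
            # Python's divmod is the reference answer.
            if result != divmod(b, a):
                errors += 1
            lo = cost if lo is None else min(lo, cost)
            hi = cost if hi is None else max(hi, cost)
    return errors, lo, hi
```

    Calling `run_test(by_width, range(1, 256), range(0, 256))` covers the same A=1-255, B=0-255 range as above. In this model the candidate always looks cheaper because choosing the width costs nothing, which is exactly why the comparison has to be timed on the actual hardware, as the numbers above were.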


    This is the latest test program ... SkiDiv5.pbp

    These are the results at different OPT levels ...
    SKI_DIV_OPT_1
    SKI_DIV_OPT_2
    SKI_DIV_OPT_3
    SKI_DIV_OPT_0

    This test program uses the Timing methods described in Post #2.
    DT

  8. #8
    skimask Guest

    Originally posted by Darrel Taylor:
        It appears that along with some new errors, ...

    Hmmm... That code I posted last night at 22:17 worked good for me, as far as errors go. 0/0 was the only one I got and I found the source of that one. But not much for savings.

    Originally posted by Darrel Taylor:
        you've added an average of 3 instruction cycles to the PBP DIV time.
        According to the theory, the most optimization would be gained when using small numbers. So 8-bit/8-bit should see the greatest effect.

    That's what I'm getting also now that I've got a 'method' working.

    Last edited by skimask; 14th September 2008 at 03:52.
