32 bit square root


  1. #1
    Join Date
    Sep 2005
    Location
    Campbell, CA
    Posts
    1,107



    Thanks to all who replied. I have all the other routines done and working. Now I just have to speed up the SQR routine. I'll try them all and post my results.

    Thanks again.
    Charles Linquist

  2. #2
    Join Date
    Feb 2006
    Location
    Gilroy, CA
    Posts
    1,530


    Assembly code now working

    I compared your assembly code to the original Microchip TB040 to see if I could make it work. I noticed that the document uses a = 0, while in your code every place that a appeared is now 1. I am not very good at assembly, so I would like to know the difference between:

    Code:
            movwf BITLOC0, 1
    and

    Code:
            movwf BITLOC0, 0
    But after changing them all to 0, I was able to get both the 16 bit and 32 bit routines working. I did not try setting them all back to 1 to see if that made a difference, but I did accidentally leave in one "1", and it did not work until I made it a zero. I am very curious to know the speed differences of each, if you have done any tests.
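On the a = 0 versus a = 1 question: in PIC18 file-register instructions the last operand is the access bit. With a = 0 the instruction uses the Access Bank and ignores the bank select register (BSR); with a = 1 the BSR supplies the upper four bits of the address. The small Python model below is only an illustration of that address decoding. The function name and the `access_split` default of 0x60 are assumptions (the split point differs between PIC18 devices, so check the data sheet), and whether the BSR value is what broke the a = 1 version here is a guess, not something this thread confirms.

```python
def resolve_address(f, a, bsr=0, access_split=0x60):
    """Model which 12-bit data-memory address a PIC18 file-register
    instruction touches.

    f            -- 8-bit address field encoded in the instruction
    a            -- access bit: 0 = Access Bank, 1 = banked via BSR
    bsr          -- bank select register (only the low 4 bits are used)
    access_split -- first Access Bank address that maps to SFR space
                    (assumed 0x60 here; device dependent)
    """
    f &= 0xFF
    if a == 0:
        # Access Bank: BSR is ignored entirely.
        if f < access_split:
            return f              # low part: start of bank 0 (general RAM)
        return 0xF00 | f          # high part: special function registers
    # Banked access: BSR provides address bits 11..8.
    return ((bsr & 0x0F) << 8) | f


# The same instruction field can land in different places:
print(hex(resolve_address(0x20, a=0, bsr=3)))  # Access Bank, BSR ignored
print(hex(resolve_address(0x20, a=1, bsr=3)))  # whatever bank BSR selects
```

So `movwf BITLOC0, 0` always reaches the same location, while `movwf BITLOC0, 1` depends on what BSR holds when the routine runs.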

    Thanks,

    Walter


    Code:
    ASM
     
      
    Sqrt tstfsz ARGA3,0 ; determine if the number is 16-bit
            bra Sqrt32 ; or 32-bit and call the best function
            tstfsz ARGA2, 0
            bra Sqrt32
            clrf RES1, 0
            bra Sqrt16
    
    
    Sqrt16 clrf TEMP0, 0 ; clear the temp solution
            movlw 0x80 ; setup the first bit
            movwf BITLOC0, 0
            movwf RES0, 0
    Square8 movf RES0, W, 0 ; square the guess
            mulwf RES0, 0
            movf PRODL, W, 0 ; ARGA - PROD test
            subwf ARGA0, W, 0
            movf PRODH, W, 0
            subwfb ARGA1, W, 0
            btfsc STATUS, C, 0
            bra NextBit ; if positive then next bit
        ; if negative then rotate right
            movff TEMP0, RES0 ; move last good value back into RES0
            rrncf BITLOC0, F, 0 ; then rotate the bit and put it
            movf BITLOC0, W, 0 ; back into RES0
            iorwf RES0, F, 0
            btfsc BITLOC0, 7, 0 ; if last value was tested then get
            bra Done ; out
            bra Square8 ; else go back for another test
    NextBit movff RES0, TEMP0 ; copy the last good approximation
            rrncf BITLOC0, F, 0 ; rotate the bit location register
            movf BITLOC0, W, 0
            iorwf RES0, F, 0
            btfsc BITLOC0, 7, 0 ; if last value was tested then get
            bra Done ; out
            bra Square8
    Done movff TEMP0,RES0 ; put the final result in RES0
            bra TotallyDone
        
       
       
    Sqrt32 clrf TEMP0, 0 ; clear the temp solution
            clrf TEMP1, 0
            clrf BITLOC0, 0 ; setup the first bit
            clrf RES0, 0
            movlw 0x80
            movwf BITLOC1, 0 ; BitLoc = 0x8000
            movwf RES1, 0 ; RES = 0x8000
    Squar16 movff RES0, ARG1L ; square the guess
            movff RES1, ARG1H
            call Sq16
            movf SQRES0, W, 0 ; ARGA - PROD test
            subwf ARGA0, W, 0
            movf SQRES1, W, 0
            subwfb ARGA1, W, 0
            movf SQRES2, W, 0
            subwfb ARGA2, W, 0
            movf SQRES3, W, 0
            subwfb ARGA3, W, 0
            btfsc STATUS, C, 0
            bra NxtBt16 ; if positive then next bit
        ; if negative then rotate right
            addlw 0x00 ; clear carry
            movff TEMP0, RES0 ; move last good value back into RES0
            movff TEMP1, RES1
            rrcf BITLOC1, F, 0 ; then rotate the bit and put it
            rrcf BITLOC0, F, 0
            movf BITLOC1, W, 0 ; back into RES1:RES0
            iorwf RES1, F, 0
            movf BITLOC0, W, 0
            iorwf RES0, F, 0
            btfsc STATUS, C, 0 ; if last value was tested then get
            bra Done32 ; out
            bra Squar16 ; else go back for another test
    NxtBt16 addlw 0x00 ; clear carry
            movff RES0, TEMP0 ; copy the last good approximation
            movff RES1, TEMP1
            rrcf BITLOC1, F, 0 ; rotate the bit location register
            rrcf BITLOC0, F, 0
            movf BITLOC1, W, 0 ; and put back into RES1:RES0
            iorwf RES1, F, 0
            movf BITLOC0, W, 0
            iorwf RES0, F, 0
    
    
            btfsc STATUS, C, 0 ; if last value was tested then get
            bra Done32 ; out
            bra Squar16
    Done32 movff TEMP0,RES0 ; put the final result in RES1:RES0
            movff TEMP1,RES1
            bra TotallyDone
    
      
    Sq16 movf ARG1L, W, 0
            mulwf ARG1L, 0 ; ARG1L * ARG1L ->
        ; PRODH:PRODL
            movff PRODH, SQRES1 ;
            movff PRODL, SQRES0 ;
            movf ARG1H, W, 0
            mulwf ARG1H, 0 ; ARG1H * ARG1H ->
        ; PRODH:PRODL
            movff PRODH, SQRES3 ;
            movff PRODL, SQRES2 ;
            movf ARG1L, W, 0
            mulwf ARG1H, 0 ; ARG1L * ARG1H ->
        ; PRODH:PRODL
            movf PRODL, W, 0 ;
            addwf SQRES1, F, 0 ; Add cross
            movf PRODH, W, 0 ; products
            addwfc SQRES2, F, 0 ;
            clrf WREG, 0 ;
            addwfc SQRES3, F, 0 ;
            movf ARG1H, W, 0 ;
            mulwf ARG1L, 0 ; ARG1H * ARG1L ->
        ; PRODH:PRODL
            movf PRODL, W, 0 ;
            addwf SQRES1, F, 0 ; Add cross
            movf PRODH, W, 0 ; products
            addwfc SQRES2, F, 0 ;
            clrf WREG, 0 ;
            addwfc SQRES3, F, 0 ;
            return
    
    TotallyDone    
        
    ENDASM
    Last edited by ScaleRobotics; - 1st May 2009 at 17:58.
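For anyone following the algorithm rather than the opcodes, here is a rough Python sketch of what the routine above appears to do: try each result bit from the top down, square the guess, and keep the bit if the square does not exceed the argument. `sq16` mirrors the four 8x8 partial products that Sq16 builds with the hardware multiplier. The Python names are mine, not TB040's, and this is a model for checking values on a PC, not a replacement for the assembly.

```python
def sq16(x):
    """Square a 16-bit value from 8x8 partial products, like Sq16."""
    lo = x & 0xFF
    hi = (x >> 8) & 0xFF
    result = lo * lo              # goes to SQRES1:SQRES0
    result += (hi * hi) << 16     # goes to SQRES3:SQRES2
    cross = lo * hi               # cross product, added in twice
    result += cross << 8
    result += cross << 8
    return result & 0xFFFFFFFF


def isqrt_bitwise(n, bits=16):
    """Floor square root, testing result bits from MSB to LSB.

    bits=16 corresponds to the Sqrt32 path (32-bit argument),
    bits=8 to the Sqrt16 path (16-bit argument).
    """
    good = 0                      # TEMP: last guess known to be <= root
    bit = 1 << (bits - 1)         # BITLOC: bit currently being tested
    while bit:
        guess = good | bit        # RES: last good value plus trial bit
        if sq16(guess) <= n:      # "ARGA - PROD" did not go negative
            good = guess          # keep the bit
        bit >>= 1                 # move to the next lower bit
    return good


print(isqrt_bitwise(0x0001FFFF))      # 32-bit argument
print(isqrt_bitwise(1000, bits=8))    # 16-bit argument
```

The loop always runs once per result bit, so the 32-bit case costs 16 squarings no matter what the argument is.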

  3. #3
    Join Date
    Feb 2006
    Location
    Gilroy, CA
    Posts
    1,530



    I used LCDOUT instead of your serial routine though.

  4. #4
    Join Date
    Sep 2005
    Location
    Campbell, CA
    Posts
    1,107



    I'll try timing it Monday. I eventually got everything to work using the SQR routine in PBPL. I gained some extra time by not using the ADCIN command. Instead, I start a channel, then do the computation for the previous channel. Not having to wait on the A/D bought me enough time to get everything done. I'm doing true RMS on 4 channels simultaneously using an '8723 at 40MHz.
    Charles Linquist
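A rough sketch of the scheme described in the post above: read the finished conversion, start the next channel right away, then do the sum-of-squares work while that conversion runs. The `FakeAdc` class, the function names, and the sample values are made up for illustration; real code would start the converter and check the hardware's done flag instead, and the actual program in this thread is PBP, not Python.

```python
from math import isqrt


class FakeAdc:
    """Stand-in for the A/D hardware (hypothetical, for illustration)."""

    def __init__(self, values):
        self.values = values      # one fixed reading per channel
        self.pending = None

    def start(self, channel):
        self.pending = channel    # on hardware: select channel, set GO

    def result(self):
        return self.values[self.pending]


def scan_once(adc, sum_sq, n_channels=4):
    """One pass over all channels, overlapping conversion and math."""
    adc.start(0)
    for ch in range(n_channels):
        sample = adc.result()             # conversion for ch is done
        if ch + 1 < n_channels:
            adc.start(ch + 1)             # next conversion runs now...
        sum_sq[ch] += sample * sample     # ...while we do the arithmetic


def channel_rms(sum_sq, n_samples):
    """True RMS per channel: sqrt(mean of squares), integer math."""
    return [isqrt(total // n_samples) for total in sum_sq]


adc = FakeAdc([3, 4, 0, 10])
sums = [0, 0, 0, 0]
for _ in range(2):
    scan_once(adc, sums)
print(channel_rms(sums, 2))
```

The square root is only needed once per channel per reporting interval, but the sum of squares grows quickly, which is why a 32-bit SQR matters here.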

  5. #5
    Join Date
    Feb 2006
    Location
    Gilroy, CA
    Posts
    1,530


    Square Root 16 and 32 bit include file

    Quote: Originally posted by Charles Linquis
    I'll try timing it Monday.
    Thanks Charles,

    No worries, I was just curious whether PBPL was any slower than PBP with the square root assembly code. Since PBPL uses a bit more space, I was curious to know if it was faster or slower. It appears that a 20 MHz chip can do about 5 or 6 of these in 1 ms using the assembly include file.

    If anyone is interested, to make things easier, I have attached it as an include file. It can only be used on PIC18 chips, and according to TB040, must be modified for use with PIC17 devices that have a hardware multiplier. But if you do not have a version of PBP that includes PBPL, this will let you perform a 32 bit square root. And it is much smaller than compiling in PBPL.

    To use, load argh with the upper 16 bits, and argl with the lower 16 bits, then call square. Result will be in word variable RES.

    Code:
    INCLUDE "square.pbp"
    
    'some defines here
    main:
    'and a little bit of code there....
    
    ARGH = $0001       'load upper 16 bits into argument (any value you want)
    ARGL = $ffff       'load lower 16 bits into argument (any value you want)
    call square        'call square assembly function
    lcdout $FE,1,#RES  'print result to lcd
    Here are the results from codetimer.bas:
    Time: 84.66328 usec
    OSC Freq: 48 MHz
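A quick sanity check of the example call and the timing figure, as a Python sketch. It assumes the usual PIC18 rate of one instruction cycle per four oscillator clocks; the helper names are mine, and it does not assume the timed call used the same argument as the example.

```python
from math import isqrt


def split_words(value):
    """Split a 32-bit value into (ARGH, ARGL) 16-bit words."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def instruction_cycles(time_us, fosc_mhz):
    """Instruction cycles in time_us, at Fosc/4 cycles per microsecond."""
    return time_us * fosc_mhz / 4


def time_at(cycles, fosc_mhz):
    """Microseconds needed for the same cycle count at another clock."""
    return cycles * 4 / fosc_mhz


argh, argl = split_words(0x0001FFFF)
print(hex(argh), hex(argl))           # the two words loaded in the example
print(isqrt(0x0001FFFF))              # value RES should show for that input

cycles = round(instruction_cycles(84.66328, 48))
print(cycles)                         # cycle count behind the 48 MHz time
print(time_at(cycles, 20))            # same work on a 20 MHz part, in usec
```

Scaled to 20 MHz this comes out at roughly 200 usec per call, which lines up with the earlier estimate of about 5 per millisecond.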
    Attached Files
    Last edited by ScaleRobotics; - 5th February 2011 at 20:49. Reason: added codetimer speed at 48 mhz

  6. #6
    Join Date
    Feb 2006
    Location
    Gilroy, CA
    Posts
    1,530



    Well, I was wrong. PBPL won my crude speed comparison, and was 1.4 times faster than the above TB040 assembly code. I had thought that Microchip's TB040 code would be pretty optimized, so that is pretty impressive, MeLabs! Maybe I can stop being afraid to use PBPL now!
    http://www.scalerobotics.com

  7. #7
    Join Date
    Jul 2003
    Posts
    2,405



    You would flip if you knew some of the big manufacturers we've sold PBP to that have
    produced multi-million dollar gadgets with it.

    It's a lot more solid than some folks give it credit for.
    Regards,

    -Bruce
    tech at rentron.com
    http://www.rentron.com
