Optimizing DIV



  1. #1
    skimask Guest

    .................................................. ............
    EDIT: It'll fail above 24 bits...and if it's negative...Nothing in there shifting over the bytes!

    EDIT #2 : Replaced code in post #15
    Last edited by skimask; - 11th September 2008 at 05:43. Reason: Edit's abound...

  2. #2
    skimask Guest

    Ok, this looks good...er...I think...
    Code:
    '****************************************************************
    '*  Name    : Test_SkiDIV.pbp                                   *
    '*  Author  : Darrel Taylor                                     *
    '*  Date    : 9/9/2008                                          *
    '*  Version : 1.0                                               *
    '*  Thread  : instruction execution time                        *
    '*      http://www.picbasic.co.uk/forum/showthread.php?p=61992  *
    '****************************************************************
    ;****************************************************************
    ;* DIV        : 32 x 32 divide                                  *
    ;* Input      : R0 / R1                                         *
    ;* Output     : R0 = quotient                                   * 
    ;*            : R2 = remainder                                  *
    ;* Notes      : R2 = R0 MOD R1                                  *
    ;****************************************************************
    ;1755 bytes used
    DEFINE OSC 40
    DEFINE HSER_TXSTA 24h 'Hser transmit status init 
    DEFINE HSER_RCSTA 90h 'Hser receive status init 
    DEFINE HSER_BAUD 38400 'Hser baud rate 
    DEFINE HSER_CLROERR 1 'Hser clear overflow automatically 
    DEFINE SKI_DIV_OPT 1
    AL    VAR LONG
    BL    VAR LONG
    SkiQ  VAR LONG   ; ASM quotient
    SkiR  VAR LONG   ; ASM remainder
    PBPQ  VAR LONG   ; PBP quotient
    PBPR  VAR LONG   ; PBP remainder
    AW  VAR WORD
    BW  VAR WORD
    RW  VAR WORD
    ERROR VAR BIT
    For AL = 0 to 1000
        ERROR = 0
        For BL = 0 to 1000
            @ MOVE?NN  _AL, R0
            @ MOVE?NN  _BL, R1  ; AL / BL
            @ L?CALL   #DIVS	'throw in a little bit of clock counting?
            @ MOVE?ANN R0, _SkiQ
            @ MOVE?NN  R2, _SkiR
            PBPQ = AL / BL                   ; do same in PBP
            PBPR = AL // BL		'throw in a bit of clock counting here too?
            if SkiQ != PBPQ then 
                HSEROUT [" Quotient Error: A=",DEC AL,"  B=",DEC BL, _
                         "  Ski=",dec SkiQ,"  PBP=",DEC PBPQ,13,10]
                ERROR = 1
            endif
            if SkiR != PBPR then 
                HSEROUT ["Remainder Error: A=",DEC AL,"  B=",DEC BL, _
                         "  Ski=",dec SkiR,"  PBP=",DEC PBPR,13,10]
                ERROR = 1
            endif
        next BL
        if ERROR = 0 then HSEROUT ["A=",dec AL,"  No ERRORs",13,10]
    next AL
    stop
    ; ---- nothing has been changed below
    ASM
    	ifdef DIVS_USED
      LIST
    #DIVS
    	clrf	R3 + 3		; Clear sign difference indicator
    	btfss	R0 + 3, 7	; Check for R0 negative
    	bra	#divchkr1	; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip value to plus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    #divchkr1
    	btfss	R1 + 3, 7	; Check for R1 negative
    	bra	#divdo		; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R1		; Flip value to plus
    	subfwb	R1 + 1, F
    	subfwb	R1 + 2, F
    	subfwb	R1 + 3, F
    	bra	#divdo		; Skip unsigned entry
      NOLIST
    DIV_USED = 1
    	endif
    
    	ifdef DIV_USED
      LIST
    #DIV
    		ifdef DIVS_USED
    	clrf	R3 + 3		; Clear sign difference indicator	
    		endif
    #divdo
    	clrf	R2		; Do the divide
    	clrf	R2 + 1
    	clrf	R2 + 2
    	clrf	R2 + 3
    
    	movlw	32
    	movwf	R3
    
    ;added to speed up s-31 divide op's by ignoring zero'd bytes
            ifdef SKI_DIV_OPT
    		if ( SKI_DIV_OPT == 1 )
    SkiOpt
    	movf    R0 + 3, W	; IF R0.byte3 (low 7 bits) = 0 
    	bcf	W, 7		; Clear 'sign'
    	bnz	#divloop	; 
    	movf    R1 + 3, W	; AND R1.byte3 (low 7 bits)=0 then
    	bcf	W, 7		; Clear 'sign'
    	bnz     #divloop
    	;but only check Rx+3.7 the first time thru (1st time)
    	;after that, instead of a byte shift,
    	;shift everything << 7 (2nd time),
    	;then do 2 more byte shifts if possible
    
    Ski_Shift7
    	movlw	6
    	movwf	R3
    	
    	bcf	STATUS, C
    	rlcf	R0, F		;shift 7 times
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	rlcf	R0 + 3, F
    
    	rlcf	R1, F
    	rlcf	R1 + 1, F
    	rlcf	R1 + 2, F
    	rlcf	R1 + 3, F
    
    	movlw	1
    	subwf	R3, F
    	
    	movf	R3, W
    	btfss	STATUS, Z
    	bra	Ski_Shift7	
    
    	;check again after 7 shifts
    	movlw	25
    	movwf	R3
    
    Ski_Shift3
    	movf    R0 + 3, W	; IF R0.byte3 = 0 
    	bnz	#divloop	; 
    	movf    R1 + 3, W	; AND R1.byte3 = 0 then
    	bnz     #divloop
    
    	movff   R0 + 2, R0 + 3 ;      and preshift R0
    	movff   R0 + 1, R0 + 2
    	movff   R0 + 0, R0 + 1
    	clrf    R0
    
    	movff   R1 + 2, R1 + 3 ;      and R1 over 8 bits
    	movff   R1 + 1, R1 + 2
    	movff   R1 + 0, R1 + 1
    	clrf    R1
    	
    	movlw	8
    	subwf	R3, F
    	
    	movf	R3, W
    	btfss	STATUS, C	;if it's 1 (actually 1-8)
    	bra	Ski_Shift3	;jump out
    
    		endif
            endif
    
    ;above added to speed divide operations
    
    ;added to speed up s-31 divides by skipping clr'd bits in divisor/dividend
    	ifdef SKI_DIV_OPT
    		if ( SKI_DIV_OPT == 2 )
    SkiOpt2
    		;change 7 to 6
    	btfsc	R0 + 3, 6	; if highest bit set, goto divloop
    	bra	#divloop
    		;change 7 to 6
    	btfsc	R1 + 3, 6	; if highest bit set, goto divloop
    	bra	#divloop
    	;check Rx+3.6 instead of .7 and do shift if possible
    
    ;streamlined code here...old stuff is gone...
    	bcf    	STATUS, C	;clr carry-shift over complete R0
    	rlcf	R0, F		;shift R0, .7 into carry
    	rlcf	R0 + 1, F	;shift R0+1
    	rlcf	R0 + 2, F	;shift R0+2
    	rlcf	R0 + 3, F	;shift R0+3
    
    	bcf	STATUS, C	;clr carry-shift over complete R1
    	rlcf	R1, F		;shift R1, .7 into carry
    	rlcf	R1 + 1, F	;shift R1+1
    	rlcf	R1 + 2, F	;shift R1+2
    	rlcf	R1 + 3, F	;shift R1+3
    
    	movlw	1		;subtract one from the loop count
    	subwf	R3, F
    
    	movf	R3, W
    	btfss STATUS, Z	;stop if no more loops
    	bra	SkiOpt2
    
    		endif
    	endif
    ;above added to speed up divides at the bit level
    
    #divloop
    	rlcf	R0 + 3, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	rlcf	R2 + 2, F
    	rlcf	R2 + 3, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	movf	R1 + 2, W
    	subwfb	R2 + 2, F
    	movf	R1 + 3, W
    	subwfb	R2 + 3, F
    	bc	#divok
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	movf	R1 + 2, W
    	addwfc	R2 + 2, F
    	movf	R1 + 3, W
    	addwfc	R2 + 3, F
    	bcf	STATUS, C
    #divok
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	rlcf	R0 + 3, F
    	decfsz	R3, F
    	bra	#divloop
    		ifdef DIVS_USED
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone		; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    #divdone
    		endif
    	movf	R0, W		; Get low byte to W
    	goto	DUNN
      NOLIST
    DUNN_USED = 1
    	endif
    ENDASM
    Last edited by skimask; - 11th September 2008 at 06:34.
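For anyone who wants to sanity-check results away from the PIC, the unsigned #divloop core in the listing above can be modelled on a PC. This is only a minimal C sketch, not PBP library code: the names divmod32 and divloop_model are made up for illustration, and it assumes both operands are already positive and below 2^31 (that is, after the #DIVS sign handling) and that the divisor is nonzero. It compares before subtracting instead of subtracting and restoring, which should give the same quotient and remainder.

```c
#include <stdint.h>

/* Hypothetical result pair mirroring the PBP registers:
 * R0 ends up holding the quotient, R2 the remainder. */
typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod32;

/* Host-side model of the unsigned #divloop core (32 iterations).
 * r0 = dividend, shifted out at the top while quotient bits shift in below.
 * r1 = divisor, r2 = running remainder, r3 = loop counter. */
divmod32 divloop_model(uint32_t r0, uint32_t r1)
{
    uint32_t r2 = 0;
    for (int r3 = 32; r3 != 0; r3--) {
        /* rlcf R0+3,W then rlcf R2..R2+3: top dividend bit enters R2 */
        r2 = (r2 << 1) | (r0 >> 31);

        /* The PIC subtracts R1 and adds it back on borrow;
         * comparing first is equivalent for this model. */
        uint32_t quotient_bit = 0;
        if (r2 >= r1) {
            r2 -= r1;
            quotient_bit = 1;
        }

        /* rlcf R0..R0+3: the carry becomes the next quotient bit */
        r0 = (r0 << 1) | quotient_bit;
    }
    return (divmod32){ r0, r2 };
}
```

Calling divloop_model(1000, 7) models AL = 1000, BL = 7 from the test loop, so any preshift variant can be compared against it on the desktop before going back to hardware.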

  3. #3
    Darrel Taylor

    Now I get nothing with SKI_DIV_OPT 1.

    SKI_DIV_OPT 2 still looks the same.

    Are you still using the Simulator?
    DT

  4. #4
    skimask Guest

    Quote Originally Posted by Darrel Taylor:
        Now I get nothing with SKI_DIV_OPT 1.

        SKI_DIV_OPT 2 still looks the same.

        Are you still using the Simulator?
    Yes, yes, yes... I know...Don't use the simulator.
    I'll get on it good this weekend with hardware (probably just temporarily reprogram my OBD reader for grins)...

    Ok, so, Opt 1 - nothing - What mean you by 'nothing'? Zeros all the way around?
    Opt 2 - gets roughly the same garbage as before?

  5. #5
    Darrel Taylor

    Quote Originally Posted by skimask:
        Ok, So, Opt 1 - nothing - What mean you by 'nothing'? Zeros all the way around?
    No serial output at all. It's stuck in the optimize section, getting dizzy doing loops.
    May have something to do with subtracting 8 loops from what's now 25 (was 32), but I'm not sure.

    Quote Originally Posted by skimask:
        Opt 2 - gets roughly the same garbage as before?
    Same stuff. Quotient is 0, Remainder has the full A value.
    DT
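The "quotient is 0, remainder has the full A value" symptom can be reproduced on a PC with a rough C model of the SkiOpt2 path as posted, where R0 and R1 are both shifted left until bit 30 of either is set and R3 is decremented once per shift. This is a sketch under assumptions, not the actual PBP code: restoring_div and opt2_as_posted are hypothetical names, operands are taken as positive and below 2^31, and the `r3 > 1` bound is added only so the model terminates on all-zero input. In this model the divisor ends up scaled by the same power of two as the dividend, so the remaining iterations compare the unscaled dividend bits against a much larger divisor.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod32;

/* Restoring divide over `count` iterations, modelled on #divloop. */
divmod32 restoring_div(uint32_t r0, uint32_t r1, int count)
{
    uint32_t r2 = 0;
    for (int r3 = count; r3 > 0; r3--) {
        r2 = (r2 << 1) | (r0 >> 31);   /* top dividend bit into remainder */
        uint32_t bit = 0;
        if (r2 >= r1) {                /* subtraction succeeds */
            r2 -= r1;
            bit = 1;
        }
        r0 = (r0 << 1) | bit;          /* quotient bit in at the bottom */
    }
    return (divmod32){ r0, r2 };
}

/* Model of SKI_DIV_OPT 2 as posted: preshift BOTH operands one bit at a
 * time until bit 30 of either is set, dropping one iteration per shift. */
divmod32 opt2_as_posted(uint32_t a, uint32_t b)
{
    uint32_t r0 = a;
    uint32_t r1 = b;
    int r3 = 32;
    /* r3 > 1 is a model-only guard, not part of the posted assembly */
    while (((r0 | r1) & 0x40000000u) == 0 && r3 > 1) {
        r0 <<= 1;
        r1 <<= 1;
        r3--;
    }
    return restoring_div(r0, r1, r3);
}
```

With A = 1000 and B = 7 the model preshifts 21 times, leaving 11 iterations in which the running remainder only ever reaches 1000 while the divisor is 7 << 21, so no subtraction succeeds.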

  6. #6
    skimask Guest

    Jeez...up in Ski_Shift3...
    Code:
    	movlw	8
    	subwf	R3, F
    	
    	movf	R3, W
    	btfss	STATUS, C	;if it's 1 (actually 1-8)
    	bra	Ski_Shift3	;jump out
    Not going to get much of a carry from that, am I?
    The subwf already sets STATUS as appropriate, so the movf R3, W above the branch can be removed.
    I'm still looking through my code in MCS...
    Might not have to worry about the most significant bit in R0/R1, since it's preset to 0 by the code at the beginning; that would negate the need to check bit 30 instead of bit 31 of R0/R1.
    Last edited by skimask; - 11th September 2008 at 23:18.

  7. #7
    skimask Guest

    Did some pencil/paper work on the s31 divide operations at the bit level...
    Trying to optimize at the bit level is fruitless. Preshifting bits accomplishes the same thing #divloop does, except that when the subtraction fails, #divloop restores the working registers (Rx). A few cycles may be wasted restoring the R(x) registers, but any cycles saved by avoiding that would have been spent on the preshifting anyway.
    Optimizing at the byte level should still show a fair amount of cycle savings...
    More pencil/paper work...
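One way the byte-level idea could look, modelled in C for checking on a PC: preshift only the dividend a whole byte at a time while its top byte is zero, drop 8 iterations per byte, and leave the divisor untouched. This is a sketch under assumptions and not the code that eventually replaced the listing: restoring_div and div_byte_preshift are hypothetical names, operands are taken as positive and below 2^31, the divisor is nonzero, and at least 8 iterations are always kept.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod32;

/* Restoring divide over `count` iterations, modelled on #divloop. */
divmod32 restoring_div(uint32_t r0, uint32_t r1, int count)
{
    uint32_t r2 = 0;
    for (int r3 = count; r3 > 0; r3--) {
        r2 = (r2 << 1) | (r0 >> 31);   /* top dividend bit into remainder */
        uint32_t bit = 0;
        if (r2 >= r1) {                /* subtraction succeeds */
            r2 -= r1;
            bit = 1;
        }
        r0 = (r0 << 1) | bit;          /* quotient bit in at the bottom */
    }
    return (divmod32){ r0, r2 };
}

/* Byte-level preshift of the DIVIDEND only. Each zero top byte is skipped
 * by moving the dividend up 8 bits and removing 8 loop iterations. */
divmod32 div_byte_preshift(uint32_t a, uint32_t b)
{
    uint32_t r0 = a;
    int r3 = 32;
    while ((r0 >> 24) == 0 && r3 > 8) {
        r0 <<= 8;    /* same effect as the movff R0+2,R0+3 ... clrf R0 chain */
        r3 -= 8;
    }
    return restoring_div(r0, b, r3);   /* divisor is passed through as-is */
}
```

For A = 1000 and B = 7 the model skips two bytes and runs 16 iterations instead of 32, still returning 142 remainder 6. Whether the per-byte test and move cost less than the iterations saved on the PIC would need cycle counting on hardware.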
