Optimizing DIV


Closed Thread
Results 1 to 40 of 42

Thread: Optimizing DIV

Hybrid View

  1. #1
    Join Date
    Jul 2003
    Location
    Colorado Springs
    Posts
    4,959


    Did you find this post helpful? Yes | No

    Wink

    Quote Originally Posted by skimask View Post
    I'm cooking on some numbers right now ... After those get done cooking, I'm going to expand those number out to ... and let it cook again. Like you annotated, could take months!
    Go for it! I think the SkiDiv5 program should give you enough information to give it a worthy effort.

    Just remember. I'm only the guy giving the results.
    You've chosen to attempt bettering Jeff Schmoyer ...
    Good luck.
    <br>
    DT

  2. #2
    skimask's Avatar
    skimask Guest


    Did you find this post helpful? Yes | No

    Default

    Quote Originally Posted by Darrel Taylor View Post
    Go for it! I think the SkiDiv5 program should give you enough information to give it a worthy effort.
    Just remember. I'm only the guy giving the results.
    You've chosen to attempt bettering Jeff Schmoyer ...
    Good luck.
    <br>
    Working on it... I'm surely not trying to better Jeff! That would be fruitless. If this does work as planned, I'm sure it'll be one of those tradeoffs...speed for space.
    You're the only guy giving the results? So far...
    Standby...shouldn't be too long now...

  3. #3
    skimask's Avatar
    skimask Guest


    Did you find this post helpful? Yes | No

    Default

    Code I'm using now. Using MPLAB simulator to get accurate cycle counts.
    Code:
    resetplaceholder:	'18f4685 code
    DEFINE	OSC		40	'40 Mhz clock for proto work
    DEFINE	NO_CLRWDT	1	'no extra clear watchdog timer instructions
    DISABLE
    DEFINE SKI_DIV_OPT 3
    AL VAR LONG : BL VAR LONG : SkiQ VAR LONG : SkiR VAR LONG
    PBPQ VAR LONG : PBPR VAR LONG : AW VAR WORD : BW VAR WORD : RW VAR WORD
    ERROR VAR BIT : errorcount var long : maxnum var long : s1 var long
    pbpq = al / bl	'need to use at least one PBP divide to kick in DIVS for now
    al3 var al.byte3 : al2 var al.byte2 : al1 var al.byte1 : al0 var al.byte0
    bl3 var bl.byte3 : bl2 var bl.byte2 : bl1 var bl.byte1 : bl0 var bl.byte0
    'aliased so I can find it easily in MPSIM
    testcode:
    	@ nop	;stp = easy to search and add a breakpoint in SIM
    		maxnum = 9 : s1 = 1 : gosub dodivide
            	maxnum = 99 : s1 = 1 : gosub dodivide
            	maxnum = 999 : s1 = 19 : gosub dodivide
            	maxnum = 9999 : s1 = 193 : gosub dodivide
    		maxnum = 99999 : s1 = 1193 : gosub dodivide
    		maxnum = 999999 : s1 = 31279 : gosub dodivide
    		maxnum = 9999999 : s1 = 112791 : gosub dodivide
    		maxnum = 99999999 : s1 = 327913 : gosub dodivide
    		maxnum = 999999999 : s1 = 1279137 : gosub dodivide
    	@ nop	;stp
    END
    stop
    dodivide:	For AL = 0 to maxnum step s1
    			for BL = 0 to maxnum step s1
    				PBPQ = AL / BL
    			next BL
    		next AL
    		@ nop	;stp
    		For AL = 0 to maxnum step s1
    			for BL = 0 to maxnum step s1
    				@ MOVE?NN	_AL, R0
    				@ MOVE?NN	_BL, R1		; AL / BL
    				@ L?CALL	#DIVS
    				@ MOVE?ANN	R0, _SkiQ
    				@ MOVE?NN	R2, _SkiR
    			next BL
    		next AL
    		@ nop	;stp
    		return
    
    ASM
    	ifdef DIVS_USED
      LIST
    #DIVS
    	clrf	R3 + 3		; Clear sign difference indicator
    	btfss	R0 + 3, 7	; Check for R0 negative
    	bra	#divchkr1	; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip value to plus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    #divchkr1
    	btfss	R1 + 3, 7	; Check for R1 negative
    	bra	#divdo		; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R1		; Flip value to plus
    	subfwb	R1 + 1, F
    	subfwb	R1 + 2, F
    	subfwb	R1 + 3, F
    	bra	#divdo		; Skip unsigned entry
      NOLIST
    DIV_USED = 1
    	endif
    	ifdef DIV_USED
      LIST
    #DIV
    		ifdef DIVS_USED
    	clrf	R3 + 3		; Clear sign difference indicator	
    		endif
    #divdo
    	clrf	R2		; Do the divide
    	clrf	R2 + 1
    	clrf	R2 + 2
    	clrf	R2 + 3
    	movlw	32
    	movwf	R3
    
    ;added to speed up s-31 divides by using byte and bit shifting,
    ;added checking for 32, 24, 16 and 8 bit divides
    ;and using those routines if R0 and R1 are small enough
    		ifdef SKI_DIV_OPT
    SkiOpt3	;shift down bytes if low bytes are 0'd
    	movf    R0, W      ; IF R0(0)= 0 
    	bnz     SkiOpt4
    
    	movf    R1, W      ;   AND R1(0)= 0 then 
    	bnz     SkiOpt4
    
    	movff   R0 + 1, R0 + 0 ;      and preshift R0
    	movff   R0 + 2, R0 + 1
    	movff   R0 + 3, R0 + 2
    	clrf    R0 + 3
    
    	movff   R1 + 1, R1 + 0 ;      and R1 over 8 bits
    	movff   R1 + 2, R1 + 1
    	movff   R1 + 3, R1 + 2
    	clrf    R1 + 3
    
    	movlw   8              ;      loops - 8
    	subwf   R3, F
    	btfss   STATUS, Z      ; stop if no loop's left (0/0)
    	bra     SkiOpt3
    	bra	#divdone
    
    SkiOpt4	;shift down bytes if low bits are 0'd
    	btfsc	R0, 0	; if lowest bit set, goto divloop
    	bra	skiopt5
    	btfsc	R1, 0	; if lowest bit set, goto divloop
    	bra	skiopt5
    
    	bcf    	STATUS, C	;clr carry-shift over complete R0
    	rrcf	R0 + 3, F	;shift R0+3, .0 into carry
    	rrcf	R0 + 2, F	;shift R0+2
    	rrcf	R0 + 1, F	;shift R0+1
    	rrcf	R0 + 0, F	;shift R0+0
    
    	bcf	STATUS, C	;clr carry-shift over complete R1
    	rrcf	R1 + 3, F	;shift R1, .0 into carry
    	rrcf	R1 + 2, F	;shift R1+2
    	rrcf	R1 + 1, F	;shift R1+1
    	rrcf	R1 + 0, F	;shift R1+0
    
    	movlw	1		;subtract one from the loop count
    	subwf	R3, F
    
    	btfss	STATUS, Z	;stop if no more loops
    	bra	SkiOpt4
    	bra	#divdone
    
    skiopt5	;check if can use different divide methods (32, 24, 16, 8)
    	movlw	32		;load loop count
    	movwf	R3
    	movf	R0 + 3, W
    	bnz	#divloop	;use normal div
    	movf	R1 + 3, W
    	bnz	#divloop	;use normal div
    
    	movlw	24		;load loop count
    	movwf	R3
    	movf	R0 + 2, W
    	bnz	#divloop24	;jump out to 24 bit
    	movf	R1 + 2, W
    	bnz	#divloop24	;jump out to 24 bit
    
    	movlw	16		;load loop count
    	movwf	R3
    	movf	R0 + 1, W
    	bnz	#divloop16	;jump out to 16 bit
    	movf	R1 + 1, W
    	bnz	#divloop16	;jump out to 16 bit
    
    	movlw	8		;load loop count
    	movwf	R3
    	bra	#divloop8	;fall thru to 8 bit
    	
    		endif
    ;above added to speed divide operations
    
    #divloop	;32 bit
    	rlcf	R0 + 3, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	rlcf	R2 + 2, F
    	rlcf	R2 + 3, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	movf	R1 + 2, W
    	subwfb	R2 + 2, F
    	movf	R1 + 3, W
    	subwfb	R2 + 3, F
    	bc	#divok
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	movf	R1 + 2, W
    	addwfc	R2 + 2, F
    	movf	R1 + 3, W
    	addwfc	R2 + 3, F
    	bcf	STATUS, C
    #divok
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	rlcf	R0 + 3, F
    	decfsz	R3, F
    	bra	#divloop
    
    		ifdef DIVS_USED
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone	; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    	bra	#divdone
    		endif
    
    #divloop24	;24 bit
    	rlcf	R0 + 2, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	rlcf	R2 + 2, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	movf	R1 + 2, W
    	subwfb	R2 + 2, F
    	bc	#divok24
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	movf	R1 + 2, W
    	addwfc	R2 + 2, F
    	bcf	STATUS, C
    #divok24
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	decfsz	R3, F
    	bra	#divloop24
    
    		ifdef DIVS_USED
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone	; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    	bra	#divdone
    		endif
    		
    #divloop16	;16 bit
    	rlcf	R0 + 1, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	bc	#divok16
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	bcf	STATUS, C
    #divok16
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	decfsz	R3, F
    	bra	#divloop16
    
    		ifdef DIVS_USED
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone	; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    	bra	#divdone
    		endif
    		
    #divloop8	;8 bit
    	rlcf	R0, W
    	rlcf	R2, F
    	movf	R1, W
    	subwf	R2, F
    	bc	#divok8
    	movf	R1, W
    	addwf	R2, F
    	bcf	STATUS, C
    #divok8
    	rlcf	R0, F
    	decfsz	R3, F
    	bra	#divloop8
    
    		ifdef DIVS_USED
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone	; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    		endif
    #divdone
    	movf	R0, W		; Get low byte to W
    	goto	DUNN
      NOLIST
    DUNN_USED = 1
    	endif
    ENDASM
    END
    Cycle counts for various loop and step sizes
    Code:
    loop count	s1=step size	'pbp clock count     ski clock count	increase
    9 : 		s1 = 1		'105,278	          26,124	4.033
    99 : 		s1 = 1		'10,632,832	       2,507,692	4.240
    999 : 		s1 = 19		'2,981,601	       1,136,305	2.624
    9999 : 	        s1 = 193	'2,869,397	       1,174,807	2.442
    99999 : 	s1 = 1193	'7,479,745	       3,913,171	1.911
    999999 : 	s1 = 31279	'1,081,858	         722,091	1.498
    9999999 : 	s1 = 112791	'8,388,778	       5,638,902	1.462
    99999999 : 	s1 = 327913	'98,376,761	      98,167,489	1.002
    999999999 : 	s1 = 1279137	'646,580,978	     663,354,535	0.973
    Obviously, the biggest speed increase comes from using the various divide routines (32, 24, 16, 8), and once you get into the really big numbers, you lose cycles.
    In a bit, I'm going to load this into my protoboard and see what happens (PBP div vs. Ski_Opt divides) as far as accuracy goes.
    I still think shifting left and cutting the loop count ahead of time has merit. As I said, I know I did it on that 6809E back in the day. I just can't remember how...

    EDIT: Added the error checking code back in and put a WATCH on the error counter variable in MPSIM.
    So far, the only error I've gotten is on 0/0. Ski_Opt comes back with both zero's. PBP comes back with a max'd out quotient (2^32 -1), 0 in the remainder.
    Slow going in the sim with a step of 1 running from 0 to (2^31 -1)....
    Just calculated out the situation as described in the line above...
    Running on the SIM on my laptop (Dell Insp. 8200 P4 @ 1.7Ghz), it could take about 571,232,829 years, 5 months, 13 days to complete the loop as written!!!
    Ya...not so much!!!
    Last edited by skimask; - 14th September 2008 at 03:49.

  4. #4
    skimask's Avatar
    skimask Guest


    Did you find this post helpful? Yes | No

    Default

    Well, after letting it run a couple of days into the higher numbers, I've found more errors, random bit/byte errors (at least I can't find a pattern to them). Either my PIC is screwing up the data, or my code with ALL of the optimizations is jacked up somewhere. So, after screwing around with it for a couple of days, I give up!
    One thing that does seem to work correctly though is the '4 different divides' version of the code, the normal s-31 divide, and 3 other divides for unsigned 24, 16, and 8 bit divides. When I get that code cleaned up and commented, it'll be here. A fair amount of speed increase for math intensive code, as DT said, when the numbers are small. I know in a couple of my programs, the speed increase will be a welcome enhancement.

  5. #5
    skimask's Avatar
    skimask Guest


    Did you find this post helpful? Yes | No

    Default

    Latest and greatest (so far)...
    Seems to work great for me...will probably fail miserably for others!
    Haven't packaged it up real nice yet, some comments could stand to be added yet, some removed...
    8 bit only divides run about 4.2 times faster
    16 bit only divides run about 2.4 times faster
    16/24 bit mixed divides run about 1.5 times faster
    24/32 bit mixed divides actually run about 2% slower

    But the cycle counts, and therefore the actual figures aren't 100% accurate because they include the cycles used by the For/Next loops.

    Code:
    resetplaceholder:	'18f4685 code
    DEFINE	OSC		40	'40 Mhz clock for proto work
    DEFINE	NO_CLRWDT	1	'no extra clear watchdog timer instructions
    DISABLE
    DEFINE SKI_DIV_OPT 1
    AL VAR LONG : BL VAR LONG : SkiQ VAR LONG : SkiR VAR LONG : PBPQ VAR LONG
    PBPR VAR LONG : AW VAR WORD : BW VAR WORD : RW VAR WORD : ERROR VAR BIT
    errorcount var long : indicator var word : loopcount var long
    minnum var long : maxnum var long : s1 var long
    pbpq = al / bl	'need to use at least one PBP divide to kick in DIVS for now
    'indicator commented out in ASM code during -speed- runs
    testcode:	minnum = 1000 : maxnum = 1255 : s1 = 1 : gosub dodivide
                        gosub dodivide : gosub dodivide : gosub dodivide
                        gosub dodivide : gosub dodivide : gosub dodivide
    		@ nop	;stopper - easy to find point to set a breakpoint in MPSIM
    END
    stop
    dodivide:	For AL = minnum to maxnum step s1
                       For BL = minnum to maxnum step s1
    '			indicator.lowbyte = $f0		'watch in MPSIM to show what's going on
    '			loopcount = loopcount + 1	'watch in MPSIM to show what's going on
    
    			@ MOVE?NN	_AL, R0				;use only during OPT run
    			@ MOVE?NN	_BL, R1		; AL / BL	;use only during OPT run
    			@ L?CALL	#DIVS				;use only during OPT run
    			@ MOVE?NN	R0, _SkiQ			;use only during OPT run
    			@ MOVE?NN	R2, _SkiR			;use only during OPT run
    
    '			indicator.lowbyte = $0f		'watch in MPSIM to show what's going on
    
    '			PBPQ = AL / BL					;use only during PBP run
    
    '			PBPR = AL // BL					;use only when doing -both- runs to check for errors
    '			indicator.lowbyte = 0		'watch in MPSIM to show what's going on
    '			if ( pbpq <> skiq ) or ( pbpr <> skir ) then errorcount = errorcount + 1	'watch in MPSIM to show what's going on
    		next BL
              next AL
         return
    end
    ASM
    	ifdef DIVS_USED
      LIST
    #DIVS
    	clrf	R3 + 1		;clear shiftout counter
    	clrf	R3 + 3		; Clear sign difference indicator
    	btfss	R0 + 3, 7	; Check for R0 negative
    	bra	#divchkr1	; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R0 + 0		; Flip value to plus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    #divchkr1
    	btfss	R1 + 3, 7	; Check for R1 negative
    	bra	#divdo		; Not negative
    	btg	R3 + 3, 7	; Flip sign indicator
    	clrf	WREG		; Clear W for subtracts
    	negf	R1 + 0		; Flip value to plus
    	subfwb	R1 + 1, F
    	subfwb	R1 + 2, F
    	subfwb	R1 + 3, F
    	bra	#divdo		; Skip unsigned entry
      NOLIST
    DIV_USED = 1
    	endif
    	ifdef DIV_USED
      LIST
    #DIV
    		ifdef DIVS_USED
    	clrf	R3 + 3		; Clear sign difference indicator
    		endif
    #divdo
    	clrf	R2 + 0		; Do the divide
    	clrf	R2 + 1
    	clrf	R2 + 2
    	clrf	R2 + 3
    	movlw	32
    	movwf	R3 + 0
    		ifdef SKI_DIV_OPT
    ;check for zero case and send directly to divloop if they are zero
    	movf	R0 + 0, W
    	bnz	#divzero1
    	movf	R0 + 1, W
    	bnz	#divzero1
    	movf	R0 + 2, W
    	bnz	#divzero1
    	movf	R0 + 3, W
    	bnz	#divzero1
    	bra	#divloopa	;if R0 is zero, do #divloop
    #divzero1
    	movf	R1 + 0, W
    	bnz	SkiOpt5
    	movf	R1 + 1, W
    	bnz	SkiOpt5
    	movf	R1 + 2, W
    	bnz	SkiOpt5
    	movf	R1 + 3, W
    	bnz	SkiOpt5		;if R1 is not zero, continue OPT
    	bra	#divloopa	;if R1 is zero, do divloop
    SkiOpt5	;check if can use different divide methods (32, 24, 16, 8)
    	movlw	32		;load loop count
    	movwf	R3 + 0
    	movf	R0 + 3, W
    	bnz	#divloopa	;use normal div
    	movf	R1 + 3, W
    	bnz	#divloopa	;use normal div
    	movlw	24		;load loop count
    	movwf	R3 + 0
    	movf	R0 + 2, W
    	bnz	#divloop24a	;jump out to 24 bit
    	movf	R1 + 2, W
    	bnz	#divloop24a	;jump out to 24 bit
    	movlw	16		;load loop count
    	movwf	R3 + 0
    	movf	R0 + 1, W
    	bnz	#divloop16a	;jump out to 16 bit
    	movf	R1 + 1, W
    	bnz	#divloop16a	;jump out to 16 bit
    	movlw	8		;load loop count
    	movwf	R3 + 0
    	movf	R0 + 0, W
    	bnz	#divloop8a	;jump out to 8 bit
    	movf	R1 + 0, W
    	bnz	#divloop8a
    	movlw	32		;reload loop count with max count
    	movwf	R3 + 0
    		endif
    ;above added to speed divide operations
    #divloopa	;32 bit divide
    ;	movlw	50
    ;	movwf	_indicator + 1
    #divloop
    	rlcf	R0 + 3, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	rlcf	R2 + 2, F
    	rlcf	R2 + 3, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	movf	R1 + 2, W
    	subwfb	R2 + 2, F
    	movf	R1 + 3, W
    	subwfb	R2 + 3, F
    	bc	#divok
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	movf	R1 + 2, W
    	addwfc	R2 + 2, F
    	movf	R1 + 3, W
    	addwfc	R2 + 3, F
    	bcf	STATUS, 0
    #divok
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	rlcf	R0 + 3, F
    	decfsz	R3, F
    	bra	#divloop
    		ifdef DIVS_USED
    	bra	#divnegchk	; Check for negative result
    		endif
    		ifdef SKI_DIV_OPT
    #divloop24a	;24 bit divide
    ;	movlw	36
    ;	movwf	_indicator + 1
    #divloop24
    	rlcf	R0 + 2, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	rlcf	R2 + 2, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	movf	R1 + 2, W
    	subwfb	R2 + 2, F
    	bc	#divok24
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	movf	R1 + 2, W
    	addwfc	R2 + 2, F
    	bcf	STATUS, 0
    #divok24
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	rlcf	R0 + 2, F
    	decfsz	R3, F
    	bra	#divloop24
    		ifdef DIVS_USED
    	bra	#divnegchk	; Check for negative result
    		endif
    #divloop16a	;16 bit divide
    ;	movlw	22
    ;	movwf	_indicator + 1
    #divloop16
    	rlcf	R0 + 1, W
    	rlcf	R2, F
    	rlcf	R2 + 1, F
    	movf	R1, W
    	subwf	R2, F
    	movf	R1 + 1, W
    	subwfb	R2 + 1, F
    	bc	#divok16
    	movf	R1, W
    	addwf	R2, F
    	movf	R1 + 1, W
    	addwfc	R2 + 1, F
    	bcf	STATUS, 0
    #divok16
    	rlcf	R0, F
    	rlcf	R0 + 1, F
    	decfsz	R3, F
    	bra	#divloop16
    		ifdef DIVS_USED
    	bra	#divnegchk	; Check for negative result
    		endif
    #divloop8a	;8 bit divide
    ;	movlw	8
    ;	movwf	_indicator + 1
    #divloop8
    	rlcf	R0, W
    	rlcf	R2, F
    	movf	R1, W
    	subwf	R2, F
    	bc	#divok8
    	movf	R1, W
    	addwf	R2, F
    	bcf	STATUS, 0
    #divok8
    	rlcf	R0, F
    	decfsz	R3, F
    	bra	#divloop8
    		endif
    		ifdef DIVS_USED
    #divnegchk
    	btfss	R3 + 3, 7	; Should result be negative?
    	bra	#divdone	; Not negative
    	clrf	WREG		; Clear W for subtracts
    	negf	R0		; Flip quotient to minus
    	subfwb	R0 + 1, F
    	subfwb	R0 + 2, F
    	subfwb	R0 + 3, F
    	negf	R2		; Flip remainder to minus
    	subfwb	R2 + 1, F
    	subfwb	R2 + 2, F
    	subfwb	R2 + 3, F
    		endif
    #divdone
    	movf	R0 + 0,W
    	goto	DUNN
      NOLIST
    DUNN_USED = 1
    	endif
    ENDASM
    END

  6. #6
    Join Date
    May 2007
    Posts
    604


    Did you find this post helpful? Yes | No

    Default

    This may be of interest, 32x16 divide in 395 cycles:
    http://cablemodem.fibertel.com.ar/at...C%20micros.htm

  7. #7
    skimask's Avatar
    skimask Guest


    Did you find this post helpful? Yes | No

    Default

    Quote Originally Posted by rmteo View Post
    This may be of interest, 32x16 divide in 395 cycles:
    http://cablemodem.fibertel.com.ar/at...C%20micros.htm
    Aye lad...but alas, those routines are unsigned!

    This was all mainly an exercise on my part to see how stuff worked...turned out to do other stuff.
    Mainly, the routine above was meant to replace the 32/32 signed divide routine in the pbppi18l.lib library. The regular PBP library treats all math as though it were 32 bit, even if the numbers are smaller. You don't gain much by optimizing adds, subtracts, and multiplies, but divides stand to gain quite a bit if the routines can differentiate between different size numbers and just to different, faster routines accordingly. The library adds about 100 bytes to the routine, but it's only 100 bytes once, not for every divide, and it could possibly shave a load of cycles off the routine.

Similar Threads

  1. Optimizing LCD commands?
    By jblackann in forum mel PIC BASIC Pro
    Replies: 1
    Last Post: - 4th December 2007, 16:30

Members who have read this thread : 0

You do not have permission to view the list of names.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts