Optimizing CRC16 routine.


Closed Thread
Results 1 to 20 of 20

Hybrid View

  1. #1
    Join Date
    Aug 2010
    Location
    Maryland, USA
    Posts
    869


    Did you find this post helpful? Yes | No

    Default

    What if you combine these 2 lines:
    Code:
     MB_CRC = MB_CRC >> 1          
     MB_CRC = $A001 ^ MB_CRC
    to be this:
    Code:
     MB_CRC = (MB_CRC >> 1) ^ $A001 ; or try DT's poly here just for consistancy
    This would kinda make sense to me as why yours might be ~twice as long, PBP may be creating double code to do it in 2 steps.
    -Bert

    The glass is not half full or half empty, Its twice as big as needed for the job!

    http://foamcasualty.com/ - Warbird R/C scratch building with foam!

  2. #2
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,615


    Did you find this post helpful? Yes | No

    Default

    Hi,
    Thank you for your ideas, I appreciate it.

    Unfortunately it's the opposite of what you think. Originally I did have those two lines combined but splitting the statement into two increased the performance by ~9% (it went from 1894 to 1730 cycles for the 9 test-bytes) it also saved me 8 bytes of program space - which is secondary but still nice.

    /Henrik.

  3. #3
    Join Date
    Aug 2005
    Location
    Michigan, USA
    Posts
    224


    Did you find this post helpful? Yes | No

    Default

    Greetings Henrik,

    Is there a chance you have a lot of overhead associated with array handling? Have you looked at the assembly language generated by that routine?

    I checked out the inner loop using BoostC and found worst case cycle times of 123 cycles in a "for/next" loop and 100 cycles in a "do/while" loop.

    Cheerful regards, Mike

    Code:
      crcword ^= work;         // 123 cycles (BoostC)
      for(char i=0; i<8; i++)  //                      
      { crcword >>= 1;         //
        if(status.C)           //
          crcword ^= 0x0A01;   //
      }
    Code:
      crcword ^= work; i = 0;   // 100 cycles (BoostC)
      do {                      //
        crcword >>= 1;          //
        if(status.C)            //
          crcword ^= 0x0A01;    //
        i++;                    //
      } while(i.3 == 0);        //
    Last edited by Mike, K8LH; - 16th January 2011 at 18:57.

  4. #4
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,615


    Did you find this post helpful? Yes | No

    Default

    Hi Mike,
    I'm using TMR1 to measure the execution time of the routine. I just moved the starting and stopping of the timer so it's around the inner FOR-NEXT loop. In other words the timer is started before the inner FOR and stopped after the inner NEXT effectively excluding the array handling of the outer FOR-NEXT loop from the timing.

    This time I got 1533 cycles.

    So, as far as I can see it is the inner loop that is eating away the time and I still don't understand why. I'll try a DO-WHILE construct instead of the FOR-NEXT and see what I'll get.

    Thanks!
    /Henrik.

  5. #5
    Join Date
    Aug 2005
    Location
    Michigan, USA
    Posts
    224


    Did you find this post helpful? Yes | No

    Default

    ... <snip> ... I'm using TMR1 to measure the execution time of the routine ... <snip> ...
    Could that be a problem? Do you have anything similar to the MPLAB Simulator Stopwatch?

    ... <snip>... This time I got 1533 cycles ... <snip> ...
    My goodness, that doesn't sound quite right. That's close to what you got when you were processing nine bytes. Could there be a problem with the method you're using to count cycles?

    I would expect you to get code similar to the assembler code generated from my BoostC "Do/While" loop;

    Code:
      crcword ^= work; i = 8;   //
    0040  0845  	MOVF crc16_00000_arg_work, W
    0041  06B7  	XORWF gbl_crcword, F
    0042  3008  	MOVLW 0x08
    0043  00C6  	MOVWF crc16_00000_1_i
    
      do {                      //
    0044        label5
    
        crcword >>= 1;          //
    0044  36B8  	LSRF gbl_crcword+D'1', F
    0045  0CB7  	RRF gbl_crcword, F
    
        if(status.C)            //
    0046  1C03  	BTFSS gbl_status,0
    0047  284C  	GOTO	label6
    004C        label6
    
          crcword ^= 0x0A01;    //
    0048  3001  	MOVLW 0x01
    0049  06B7  	XORWF gbl_crcword, F
    004A  300A  	MOVLW 0x0A
    004B  06B8  	XORWF gbl_crcword+D'1', F
    
      } while(--i);             //
    004C  03C6  	DECF crc16_00000_1_i, F
    004D  1D03  	BTFSS STATUS,Z
    004E  2844  	GOTO	label5
    }
    Last edited by Mike, K8LH; - 16th January 2011 at 18:53.

  6. #6
    Join Date
    Oct 2005
    Location
    Sweden
    Posts
    3,615


    Did you find this post helpful? Yes | No

    Default

    Hi Mike,
    I tried a DO-WHILE construct and I'm quite surprised by the results. I moved the starting and stopping back to where I had it originally and with the DO-WHILE I get 1496 cycles instead of 1730 for the 9 test-bytes. A 67 byte array, previously taking ~12600 cycles now executes in ~10800. Getting there....

    Here's the current ASM code:
    Code:
    0012A0                    M _MB_CRC_16
    0012A0 6831               M         setf    _MB_CRC
    0012A2 6832               M         setf    (_MB_CRC) + 1
    0012A4 0E03               M         movlw   003h
    0012A6 5C3F               M         subwf   _MB_Buffer_Ptr, W
    0012A8 6E42               M         movwf   _MB_k
    0012AA 6A40               M         clrf    _MB_i
    0012AC                    M L00108
    0012AC 0004               M         clrwdt
    0012AE 5040               M         movf    _MB_i, W
    0012B0 5C42               M         subwf   _MB_k, W
    0012B2 A0D8               M         btfss   STATUS, C
    0012B4 EF80 F009          M         goto    L00109
    0012B8 5040               M         movf    _MB_i, W
    0012BA 0F56               M         addlw   low (_MB_Buffer)
    0012BC 6EE9               M         movwf   FSR0L
    0012BE 0E01               M         movlw   (_MB_Buffer) >> 8
    0012C0 6AEA               M         clrf    FSR0H
    0012C2 22EA               M         addwfc  FSR0H, F
    0012C4 CFEF F013          M         movff   INDF0, T1
    0012C8 5013               M         movf    T1,  W
    0012CA 1A31               M         xorwf   _MB_CRC, F
    0012CC 6A41               M         clrf    _MB_j
    0012CE                    M L00110
    0012CE 0004               M         clrwdt
    0012D0 B641               M         btfsc   _MB_j, 003h
    0012D2 EF7D F009          M         goto    L00111
    0012D6 0004               M         clrwdt
    0012D8 A031               M         btfss   _MB_CRC, 000h
    0012DA EF78 F009          M         goto    L00112
    0012DE 90D8               M         bcf     STATUS, C
    0012E0 3232               M         rrcf    _MB_CRC + 1, F
    0012E2 3231               M         rrcf    _MB_CRC, F
    0012E4 0E01               M         movlw   low (0A001h)
    0012E6 1A31               M         xorwf   _MB_CRC, F
    0012E8 0EA0               M         movlw   (0A001h) >> 8
    0012EA 1A32               M         xorwf   _MB_CRC + 1, F
    0012EC EF7B F009          M         goto    L00113
    0012F0                    M L00112
    0012F0 90D8               M         bcf     STATUS, C
    0012F2 3232               M         rrcf    _MB_CRC + 1, F
    0012F4 3231               M         rrcf    _MB_CRC, F
    0012F6                    M L00113
    0012F6 2A41               M         incf    _MB_j, F
    0012F8 D7EA               M         bra     L00110
    0012FA                    M L00111
    0012FA 2A40               M         incf    _MB_i, F
    0012FC A0D8               M         btfss   STATUS, C
    0012FE D7D6               M         bra     L00108
    001300                    M L00109
    001300 0012               M         return
    /Henrik.

  7. #7
    Join Date
    Aug 2005
    Location
    Michigan, USA
    Posts
    224


    Did you find this post helpful? Yes | No

    Default

    The inner loop starts at label "L00110" and uses about 18 instruction cycles (worst case) each loop iteration for a whopping total of right around 144 cycles per byte processed.

    I have not check the rest of the code. I must get back to my studies.

    Cheerful regards, Mike
    Last edited by Mike, K8LH; - 16th January 2011 at 19:35.

  8. #8
    Join Date
    Aug 2010
    Location
    Maryland, USA
    Posts
    869


    Did you find this post helpful? Yes | No

    Default Trying to get up to speed

    Hi Henrik,
    here is the snippit I am testing:
    Code:
    Main:      'DO WHATEVER YOU WANT HERE
    tmr1=0
    t1con = 1
        For MB_j = 0 to 7                      ' Cycle thru the low byte of the CRC value                      
            If MB_CRC.Bit0 THEN                ' If the LSB is set
                MB_CRC = MB_CRC >> 1           ' We shift CRC on bit to the right....
                MB_CRC = $A001 ^ MB_CRC        ' and EXOR it with out polynomial for MODBUS CRC-16 ($A001)
            ELSE
                MB_CRC = MB_CRC >> 1           ' If the LSB is not set we simply shift CRC one bit to the right.
            ENDIF
        Next                                   ' Next bit in the byte
    t1con = 0
    stopnow:
    goto stopnow
    This takes 169 cycles according to TMR1 setup as you see here. I have tried with both $A001 and $1021. Both are the same. Does this fall in line with what you have?

    BTW, MPLAB stopwatch reports the same

    Next I think I will try out DT's snippet
    Last edited by cncmachineguy; - 16th January 2011 at 19:58. Reason: added
    -Bert

    The glass is not half full or half empty, Its twice as big as needed for the job!

    http://foamcasualty.com/ - Warbird R/C scratch building with foam!

Members who have read this thread : 0

You do not have permission to view the list of names.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts