Where should I discuss SD/MMC FAT issues?


  1. #1
    Join Date
    Mar 2008
    Location
    Texas, USA
    Posts
    114



    I2C Update

    Well, I've done my best, and yes, about 84K Tcy is as fast as this thing is going to read 512 bytes from the MMC and write them to the FRAM. It's 83,968 Tcy to be exact. Last night I thought I'd give it one more go.

    At the start of the evening I got my latest code loaded and checked for Timer1 overflows. Yep, it was running over. The true transfer time was 166,837 Tcy. I started dissecting the code bit by bit and embedding timers to catch cycle times. Turns out I had a very inefficient MMC routine. I straight-lined the code and got it back up to speed. I tweaked the I2C and also moved the two subroutines WRITE_I2C and GET_BYTE to the top of the program. Within an hour or two I got it down to 129,837 Tcy. After doing some math, I figured I was eating 45 Tcy with every loop and couldn't account for where it was coming from. The last part of the routine left to measure was the time needed simply to run the FOR-NEXT loop.

    Bingo! Turns out using WORD-sized variables in a FOR-NEXT loop can be a huge Tcy eater. The solution I used was to break up the 512-count FOR-NEXT loop into two 256-count loops. Boom! The cycle time fell to 83,968 Tcy. So, in a few hours, I saved 82,869 Tcy, saved 400 lines of compiled code and roughly doubled the speed. I'm currently at 16 MHz, ripping the MMC sector and writing it in a total of just under 21 ms. Still not fast enough to use it as I would like, but still... that's fast I2C.
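For anyone checking the numbers, the Tcy figures above convert to time as sketched below. This is only a rough Python check, not part of the PIC code. It assumes the usual PIC relationship of four oscillator clocks per instruction cycle, and `tcy_to_ms` is a made-up helper name.

```python
# Assumption: one PIC instruction cycle (Tcy) = 4 oscillator clocks.
F_OSC_HZ = 16_000_000  # 16 MHz oscillator, as stated in the post


def tcy_to_ms(tcy, f_osc_hz=F_OSC_HZ):
    """Convert a count of instruction cycles to milliseconds."""
    return tcy * 4_000 / f_osc_hz


before = 166_837  # Tcy measured at the start of the evening
after = 83_968    # Tcy after splitting the loop

print(tcy_to_ms(after))           # 20.992
print(before - after)             # 82869
print(round(before / after, 2))   # 1.99
```

So 83,968 Tcy at 16 MHz comes out at just under 21 ms, and the saving of 82,869 Tcy is close to a 2x speed-up.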

    Here's the updated code:
    Attached Files
    Last edited by JD123; - 28th March 2008 at 22:11.
    No, I'm not Superman, but I did stay at a Holiday Inn Express last night!

  2. #2
    skimask Guest



    Quote Originally Posted by JD123 View Post
    Well, I've done my best, and yes, about 84K Tcy is as fast as this thing is going to read 512 bytes from the MMC and write them to the FRAM. It's 83,968 Tcy to be exact. Last night I thought I'd give it one more go.
    Add
    DEFINE NO_CLRWDT 1 'no extra clear watchdog timer instructions
    to the top of your program.
    Will probably save a few more cycles there.

    Also, in your 'big' loop, this might help somewhat. Remove the 2 for/next loops, replace with this:
    x1 = 2 ' two passes of 256 bytes = 512 bytes
    loop1:
    x = 0 ' BYTE counter: 256 decrements bring it back to 0
    loop2:
    gosub get_byte : gosub write_i2c : x = x - 1 : if x <> 0 then loop2
    x1 = x1 - 1 : if x1 <> 0 then loop1
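A quick way to sanity-check the two-counter loop above without a PIC is to model the 8-bit wraparound on a PC. The Python sketch below is only a model. It assumes `x` and `x1` are BYTE variables, that `x1` is set to 2 once before the outer label, and that both branches are meant to jump back while the counter is non-zero. `count_calls` is a hypothetical name, and each pass stands in for one `get_byte` / `write_i2c` pair.

```python
def count_calls():
    """Model the nested byte-counter loop; return how many byte transfers run."""
    calls = 0
    x1 = 2                      # outer counter: two passes
    while True:
        x = 0                   # inner counter starts at 0
        while True:
            calls += 1          # stands in for: gosub get_byte : gosub write_i2c
            x = (x - 1) & 0xFF  # 8-bit decrement: 0 wraps to 255
            if x == 0:          # leave the inner loop after 256 passes
                break
        x1 = (x1 - 1) & 0xFF
        if x1 == 0:             # leave the outer loop after 2 passes
            break
    return calls


print(count_calls())  # 512
```

Starting the byte counter at 0 and decrementing gives 256 passes per inner loop, so two outer passes cover the full 512-byte sector.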

    Rewriting the 2 gosubs as inline code would save you about 2K cycles at most, might get you down below 20ms, but would use up code space. No real savings there.
    Also, go back and try to BANK0 as much as you can, the commonly used variables first, until something doesn't end up in BANK0 (check the .lst file).
    Last edited by skimask; - 29th March 2008 at 09:09.

  3. #3
    Join Date
    Mar 2008
    Location
    Texas, USA
    Posts
    114



    Quote Originally Posted by skimask View Post
    Add
    DEFINE NO_CLRWDT 1 'no extra clear watchdog timer instructions
    to the top of your program.
    Will probably save a few more cycles there.
    Interesting... I didn't think I called any routines that used the WDT. I'll check the .lst and see if there are any in there.
    Also, in your 'big' loop, this might help somewhat. Remove the 2 for/next loops, replace with this:
    x1 = 2 ' two passes of 256 bytes = 512 bytes
    loop1:
    x = 0 ' BYTE counter: 256 decrements bring it back to 0
    loop2:
    gosub get_byte : gosub write_i2c : x = x - 1 : if x <> 0 then loop2
    x1 = x1 - 1 : if x1 <> 0 then loop1
    Rewriting the 2 gosubs as inline code would save you about 2K cycles at most, might get you down below 20ms, but would use up code space. No real savings there.
    On the outside, this looks to take about the same or more Tcy's to run as my 2 serial loops. I'll compile the two different ways and look at the .lst to count the Tcy's and see which is faster.
    Also, go back and try to BANK0 as much as you can, the commonly used variables first, until something doesn't end up in BANK0 (check the .lst file).
    I've made sure that all variables used in this loop process are in bank0. Can variables not used in the loop slow the loop down?
    No, I'm not Superman, but I did stay at a Holiday Inn Express last night!

  4. #4
    skimask Guest



    Quote Originally Posted by JD123 View Post
    Interesting... I didn't think I called any routines that used the WDT. I'll check the .lst and see if there are any in there.
    PBP inserts CLRWDT instructions wherever it thinks there's a possibility of the WDT timing out during execution. But I don't know, and I've never bothered checking, if PBP even takes a look at the config fuses to see if the WDT is enabled and/or being used. I'm assuming that PBP ALWAYS inserts the instruction whether you are using the WDT or not. Better safe than sorry type thing. At any rate, I know from my experience that in projects where I'm not using the WDT at all, the DEFINE saves a load of program space... or at least a fair percentage.

    On the outside, this looks to take about the same or more Tcy's to run as my 2 serial loops. I'll compile the two different ways and look at the .lst to count the Tcy's and see which is faster.
    And they'll probably end up being close if not identical. About the only thing I know for sure is that the nested 'manual' for/next loop will save a few bytes of program space.

    I've made sure that all variables used in this loop process are in bank0. Can variables not used in the loop slow the loop down?
    I wouldn't think so, but...again, keeping everything you can in BANK0 can only speed things up a bit and save a few bytes here and there.

  5. #5
    Join Date
    Sep 2004
    Location
    montreal, canada
    Posts
    6,898



    But I don't know, and I've never bothered checking, if PBP even takes a look at the config fuses to see if the WDT is enabled and/or being used.
    Nope, otherwise they wouldn't have added DEFINE NO_CLRWDT 1 to their list. Besides, there are still two different ways to set the fuses, and not many users seem to use them anyway.


    REPEAT-UNTIL will execute faster than a FOR-NEXT and needs less code space. This reminds me of the following thread:
    http://www.picbasic.co.uk/forum/show...sharing&page=2

    Post 8 and++
    Steve

    It's not a bug, it's a random feature.
    There's no problem, only learning opportunities.

  6. #6
    Join Date
    Mar 2008
    Location
    Texas, USA
    Posts
    114



    Quote Originally Posted by mister_e View Post
    REPEAT-UNTIL will execute faster than a FOR-NEXT and needs less code space. This reminds me of the following thread:
    http://www.picbasic.co.uk/forum/show...sharing&page=2

    Post 8 and++
    Thanks. That's a good read. I've written some short programs to do the same kind of benchmarking. I didn't consider the WHILE-type loops because, although I know the loop constructs themselves are faster, adding a counter instruction back in seemed to offset any gain in total Tcy per pass. If the end-of-loop check does not depend on a sequential count, the WHILE-type statements would be faster, for sure. Is my understanding about this correct?
    No, I'm not Superman, but I did stay at a Holiday Inn Express last night!

  7. #7
    skimask Guest



    Quote Originally Posted by JD123 View Post
    Thanks. That's a good read.
    Just so I can remember where you're going with this...
    You're getting some data, saving it in an FRAM, then dumping it out to an MMC, right?
    But...you need it to dump inside of 16ms...

    We could go on optimizing this all day (week? month?), saving a couple of cycles here and there, trying to shave off those last couple of µs.
    I think in the end you'd be much better off switching over to an 18F4620 (or whatever 18Fxxxx suits you), kicking up the oscillator speed a bit, and taking advantage of the extra RAM in the 18F. Just going to 40 MHz gets you a theoretical 2.5x improvement (theoretical because you'll probably have to add NOPs at various points to keep the timings in spec), no BANKing issues will save another bunch of cycles, and so on. Gist of the story: that 20-ish ms could theoretically drop to a bit over 8 ms per chunk.
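The clock-speed argument can be put into rough numbers. The Python sketch below is a back-of-the-envelope estimate only. It assumes the transfer still costs 83,968 Tcy after the move (in practice the count would change with a different chip, added NOPs, and no bank switching), and that one Tcy is four oscillator clocks. `transfer_ms` and `min_f_osc_mhz` are hypothetical helper names.

```python
TCY_PER_SECTOR = 83_968  # measured cycle count from earlier in the thread


def transfer_ms(f_osc_mhz, tcy=TCY_PER_SECTOR):
    """Time in ms for one sector, assuming 4 oscillator clocks per Tcy."""
    return tcy * 4 / (f_osc_mhz * 1000)


def min_f_osc_mhz(target_ms, tcy=TCY_PER_SECTOR):
    """Oscillator frequency in MHz needed to finish within target_ms."""
    return tcy * 4 / (target_ms * 1000)


print(transfer_ms(16))    # 20.992
print(transfer_ms(40))    # 8.3968
print(min_f_osc_mhz(16))  # 20.992
```

Under those assumptions, 40 MHz gives a bit over 8 ms per sector, and anything from about 21 MHz up would already fit inside the 16 ms window mentioned above.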
