PicBasic text compress routine



  1. #1
    Join Date
    Jun 2004
    Posts
    43

    Hello! I am working on a project in which I need to store acquired results in a serial EEPROM. The problem is that I need a lot of memory space, and I don't want to use cascaded EEPROMs or flash cards, so I thought about a text compression routine. Can anyone help me with this? Thank you.

  2. #2
    Join Date
    Nov 2005
    Location
    Perth, Australia
    Posts
    429


    I've got a feeling that the PBP code required to compress and decompress text would be awfully long. I hope you have a lot of program space left on your PIC.
    Last edited by Kamikaze47; - 25th November 2005 at 06:54.

  3. #3
    Join Date
    Jun 2004
    Posts
    43


    I know, that's why I use only 1k of 8k program memory.

  4. #4
    tdavis80 Guest


    Any chance that the data contains often-used words?

    For example, if you were storing text with lots of day, month and year information, you could parse the data so that each of those words is stored as a single byte. If you used byte values above 127 as 'compressed' words, you could encode up to 128 different words.

    Tuesday, November 1, 2005 = 25 bytes
    [128], [129] 1, [130] = 9 bytes

    The above example assumes that all of the received text has ASCII values below 128, so that the upper half of the byte range is free for compressed words.

    Pseudocode:

    if ascii of character < 128 then
        character is normal
    else
        table_position = character - 127 ' example: 128 - 127 = 1
        lookup table_position to find the uncompressed word
    endif
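
    The lookup idea above can be sketched as follows. This is Python rather than PBP, so it can be tried on a PC before porting; the three-entry word table and the function names are made up for illustration. Note one difference from the pseudocode: the table here is 0-based, so the index is byte - 128 rather than byte - 127.

```python
# Hypothetical word table: entry 0 is stored as byte 128, entry 1 as 129, ...
# Real data would need its own table of up to 128 frequent words.
WORDS = ["Tuesday", "November", "2005"]


def compress(text: str) -> bytes:
    """Replace table words with single bytes >= 128; copy everything else."""
    out = bytearray()
    i = 0
    while i < len(text):
        for idx, word in enumerate(WORDS):
            if text.startswith(word, i):
                out.append(128 + idx)      # one byte stands for the whole word
                i += len(word)
                break
        else:
            code = ord(text[i])
            if code > 127:
                raise ValueError("input must be 7-bit ASCII")
            out.append(code)               # ordinary character, stored as-is
            i += 1
    return bytes(out)


def decompress(data: bytes) -> str:
    """Expand bytes >= 128 back into their table words."""
    parts = []
    for b in data:
        if b < 128:
            parts.append(chr(b))
        else:
            parts.append(WORDS[b - 128])   # 0-based table lookup
    return "".join(parts)
```

    With this table, "Tuesday, November 1, 2005" comes out as 9 bytes, because the commas, spaces and the "1" are still stored as ordinary characters.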

    Does this make sense?

  5. #5
    Join Date
    Jun 2004
    Posts
    43


    My whole character set is: 0123456789;=? I think the most frequently repeated character is 0, so I thought about this scheme:

    0 - 0
    00 - c
    000 - k
    0000 - M
    and so on. Do you have any ideas?
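
    For reference, a rough sketch of that substitution scheme, in Python so it can be tested on a PC. It uses only the four mappings listed above; the codes for longer runs are not given in the post, so this sketch simply splits longer runs into chunks of at most four zeros. The function names are invented for illustration.

```python
# Only the mappings given in the post: run length -> stored character.
RUN_CODES = {1: "0", 2: "c", 3: "k", 4: "M"}
# Reverse mapping for expansion ("0" already stands for itself).
EXPAND = {"c": "00", "k": "000", "M": "0000"}


def compress_zeros(text: str) -> str:
    """Replace each run of zeros with letter codes, 4 zeros at a time."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "0":
            run = 0
            while i + run < len(text) and text[i + run] == "0":
                run += 1                   # measure the whole run of zeros
            i += run
            while run > 0:
                n = min(run, 4)            # longest code available is 4 zeros
                out.append(RUN_CODES[n])
                run -= n
        else:
            out.append(text[i])            # non-zero characters are unchanged
            i += 1
    return "".join(out)


def expand_zeros(text: str) -> str:
    """Undo compress_zeros; works because c, k, M are not in the data set."""
    return "".join(EXPAND.get(ch, ch) for ch in text)
```

    This only saves space when zeros actually repeat; a string with no double zeros stays the same length.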

  6. #6
    Join Date
    Oct 2004
    Location
    Italy
    Posts
    695


    Hi,

    - What is the possible range of the values?

    Example:

    0.00 (First possible value)
    0.01
    0.99
    ...
    ...
    199.99 (Last possible value).


    - How many stored values do you need?

    - What is the available space in the EEPROM? (bytes)


    Luciano

  7. #7
    Join Date
    Feb 2003
    Posts
    432


    Quote: Originally posted by MegaADY
    My whole character set is : 0123456789;=? And I think the most repeated character is 0 . So I thought about this scheme:

    0 - 0
    00 - c
    000 - k
    0000 - M
    and so on .. do you have any idea ?

    Very difficult to say without seeing more examples of your data.

    The method described by Luciano will give you a guaranteed 2:1 compression irrespective of what the data is. If you need to compress the data more than that, and you have instances where characters (not just 0) repeat for two or more consecutive places, then how about this for an idea?

    As you say, you only have a character set of 13 characters and Luciano has already pointed out that you could fit two of your characters into a single byte.
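
    Two characters fit in one byte because 13 symbols need only 4 bits each. A possible sketch in Python, for trying out on a PC: it assumes (this is not stated in the thread) that each character is stored as the low nibble of its ASCII code, which happens to be unique for all 13 characters, and that nibble 0xA, which none of them uses, pads odd-length strings. The function names are invented.

```python
CHARSET = "0123456789;=?"   # the 13 characters from the thread
PAD = 0x0A                  # low nibble of ':', unused by the set (assumed pad)


def pack(text: str) -> bytes:
    """Store two characters per byte, high nibble first."""
    nibbles = []
    for ch in text:
        if ch not in CHARSET:
            raise ValueError("character outside the 13-symbol set: %r" % ch)
        nibbles.append(ord(ch) & 0x0F)     # '0'..'9' -> 0..9, ';' -> B, etc.
    if len(nibbles) % 2:
        nibbles.append(PAD)                # fill the last half byte
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def unpack(data: bytes) -> str:
    """Rebuild the text; all 13 characters live in ASCII 0x30..0x3F."""
    chars = []
    for b in data:
        for nib in (b >> 4, b & 0x0F):
            if nib != PAD:
                chars.append(chr(0x30 | nib))
    return "".join(chars)
```

    For example, the 11 characters of 08700900200 pack into 6 bytes, the last half byte being padding.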

    My idea is similar, but one half of the byte contains the code for the character and the other half holds the repeat count for that character, e.g.

    nnnncccc where n = number of repeats and c = character

    08700900200 (11 bytes) would be 10 18 17 20 19 20 12 20 = 8 bytes = about 73% of the original size

    however

    22377777773354500000661666 (26 bytes) would be
    22 13 77 23 15 14 15 50 26 11 36 = 11 bytes = 42%

    Obviously you can't have a count greater than 15, but if you had, say, twenty-five zeros, they would compress as F0 A0 = 25 characters into two bytes = 8%

    As I say, compression depends on the data, but that is true of any compression routine. Worst case, your 40 characters take 40 bytes. Best case, your 40 characters are all identical, e.g. 40 x 2, and would compress to three bytes:
    F2 F2 A2 = 15 x 2 + 15 x 2 + 10 x 2 = 40 x 2

    You need to analyse your data for repeating patterns to see if that would help.

    Easy to encode:
    Get the first character and set nnnn = 1.
    While the next character is the same and nnnn < 15, increase nnnn.
    Store the byte, then get the next character and set nnnn = 1 again.

    To decode each byte:
    For x = 1 to nnnn
    output character cccc
    Next x
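
    The encode and decode steps above can be sketched like this. It is Python rather than PBP, meant for checking the numbers on a PC. One assumption not spelled out in the post: the character code cccc is taken as the low nibble of the ASCII code, which matches the hex examples for the digits and also gives ; = ? the codes B, D and F. Function names are invented.

```python
CHARSET = "0123456789;=?"   # the 13 characters from the thread


def rle_encode(text: str) -> bytes:
    """One byte per run: high nibble = count (1..15), low nibble = character."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch not in CHARSET:
            raise ValueError("character outside the 13-symbol set: %r" % ch)
        count = 1
        # Extend the run while the next character matches and nnnn < 15.
        while i + count < len(text) and text[i + count] == ch and count < 15:
            count += 1
        out.append((count << 4) | (ord(ch) & 0x0F))
        i += count
    return bytes(out)


def rle_decode(data: bytes) -> str:
    """Repeat each character (low nibble) count (high nibble) times."""
    return "".join(chr(0x30 | (b & 0x0F)) * (b >> 4) for b in data)


if __name__ == "__main__":
    for sample in ("08700900200", "22377777773354500000661666", "0" * 25):
        packed = rle_encode(sample)
        print(sample, "->", packed.hex(" "), "(%d bytes)" % len(packed))
```

    The worst case is data with no repeats at all, where every character still costs a full byte, so plain two-per-byte packing would do better there.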

    Thinking a bit further, this method could also be used for a larger character set, e.g. A to Z (upper case only) is 26 characters, so each could be stored in 5 bits, leaving the remaining 3 bits as a counter. That means up to 7 consecutive identical letters could be compressed into a single byte.
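
    A sketch of that 3-bit count plus 5-bit letter variant, again in Python for trying on a PC. The bit layout (count in the top three bits, letter in the low five) and the letter codes A = 0 to Z = 25 are assumptions made for this example; the post does not fix either. Function names are invented.

```python
def rle_encode_az(text: str) -> bytes:
    """One byte per run: top 3 bits = count (1..7), low 5 bits = letter."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if not "A" <= ch <= "Z":
            raise ValueError("only upper-case A to Z is supported: %r" % ch)
        count = 1
        # A 3-bit counter can hold at most 7, so longer runs are split.
        while i + count < len(text) and text[i + count] == ch and count < 7:
            count += 1
        out.append((count << 5) | (ord(ch) - ord("A")))
        i += count
    return bytes(out)


def rle_decode_az(data: bytes) -> str:
    """Repeat each letter (low 5 bits) count (top 3 bits) times."""
    return "".join(chr(ord("A") + (b & 0x1F)) * (b >> 5) for b in data)
```

    The trade-off against the 4-bit version is a shorter maximum run (7 instead of 15) in exchange for a character set twice as large.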

    Thanks for making me think about this... it's given me an idea for something I am already working on.

    Regards
    Keith

    www.diyha.co.uk
    www.kat5.tv
