Parsing Strings...


  1. #1
    Join Date
    Feb 2009
    Posts
    41

    Parsing Strings...

    Hi Everyone,

    I have written a big chunk of code and it is working well, but I am stuck on something having to do with strings.

    I have received a string from a serial device which contains two target strings separated by a comma. The data looks like this:

    1432,765

    the length may vary like this:

    22567,8345

    so the comma is key as a locator. With what little documentation is in the user's manual, I can't figure out how to parse the string for these two numbers.

    Thanks in advance!
    TR

  2. #2
    Join Date
    Nov 2003
    Location
    Wellton, U.S.A.
    Posts
    5,924



    Look in the manual under SERIN2.

    Look for the part something like:
    WAIT STR\n followed by optional end character...for the first part and then use something like WAIT[,] to get the last part.
    Dave
    Always wear safety glasses while programming.

  3. #3
    Join Date
    Nov 2005
    Location
    Bombay, India
    Posts
    947



    Can you clarify what you mean when you say "I have received a string from a serial device"? Does it mean it has been stored somewhere and you have to parse it from there? SERIN2 will not do that for you.

  4. #4
    Join Date
    Feb 2009
    Posts
    41


    Thanks...

    I appreciate all of your help!

    I had written the code using HSERIN to capture the raw data from my serial port into a single string of [10]. It works, so now I need to separate the two values. So...

    14575,354 would end up as two longs,

    one with 14575 and another with 354. Either number could be as large as 99999 or as small as 1.

    Thanks in advance,
    TR

  5. #5
    Join Date
    Nov 2003
    Location
    Wellton, U.S.A.
    Posts
    5,924



    HSERIN uses the same syntax as SERIN2, so do the "separation" as the data is coming in.
    Dave

  6. #6
    Join Date
    Nov 2005
    Location
    Bombay, India
    Posts
    947



    OK, since you have the string to be parsed, you will need a parser too! I don't have anything ready at hand, but I'll give it a shot anyway. UNTESTED CODE FOLLOWS

    Code:
    ' assume the string is TOPARSE, a 10-byte array of ASCII characters
    ' it will return you 2 numbers, one from before the comma, one from after the comma
    Number1 var Long
    Number2 var Long
    Cntr    var byte      ' counter to index the string

    ParseNumber:
     Number1 = 0              ' start with 0 in both numbers
     Number2 = 0
     'collect Number1
     for Cntr = 0 to 9         ' a 10-place string is indexed 0 to 9
        if TOPARSE[Cntr] <> "," then
             Number1 = Number1*10                      ' x10 to make place for the new digit
             Number1 = Number1 + TOPARSE[Cntr] - "0"   ' ASCII digit to value; use "0", since ' starts a comment
        else
             goto GetNum2     ' collect the remainder as Number2
        endif
     next
     return
    GetNum2:
     for Cntr = Cntr+1 to 9
        if (TOPARSE[Cntr] < "0") or (TOPARSE[Cntr] > "9") then
             return                ' a non-digit (CR, LF, padding) ends the second number
        endif
        Number2 = Number2*10
        Number2 = Number2 + TOPARSE[Cntr] - "0"
     next
     return
    You could modify this code to do a repetitive parse. Every time you call it, it would return you a number, but I'll leave that to you.
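
For anyone who wants to check the parsing logic on a PC before loading it onto the chip, the routine above can be modelled in a few lines of Python. This is only a sketch of the same digit-accumulation idea, not PICBASIC code. The name `parse_pair` is made up for illustration, and the sketch assumes the first field holds only digits and that the second number ends at the first non-digit.

```python
def parse_pair(buf: bytes) -> tuple:
    """Model of the two-loop parser: digits before the comma, then digits after."""
    number1 = 0
    number2 = 0
    i = 0
    # First loop: accumulate digits until the comma is found.
    while i < len(buf) and buf[i] != ord(","):
        number1 = number1 * 10 + (buf[i] - ord("0"))
        i += 1
    i += 1  # step past the comma
    # Second loop: accumulate digits until a non-digit (CR, LF, padding) or the end.
    while i < len(buf) and ord("0") <= buf[i] <= ord("9"):
        number2 = number2 * 10 + (buf[i] - ord("0"))
        i += 1
    return number1, number2


print(parse_pair(b"1432,765"))       # (1432, 765)
print(parse_pair(b"14575,354\r\n"))  # (14575, 354)
```

Multiplying by 10 before adding each digit is what lets the same loop handle a field of any length, which is the point of parsing after the fact.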

    Good luck.
    Last edited by Jerson; - 11th February 2009 at 16:12. Reason: GetNum2: for Cntr=Cntr+1 to 10

  7. #7
    Join Date
    Nov 2003
    Location
    Wellton, U.S.A.
    Posts
    5,924



    I still think separating the numbers on the way in is the way to go. Although Jerson's code is pretty slick.

    Here is what I was thinking. Just tested it using SERIN2 as the chip I have on the bench is not set up for HSERIN right now.

    The first WAIT is to keep garbage out.
    Code:
    N1	VAR	LONG
    N2	VAR	LONG
    LOOP:
    SERIN2 PORTD.2, 16416, [WAIT("X"),DEC N1,WAIT(","),DEC N2]
    GOTO DISPLAY
    
    DISPLAY:
    Serout2 PORTC.6, 16416, [ DEC N1, $d, $a]
    Serout2 PORTC.6, 16416, [ DEC N2, $d, $a]
    GOTO LOOP
    If you need the data in an array just modify the above with STR...
    Last edited by mackrackit; - 11th February 2009 at 15:09.
    Dave

  8. #8
    Join Date
    Feb 2009
    Posts
    41


    Hmmm...

    Both are great suggestions, which I appreciate! I'm pondering....

  9. #9
    Join Date
    Feb 2009
    Posts
    41


    No success...

    I've tried dozens of ideas along those lines and no solution works....

    Like:
    hserIN 65535, oops, [STR smallp\5, WAIT(","), STR largep\5\13]

    The problem here is that the string size I expect to receive is variable, but STR forces you to put in the string length. If the data is shorter, the function doesn't return anything, yet it doesn't time out. I'm ending up with the comma in my first string, even though I'm only looking for it as a separator, and after I receive the first serial data, every time thereafter there is a non-ASCII character in the first byte of smallp.

    If the data was always the same length it would be a breeze...

  10. #10
    Join Date
    Nov 2003
    Location
    Wellton, U.S.A.
    Posts
    5,924



    From the manual:
    STR followed by a byte array variable, count and optional ending
    character will receive a string of characters. The string length is
    determined by the count or when the optional character is
    encountered in the input.
    [STR smallp\10\,]
    See if that will get the first part correctly.

    Does the data coming in have a qualifying character before the data you want? If so, then the first example I gave will work. If it does not have a qualifying character, then leave that part out.
    Dave

  11. #11
    Join Date
    Feb 2009
    Posts
    41



    I appreciate your help! Very confused

    hserIN 65535, oops, [STR smallp\10\","]

    The first time through I get the correct first part of the string. The device was sent 777,53 and smallp received 777.

    The second time and every time thereafter the data is skewed. For example if the next packet that comes in is 766,59 smallp receives a 53766. The 53 from the last packet sent....

  12. #12
    Join Date
    Sep 2005
    Location
    Campbell, CA
    Posts
    1,107



    It would seem to be a lot easier to capture the whole sequence, then later scan the array for any commas and deal with it that way, especially since when you are using SERIN you have very little processing time available between characters.
    Charles Linquist

  13. #13
    Join Date
    Feb 2009
    Posts
    41



    I think that's a great suggestion and I agree. I have been unable to successfully parse it after the fact.

  14. #14
    Join Date
    Feb 2009
    Posts
    41



    And to make matters worse, I am unable to consistently and correctly receive serial packets.

  15. #15
    Join Date
    Feb 2009
    Posts
    41



    Well, I got the serial function working reliably by doing this:

    hserIN 65535, oops, [STR smallp\12\10]

    rather than this:

    hserIN 65535, oops, [STR smallp\12\13]

    Apparently the linefeed was the last character sent, so by terminating on the carriage return, the linefeed that followed was polluting the serial stream for the next packet.
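
The effect described in this post can be reproduced on a PC with a small Python model. It is a sketch under stated assumptions, not a description of how HSERIN works internally: `read_until` is a made-up helper that takes bytes from a waiting stream until it meets the end character or has stored `count` bytes, and it assumes the end character is consumed but not stored.

```python
def read_until(stream: bytearray, count: int, end_char: int) -> bytes:
    """Take bytes from the front of stream until end_char or count bytes are stored.

    The end character is consumed but not stored; everything after it stays
    in the stream for the next call.
    """
    out = bytearray()
    while stream and len(out) < count:
        b = stream.pop(0)
        if b == end_char:
            break
        out.append(b)
    return bytes(out)


CR, LF = 13, 10

# Two packets, each terminated with CR LF, waiting in the receive stream.
stream = bytearray(b"777,53\r\n766,59\r\n")
first = read_until(stream, 12, CR)    # stops at CR, leaving the LF behind
second = read_until(stream, 12, CR)   # begins with the stale LF
print(first, second)                  # b'777,53' b'\n766,59'

stream = bytearray(b"777,53\r\n766,59\r\n")
first_lf = read_until(stream, 12, LF)   # consumes the whole line, LF included
second_lf = read_until(stream, 12, LF)  # starts cleanly at the next packet
print(first_lf, second_lf)              # b'777,53\r' b'766,59\r'
```

Terminating on the line feed keeps the packets aligned, at the cost of a trailing carriage return (13) in the captured data, which the parser then has to skip.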

  16. #16
    Join Date
    Feb 2009
    Posts
    41



    So now that I'm just looking at a parsing issue I've changed the serial in code to:

    hserIN 65535, oops, [STR smallp\5, WAIT(","), STR largep\5\10]

    It doesn't work as expected. When receiving 1480,95...

    STR smallp\5 actually captures 5 characters including the comma even though it's the "wait for" character. I can't make smallp\4 because sometimes the data received will be 5 characters. It could also be 3 or 2 or 1 characters.

  17. #17
    Join Date
    Feb 2009
    Location
    Southern California
    Posts
    86



    Why don't you try it this way
    hserIN 65535, oops, [STR smallp\5\","]
    hserIN 65535, oops, [STR largep\5\13]

    then you have the values in 2 streams, each left justified-

    David

  18. #18
    Join Date
    Aug 2006
    Location
    Look, behind you.
    Posts
    2,818


    Second post

    Wow, second post and you are already helping . . . WELCOME TO THE FORUM DAVID!
    If you do not believe in MAGIC, Consider how currency has value simply by printing it, and is then traded for real assets.
    .
    Gold is the money of kings, silver is the money of gentlemen, barter is the money of peasants - but debt is the money of slaves
    .
    There simply is no "Happy Spam" If you do it you will disappear from this forum.

  19. #19
    Join Date
    Feb 2009
    Posts
    41



    Hi David,

    That would almost work. The problem is that neither string is guaranteed to be exactly 5 characters. So if the sent data was 1499,136 I would actually get "1499," with the comma included; likewise, if the data is only 3 characters, like 210,152, I would actually get "210,1".

    confusing!!

  20. #20
    Join Date
    Feb 2009
    Location
    Southern California
    Posts
    86



    That is what the optional end character is good for. By setting the end character to a ",", the command will take up to, but not necessarily, 5 characters, stopping when it gets the "," or, in the second line, the line feed. I put a 13 where it should be 10; as you mentioned earlier, the line feed comes after the carriage return. So, assuming you reset the arrays to all 0s before each update, if you sent the stream
    1499,136(CR)(LF)
    then
    hserIN 65535, oops, [STR smallp\5\","]
    would give you an array with "1"-"4"-"9"-"9"-0 in smallp and
    hserIN 65535, oops, [STR largep\5\10]
    would give you "1"-"3"-"6"-13-0 in largep

    You would then just ignore the 13, or re-zero it, when you do your parsing.


    Thanks for the welcome Joe, I've been in the forum reading all week, mostly older threads. I'm finally sort of caught up on what is still relevant.

  21. #21
    Join Date
    Feb 2009
    Posts
    41



    That was the way I read the manual as well. I thought that as soon as the function saw a character meeting the WAIT criteria it would terminate collecting any more characters. In fact what I'm seeing is that it continues to gather all 5 characters, so it collects the comma as well if the string is less than 5!

    The manual says the string length is determined by the count OR when the optional character is encountered. Unfortunately that sentence can be interpreted two ways.

  22. #22
    Join Date
    Nov 2003
    Location
    Wellton, U.S.A.
    Posts
    5,924



    Hi,
    Played with this some more...
    Without a qualifier for some reason the buffer keeps the last data sent and returns it with the next. I guess you do not have a qualifier at the beginning of the string so use the LF or CR at the end.

    This is what I am sending
    123456789,987654321
    and I have my terminal set to terminate the line with a CR LF.
    The working code
    ###############
    Code:
    '18F6680   02/14/09  INFEED PARSE TEST BAUD 9600
        DEFINE OSC 20
        @ __CONFIG    _CONFIG1H, _OSC_HS_1H
        @ __CONFIG    _CONFIG2H, _WDT_ON_2H & _WDTPS_128_2H
        @ __CONFIG    _CONFIG4L, _LVP_OFF_4L
        DEFINE LCD_DREG     PORTG
        DEFINE LCD_DBIT     0
        DEFINE LCD_RSREG    PORTE
        DEFINE LCD_RSBIT    0
        DEFINE LCD_EREG     PORTE
        DEFINE LCD_EBIT     1
        DEFINE LCD_BITS     4
        DEFINE LCD_LINES    4
        DEFINE LCD_COMMANDUS    3000
        DEFINE LCD_DATAUS   150
      '###############################################
        PAUSE 100 : LCDOUT $FE,1,"TEST"
        N1 VAR LONG:N2 VAR LONG
        START: N1 = 0 : N2 = 0
        HIGH PORTG.4 :PAUSE 250:LOW PORTG.4
        RCSTA.4 = 0 : RCSTA.4 = 1
        'CHANGE LINE FEED AND CARRIAGE RETURN AS REQUIRED
        RCSTA=$90:TXSTA=$24:SPBRG=129:HSERIN[WAIT($a),WAIT($d),DEC N1,WAIT(","),DEC N2]
        LCDOUT $FE,1,DEC N1 : LCDOUT $FE,$C0,DEC N2 : GOTO START
    Dave

  23. #23
    Join Date
    Feb 2009
    Posts
    41



    Hi Dave,

    I think your latest solution might have worked, but I had already got it working by parsing it after the fact, based on Jerson's parse routine. I really appreciate everyone's input. Here is what the working code looks like:

    hserIN 65535, oops, [STR inputdata\12\10] 'get the serial data, ending at the line feed
    gosub ParseNumber 'split the raw data into 2 numbers

    ParseNumber: ' collect both pieces of the data in separate arrays
        for Cntr = 0 to 10
            if inputdata[Cntr] <> "," then
                smallp[Cntr] = inputdata[Cntr] ' no comma yet, so still in the first number
            else
                Cntr2 = Cntr ' marks where the comma was
                goto GetNum2 ' found the comma; now get the second number
            endif
        next
        return

    GetNum2:
        Cntr = Cntr + 1 ' get past the comma
        for Cntr = Cntr to 10
            if (inputdata[Cntr] < "0") or (inputdata[Cntr] > "9") then
                return ' anything other than a digit ends the number
            endif
            largep[Cntr-Cntr2-1] = inputdata[Cntr] ' it's a valid digit
        next
        return

    I had to modify the array pointer for the second number by subtracting the point at which the comma was found on the first pass from its pointer. In other words, the pointer into the input data can't be used directly to index the second number.
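
The index adjustment described here can be sketched in Python for checking on a PC. This is a model of the idea rather than the PICBASIC routine itself: `split_fields` and its `width` parameter are hypothetical names, and the sketch assumes each field is at most 5 digits, as stated earlier in the thread.

```python
def split_fields(inputdata: bytes, width: int = 5) -> tuple:
    """Copy the digits before and after the comma into two zero-filled arrays."""
    smallp = bytearray(width)
    largep = bytearray(width)
    comma = inputdata.index(b",")           # plays the role of Cntr2
    for cntr in range(comma):               # first field: indices line up directly
        smallp[cntr] = inputdata[cntr]
    for cntr in range(comma + 1, len(inputdata)):
        if not (ord("0") <= inputdata[cntr] <= ord("9")):
            break                           # CR, LF, or padding ends the number
        largep[cntr - comma - 1] = inputdata[cntr]   # re-base the index to 0
    return bytes(smallp), bytes(largep)


print(split_fields(b"1499,136\r\n"))  # (b'1499\x00', b'136\x00\x00')
```

Both arrays start zeroed on every call here, so a short field never shows digits left over from an earlier, longer packet. The PICBASIC version would presumably need to clear smallp and largep itself before each parse to get the same effect.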

  24. #24
    Join Date
    Aug 2006
    Location
    Look, behind you.
    Posts
    2,818



    Proof again: there is more than 1 right way to do almost everything.
