USB data corruption



Charlie
- 22nd March 2012, 01:56
This is a bit of a shot in the dark...

I've put together an application around an 18F2550 that receives serial data packets (RS-232ish), then stores them in a serial EEPROM. Asynchronously, the EEPROM is read over USB to retrieve packets from the memory (it's an HID application). The UART data is protected by a checksum and seems to be stored correctly, and I can read it back correctly via the USB port. UART packets arrive at about 5 per minute. The high-level USB application checks for updates every 10 seconds or so. About once an hour, the USB application gets corrupted data.

As you can imagine, finding the source of the corruption is extremely tedious! I'm using Darrel's DS_INTS-18.pbp, and ReEnterPBP-18.bas. I've tried disabling the serial port during USB activity as these packets are highly redundant and a loss or two won't matter. The error rate goes down, but does not disappear.
(Unfortunately, the USB application is not under my control, so I need to get the data there error free).

There are over a thousand lines of code, which is a bit much to drop in here. Instead, I'm wondering if anybody has any "off the top of the head" ideas of where to look for corruption sources.

Like I said, shot in the dark... I'm just running out of ideas.

Demon
- 22nd March 2012, 02:52
Ok, what about a sleep problem?

Does the PC take a nap and forget to wake up the USB port? Or is your PIC taking a nap and not waking up, or losing important info just as it is being awakened? Maybe the awakening process is taking too long for the USB connection to be maintained properly?

Robert

EDIT: What if you shorten the sleep process to a minute? Start with the PC, and then the PIC later if it has a sleep.

Charlie
- 22nd March 2012, 09:45
Good thought, Robert. But I'm actually not using SLEEP. However I am using the Watchdog - I'll try disabling it to be sure it's not masking a problem somehow.

tumbleweed
- 22nd March 2012, 10:37
"I've tried disabling the serial port during USB activity as these packets are highly redundant and a loss or two won't matter."
That may not help. What would happen if you get a USB request while you're in the middle of getting the UART packet?

Charlie
- 22nd March 2012, 12:24
"That may not help. What would happen if you get a USB request while you're in the middle of getting the UART packet?"
The UART packet gets corrupted; the checksum and/or byte count are declared bad; and the packet gets tossed away (there'll be another along in a few seconds). I'm collecting environmental data, so the occasional lost packet is no big deal. Missing readings are fine; bad data is not!
Today I figured out a way to capture an error as it happens, which means I now have a way to chase it, instead of staring at the screen waiting for that one time per hour or so when things go badly.
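
For readers following along, the "checksum and/or byte count" test described above could look something like the sketch below. The thread never gives the packet format, so the layout here (length byte, payload, 8-bit additive checksum) and the name packet_is_valid are assumptions made up for illustration, and it is written in C rather than PBP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout (an assumption, not from the original post):
 *   byte 0      : payload length N
 *   bytes 1..N  : payload
 *   byte N+1    : checksum = (sum of bytes 0..N) modulo 256
 */
bool packet_is_valid(const uint8_t *pkt, size_t received)
{
    assert(pkt != NULL);

    if (received < 2) {
        return false;               /* need at least length + checksum */
    }

    size_t n = pkt[0];
    if (received != n + 2) {
        return false;               /* byte count does not match the header */
    }

    uint8_t sum = 0;
    for (size_t i = 0; i <= n; i++) {
        sum = (uint8_t)(sum + pkt[i]);   /* 8-bit additive checksum */
    }
    return sum == pkt[n + 1];
}
```

A check like this only vouches for the bytes at the moment it runs. Anything that overwrites the data afterwards goes unnoticed.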

Charlie
- 23rd March 2012, 10:58
Well, things have been running happily for a couple hours, so I thought I'd post the solution and possibly help others that get bit by this.

By cranking up how quickly UART packets are received (the sender is dumping 10X faster), and watching what gets sent or stored, I tracked the problem back to the UART routine. Even though packets were received correctly and passed the validity tests, they were sometimes getting stored in a variable with errors. Makes no sense, right?

Well, incoming UART packets are stored in an elastic buffer. They are written to an array until an "end of transmission" packet is received, then the number of bytes is checked, the checksum calculated, etc. etc.
When the PIC is off dealing with USB, or any number of other things, it is possible for the array to grow beyond its defined size.

When you write to an array beyond its defined size, strange things happen. It's almost as if the array provides a window on memory, and writing outside the range moves the window. No errors, no crashes, just bad values at the beginning of the array, and a few fewer hairs on the head of the coder. Doubling the size of the array seems to have resolved the issue.
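
Doubling the array makes the overrun less likely, but a bounds check on every write rules it out regardless of how long the PIC is busy elsewhere. Here is a minimal sketch of that idea in C (not PBP, and not the code from this project); RX_BUF_SIZE, rx_buffer, rx_reset and rx_put are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 32          /* hypothetical size; pick one that fits the packets */

typedef struct {
    uint8_t data[RX_BUF_SIZE];
    size_t  count;              /* number of bytes stored so far */
    bool    overflowed;         /* set when a byte had to be refused */
} rx_buffer;

/* Start a fresh packet. */
void rx_reset(rx_buffer *b)
{
    assert(b != NULL);
    b->count = 0;
    b->overflowed = false;
}

/* Store one received byte. Refuses the write (and flags the buffer)
 * instead of writing past the end of the array. */
bool rx_put(rx_buffer *b, uint8_t byte)
{
    assert(b != NULL);
    if (b->count >= RX_BUF_SIZE) {
        b->overflowed = true;
        return false;
    }
    b->data[b->count++] = byte;
    return true;
}
```

With something like this, the receive routine can toss any packet whose buffer has overflowed set and call rx_reset, which fits the "missing readings are fine" requirement.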

Now to clean up the mess I made of the code while chasing this one... On the plus side, I did uncover a couple other bugs I might have missed without this snipe hunt.

pedja089
- 23rd March 2012, 12:33
PBP doesn't check array bounds, so you must be careful with that...

Demon
- 23rd March 2012, 13:34
"... It's almost as if the array provides a window on memory..."


Not almost, it definitely does.

When you write beyond an array's allocated space, you could be overwriting your own variables, system variables, or if you're lucky, unused space.
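
To picture this, here is a toy model in C. It treats RAM as one flat byte array and an unchecked indexed write as "base address plus index". The addresses and names (ram, BUF_BASE, BUF_LEN, NEXT_VAR, unchecked_write) are invented for the illustration; where PBP actually places variables on an 18F2550 depends on the compiler and the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_SIZE 16
#define BUF_BASE 4                      /* hypothetical address of an 8-byte array */
#define BUF_LEN  8
#define NEXT_VAR (BUF_BASE + BUF_LEN)   /* some other variable placed right after it */

static uint8_t ram[RAM_SIZE];           /* toy model of data memory */

/* An indexed write with no bounds check: the index is simply added to
 * the array's base address, whatever happens to live at the result. */
void unchecked_write(size_t index, uint8_t value)
{
    assert(BUF_BASE + index < RAM_SIZE);    /* keeps the model itself in range */
    ram[BUF_BASE + index] = value;
}
```

In this model, unchecked_write(7, x) lands in the last element of the array, while unchecked_write(8, x) silently changes the byte at NEXT_VAR, with no error and no crash.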

Robert