This is a tough one.

If the trouble is speed and timing, then using more MCU cycles might make it worse.

You can look at writing data to the on-board EEPROM.
http://melabs.com/resources/samples/pbp/eeword.htm
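
For what it's worth, here is a minimal PBP sketch of the idea, assuming a PIC with on-chip data EEPROM. The variable name and the addresses are made up for illustration, and I have not run this on your hardware. It stores a word as two bytes with WRITE and reads it back with READ:

```
' Sketch only: save a 16-bit value to on-chip EEPROM and read it back.
' "reading" and EEPROM addresses 0 and 1 are hypothetical; pick your own.
reading VAR WORD

reading = 1234

WRITE 0, reading.BYTE0   ' low byte goes to EEPROM address 0
WRITE 1, reading.BYTE1   ' high byte goes to EEPROM address 1

READ 0, reading.BYTE0    ' read the low byte back
READ 1, reading.BYTE1    ' read the high byte back
```

Keep in mind that each WRITE to the on-chip EEPROM is self-timed and takes on the order of milliseconds, so it is best kept out of any timing-critical loop.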