Since the 628 does not have enough outputs, a serial-to-parallel chip may help you. Say, 5 x 4094 (40 outputs in total).
You feed the data serially to the SIPO (serial in, parallel out) chips and they drive your LEDs directly.
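To give an idea of the serial feeding, here is a rough sketch in C, not tested on real hardware. It assumes the five 4094s are cascaded (serial output of one into the data input of the next) and share the clock and strobe lines. The names set_data, pulse_clock, pulse_strobe, sipo_write and sipo_outputs are made up for illustration. The three pin helpers below only drive a small software model of the chain so the logic can be checked on a PC; on the PIC you would replace their bodies with writes to whichever port pins you choose.

```c
#include <stdint.h>

#define NUM_CHIPS 5   /* 5 x 4094 = 40 outputs */
#define CHAIN_MASK ((((uint64_t)1) << (8 * NUM_CHIPS)) - 1)

/* Software model of the cascaded chain (assumption, for PC testing only). */
static uint64_t shift_stage;  /* bit 0 = Q1 of the first chip */
static uint64_t latched;      /* what the output pins would show */
static uint8_t  data_pin;

/* Hypothetical pin helpers: on real hardware these become port writes. */
static void set_data(uint8_t level) { data_pin = level & 1u; }

static void pulse_clock(void)
{
    /* Each clock pulse moves every bit one stage further down the chain. */
    shift_stage = ((shift_stage << 1) | data_pin) & CHAIN_MASK;
}

static void pulse_strobe(void)
{
    /* Strobe copies the shift stages to the output latches. */
    latched = shift_stage;
}

/* Send one frame: frame[0] ends up on the chip nearest the PIC. */
static void sipo_write(const uint8_t frame[NUM_CHIPS])
{
    /* The first bit sent travels farthest, so start with the last chip. */
    for (int chip = NUM_CHIPS - 1; chip >= 0; chip--) {
        for (int bit = 7; bit >= 0; bit--) {
            set_data((uint8_t)((frame[chip] >> bit) & 1u));
            pulse_clock();
        }
    }
    pulse_strobe();  /* update all 40 outputs at once */
}

/* Read back the modelled outputs of one chip (bit 0 = Q1, bit 7 = Q8). */
static uint8_t sipo_outputs(int chip)
{
    return (uint8_t)(latched >> (8 * chip));
}
```

Only three PIC pins are needed this way (data, clock, strobe), and because the strobe is pulsed after all 40 bits are in, the LEDs do not flicker while the data is shifting.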
As for the software, I see no other solution than a lookup table. Maybe others will come up with another idea.
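A lookup table could look something like the sketch below, again in C. The patterns are placeholders I made up, since the real ones depend on how your LEDs are wired and what you want to show. PATTERNS, FRAME_BYTES and pattern_for are hypothetical names. Each row holds the 5 bytes for the 5 chips, and an out-of-range value falls back to "all off".

```c
#include <stdint.h>

#define FRAME_BYTES 5   /* one byte per 4094 */

/* Placeholder patterns (assumption): replace with your own LED layout. */
static const uint8_t PATTERNS[][FRAME_BYTES] = {
    {0x00, 0x00, 0x00, 0x00, 0x00},  /* 0: all LEDs off */
    {0x01, 0x00, 0x00, 0x00, 0x00},  /* 1: first LED only */
    {0xFF, 0x00, 0x00, 0x00, 0x00},  /* 2: all 8 LEDs of the first chip */
    {0xFF, 0xFF, 0xFF, 0xFF, 0xFF},  /* 3: all 40 LEDs on */
};

#define NUM_PATTERNS (sizeof PATTERNS / sizeof PATTERNS[0])

/* Return the 5-byte row for a value; unknown values give row 0 (all off). */
static const uint8_t *pattern_for(uint8_t value)
{
    if (value >= NUM_PATTERNS) {
        value = 0;
    }
    return PATTERNS[value];
}
```

The row returned by pattern_for can then be shifted out to the chips byte by byte. Keeping the table const should let the compiler place it in program memory rather than in the limited RAM.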
Ioannis