9x2x8 lcd "command delays" per text line is probably the worst delay
your snippet lacks that exact detail


the st7920 chip has an spi interface that's a little faster and uses fewer pins too
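
for reference, a rough sketch of how one byte gets framed in the st7920's serial mode, going by my reading of the datasheet: each command or data byte goes out as three bytes (a sync byte carrying the RW and RS bits, then the high nibble, then the low nibble, each nibble padded with zeros). the function name `st7920_serial_frame` is made up for illustration, and this only builds the bytes; it does not drive any pins or add the per-command delay.

```python
def st7920_serial_frame(value: int, is_data: bool) -> bytes:
    """Build the 3-byte serial frame for one ST7920 command or data byte.

    Hypothetical helper for illustration; write-only (RW is always 0).
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    # Sync byte layout: 1 1 1 1 1 RW RS 0
    # RW = 0 means write; RS = 1 means data, RS = 0 means command.
    sync = 0xF8 | (0x02 if is_data else 0x00)
    high = value & 0xF0         # bits 7..4, followed by four 0 bits
    low = (value << 4) & 0xF0   # bits 3..0, followed by four 0 bits
    return bytes([sync, high, low])


if __name__ == "__main__":
    # command 0x30 and data byte 0x41 ('A'), printed as hex
    print(st7920_serial_frame(0x30, is_data=False).hex())
    print(st7920_serial_frame(0x41, is_data=True).hex())
```

so every byte costs 24 clocks on the wire, and the controller still needs its execution time after each command, so the delay question doesn't go away just by switching to serial.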

there...