Writing to external static RAM will be plenty fast. You have 87 µs between bytes, which is over 400 processor cycles.

The problem you will have with a simple parallel device is that you probably don't have enough pins to do the addressing directly. If you had 3.5 free ports, you could use two for the address, one for the data, and 2 or 3 bits for the handshaking. An address latch scheme would let you do everything with 14 bits or so, but that would take some '374s (or equivalent).
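
To make the latch idea concrete, here is a rough sketch of the write sequence, simulated in plain C so it can be run on a PC. Everything in it is an assumption for illustration: a 64K RAM, two '374-style latches holding the low and high address bytes, one 8-bit port shared between address and data, and made-up names (Board, ram_write and so on). On real hardware each helper would be a port write or a strobe pulse instead of a struct assignment.

```c
#include <stdint.h>

/* Simulated board: one shared 8-bit bus, two address latches, 64K RAM. */
typedef struct {
    uint8_t bus;          /* the 8-bit port shared by address and data */
    uint8_t addr_lo;      /* output of the first latch (A0..A7)  */
    uint8_t addr_hi;      /* output of the second latch (A8..A15) */
    uint8_t ram[65536];   /* the external static RAM */
} Board;

/* Clock pulse on the low-address latch: capture whatever is on the bus. */
static void strobe_latch_lo(Board *b) { b->addr_lo = b->bus; }

/* Clock pulse on the high-address latch. */
static void strobe_latch_hi(Board *b) { b->addr_hi = b->bus; }

/* Write-enable pulse: RAM stores the bus value at the latched address. */
static void strobe_write(Board *b) {
    uint16_t addr = (uint16_t)((b->addr_hi << 8) | b->addr_lo);
    b->ram[addr] = b->bus;
}

/* Full write cycle: low address, high address, then data. */
static void ram_write(Board *b, uint16_t addr, uint8_t data) {
    b->bus = (uint8_t)(addr & 0xFF);
    strobe_latch_lo(b);
    b->bus = (uint8_t)(addr >> 8);
    strobe_latch_hi(b);
    b->bus = data;
    strobe_write(b);
}
```

In this particular sketch the pin cost is the 8 bus lines plus three strobes. A real design would likely need a few more lines (chip select, output enable, read strobe), which is in line with the "14 bits or so" estimate.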

Or you could set up the address with a shift register.
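
Here is a minimal sketch of the shift register approach, again simulated in plain C rather than written for a specific chip. It assumes a 16-bit serial-in, parallel-out register with a separate output latch (two cascaded 8-bit parts would behave this way), and the names ShiftReg16 and set_address are hypothetical. On a real micro, sr_clock_bit would set the data pin and pulse the clock pin, and sr_latch would pulse the latch pin.

```c
#include <stdint.h>

/* Simulated 16-bit serial-in, parallel-out shift register with output latch. */
typedef struct {
    uint16_t shift;    /* internal shift stages */
    uint16_t latched;  /* outputs driving the RAM address lines */
} ShiftReg16;

/* One shift clock edge: existing bits move up, the new bit enters at bit 0. */
static void sr_clock_bit(ShiftReg16 *sr, int bit) {
    sr->shift = (uint16_t)((sr->shift << 1) | (bit & 1));
}

/* Latch pulse: copy the shift stages to the output pins. */
static void sr_latch(ShiftReg16 *sr) {
    sr->latched = sr->shift;
}

/* Shift a 16-bit address out MSB first, then latch it onto the outputs. */
static void set_address(ShiftReg16 *sr, uint16_t addr) {
    for (int i = 15; i >= 0; i--) {
        sr_clock_bit(sr, (addr >> i) & 1);
    }
    sr_latch(sr);
}
```

This costs only three pins for the address (data, clock, latch) at the price of 16 clock pulses per address. Whether that fits comfortably in the 400-odd cycles between bytes depends on how tight the bit loop is on your processor.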