Maybe in your tight loop you could read all 8 bits of the port at once, store them in a temp_port byte, and process that byte later. The added delay is minimal.

Of course, that way you will miss fast double clicks, but as you said, you can tolerate this.
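
Roughly what I mean, as a minimal sketch in C. Only temp_port comes from my description; sim_port, last_port, sample_port() and process_snapshot() are made-up names for illustration, and sim_port just stands in for whatever your real port register is. I am also assuming the buttons read 1 when pressed; invert the logic if yours are active low.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware port register.
   On a real target, read the actual port register instead. */
static volatile uint8_t sim_port = 0;

/* Snapshot taken in the tight loop, processed later. */
static uint8_t temp_port = 0;

/* Previous snapshot, used to detect changes (assumed helper state). */
static uint8_t last_port = 0;

/* Fast part: one byte read captures all 8 pins at the same instant. */
static void sample_port(void)
{
    temp_port = sim_port;
}

/* Slow part: returns a mask of the pins that went from 0 to 1
   between the previous processed snapshot and the current one.
   Assumes active-high buttons. */
static uint8_t process_snapshot(void)
{
    uint8_t rising = (uint8_t)(temp_port & (uint8_t)~last_port);
    last_port = temp_port;
    return rising;
}
```

Call sample_port() inside the time-critical loop and process_snapshot() whenever you have time to spare. Anything that changes and changes back between two samples is simply not seen, which is the missed double click case.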

The best solution is interrupts, as Steve stated.

Ioannis