I use that code with an optical encoder to convert from quadrature to up/down pulses and it tracks the encoder perfectly to >200k edges per second. I modified it to incorporate your counter but I might have messed something up.

Since my encoder doesn't stop on 01 or 10, I only need to check 11 and 00. Already 3 times faster than my original.
It might not have a detent at every quadrature state but it sure must "go thru" them.
If you are at 00 and go to 11, which direction did it turn? There's no way to know unless also decoding the 10/01 states in between the detents - and you are doing that.
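
To make the point concrete, here is a minimal sketch of a table-driven decoder in Python (not the code discussed in this thread, just an illustration). The names `STEP` and `decode` are made up for the example, and which rotation counts as +1 is an assumption. It shows that a jump straight from 00 to 11 carries no direction information, while going through 01 or 10 does.

```python
# Transition table for a 2-bit quadrature signal, indexed by (previous << 2) | current.
# +1 / -1 = one step in either direction, 0 = no change,
# None = both bits changed at once (a state was skipped, direction unknown).
# Which physical direction is +1 is an assumption; swap A and B to reverse it.
STEP = [
    0,    +1,   -1,   None,   # previous 00 -> 00, 01, 10, 11
    -1,   0,    None, +1,     # previous 01
    +1,   None, 0,    -1,     # previous 10
    None, -1,   +1,   0,      # previous 11
]

def decode(samples):
    """Return (count, skipped) for a list of sampled 2-bit states."""
    count = 0
    skipped = 0
    prev = samples[0]
    for cur in samples[1:]:
        step = STEP[(prev << 2) | cur]
        if step is None:
            skipped += 1      # cannot tell which way the encoder turned
        else:
            count += step
        prev = cur
    return count, skipped

print(decode([0b00, 0b01, 0b11]))  # via 01: two steps one way
print(decode([0b00, 0b10, 0b11]))  # via 10: two steps the other way
print(decode([0b00, 0b11]))        # direct jump: nothing can be counted
```

The first two calls end in the same 11 state but give opposite counts, which is exactly the information lost if only 00 and 11 are looked at.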

Also, don't forget that your LCDoutput subroutine takes a considerable amount of time. You cannot allow the encoder to pass thru two (or more) states without reading it, or it WILL miss pulses or even count backwards.
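
A rough way to see the effect is to simulate a loop that only gets to read the pins once every few states. This is a sketch under assumptions: `SEQUENCE` and `polled_count` are hypothetical names, the encoder turns steadily in one direction, and the state order shown is taken as the "forward" direction.

```python
# Assumed Gray-code order of the states for one direction of rotation.
SEQUENCE = [0b00, 0b01, 0b11, 0b10]

def polled_count(true_steps, states_per_poll):
    """Count seen by a loop that reads the pins once every `states_per_poll` states."""
    count = 0
    prev = SEQUENCE[0]
    for step in range(states_per_poll, true_steps + 1, states_per_poll):
        cur = SEQUENCE[step % 4]          # what the pins show at this poll
        delta = (SEQUENCE.index(cur) - SEQUENCE.index(prev)) % 4
        if delta == 1:
            count += 1
        elif delta == 3:
            count -= 1                    # three steps forward look like one step back
        # delta == 2: both bits changed, direction unknown, nothing counted
        prev = cur
    return count

for n in (1, 2, 3):
    print(n, polled_count(12, n))
```

With 12 real steps forward, reading every state gives 12, reading every second state gives 0 (missed pulses), and reading every third state gives -4 (counting backwards).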

Take your logic analyzer capture and measure the time between the rising edge of channel A and the following rising edge of channel B.
The pulsewidth of channel A is 11ms but the time between the edges on A and B is much shorter than 5.5ms because the phase shift on your encoder is far from 90°. Let's say it's 2ms... How long does your LCDoutput subroutine take?
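
The arithmetic can be sketched like this. The 11ms pulsewidth is from the capture; the 2ms gap is only the guess from above, the 50% duty cycle is an assumption, and the loop times in the usage lines are made-up placeholders, not measurements of the LCDoutput subroutine.

```python
def phase_shift_degrees(edge_gap_ms, pulse_width_ms):
    """Phase shift between A and B, from the gap between their rising edges."""
    # Assumes a 50% duty cycle, so one full period is twice the pulsewidth.
    period_ms = 2 * pulse_width_ms
    return 360.0 * edge_gap_ms / period_ms

def loop_is_fast_enough(loop_time_ms, edge_gap_ms):
    """True if the pins are read at least once within the shortest state."""
    return loop_time_ms < edge_gap_ms

print(phase_shift_degrees(5.5, 11))        # ideal encoder: 90 degrees
print(round(phase_shift_degrees(2, 11), 1))  # with the guessed 2ms gap
print(loop_is_fast_enough(0.5, 2))         # hypothetical 0.5ms loop
print(loop_is_fast_enough(5.0, 2))         # hypothetical 5ms loop
```

So at this rotation speed the whole loop, LCD update included, has to finish in less than the measured gap, not in less than the 5.5ms an ideal 90° encoder would allow.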