You need to account for the interrupt latency as well as the bit time, and adjust the timer reload value accordingly. Look at the generated listing to see how many instruction cycles the latency comes to. Here is another improvement:

Code:
BitTimer
      bcf    T4CON,2      ; turn Timer4 off
      movlw  0xBF         ; 104 us / 16 at 40 MHz = 1 bit time
      movwf  TMR4         ; load it
      bsf    T4CON,2      ; turn it back on
      bsf    STATUS,C     ; assume the bit is 1: set carry
      btfss  PORTB,0      ; skip the next instruction if PORTB.0 is high
      bcf    STATUS,C     ; PORTB.0 is low: clear carry
      rrcf   _rcv_byte,f  ; shift carry into rec'd byte
      decfsz _bit_cntr,f  ; decrement the bit counter, skip next if zero
      bra    Donebit      ; more bits to come: get out and wait for next int
      bra    Donebyte     ; got 8 bits
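
If you want to check the reload arithmetic off-chip, here is a rough Python sketch. It assumes a simple model in which the 8-bit timer fires 256 - reload ticks after it is loaded, which is what the 0xBF above implies, and that the latency you read off the listing is given in instruction cycles. The function name `timer_reload` and the 16-cycle latency are made up for illustration; use your own figure from the listing.

```python
def timer_reload(fosc_hz, prescaler, baud, latency_cycles=0):
    """Return an 8-bit timer reload value for one bit time.

    Assumed model: the timer counts up and fires 256 - reload ticks
    after loading. latency_cycles is the number of instruction cycles
    between the interrupt and the instruction that restarts the timer;
    it is an input you supply from the listing, not something measured here.
    """
    fcy_hz = fosc_hz // 4                          # instruction clock is Fosc/4
    ticks_per_bit = round(fcy_hz / (prescaler * baud))
    latency_ticks = round(latency_cycles / prescaler)
    # Fewer ticks are needed when part of the bit time was already
    # spent getting into the handler, so the reload value goes up.
    return (256 - ticks_per_bit + latency_ticks) & 0xFF


# 40 MHz clock, /16 prescaler, 9600 baud, no latency correction
print(hex(timer_reload(40_000_000, 16, 9600)))                     # 0xbf
# Same setup with a hypothetical 16-cycle latency (one timer tick)
print(hex(timer_reload(40_000_000, 16, 9600, latency_cycles=16)))  # 0xc0
```

With a /16 prescaler one timer tick is 16 instruction cycles, so any latency much smaller than that cannot be corrected through the reload value alone.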