I have done what you describe. The "shifting" was done using SEROUT2 and SERIN2. The "master" PIC sent out 4 bytes of data (in my case) and the "slave" took the bytes
and loaded them into a buffer. The data was followed by a checksum byte. If the checksum matched, the data was written to the port pins. I used a second wire between the two PICs as an ACK line to indicate a successful transmission (a LOW pulse was proof that the data got to the slave). The data rate was 19.2 kbaud. Crystals were used on both parts. I used open-collector drive on both lines with 2K pull-ups. The PICs were 6 feet or so apart.
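
The framing logic can be sketched roughly as follows. This is Python used as a model of the master and slave behaviour, not the PIC code itself. The checksum type is an assumption here (a simple 8-bit sum of the data bytes), and the function names are made up for illustration.

```python
PAYLOAD_LEN = 4  # four data bytes per transfer, as described above


def checksum(payload: bytes) -> int:
    """8-bit additive checksum. ASSUMPTION: the actual checksum type is not stated."""
    return sum(payload) & 0xFF


def build_frame(payload: bytes) -> bytes:
    """Master side: the data bytes followed by one checksum byte."""
    return payload + bytes([checksum(payload)])


def receive_frame(frame: bytes):
    """Slave side: return (payload, ack).

    ack is True when the checksum matches, which is the point where the
    slave would write the data to its port pins and pulse the ACK line LOW.
    """
    if len(frame) != PAYLOAD_LEN + 1:
        return None, False
    payload, received = frame[:PAYLOAD_LEN], frame[PAYLOAD_LEN]
    if checksum(payload) != received:
        return None, False  # bad frame: no ACK pulse, port pins unchanged
    return payload, True


# Example transfer: 0x12 + 0x34 + 0x56 + 0x78 = 0x114, so the checksum byte is 0x14.
frame = build_frame(bytes([0x12, 0x34, 0x56, 0x78]))
payload, ack = receive_frame(frame)
```

If no ACK pulse is seen, the master can simply resend the same frame. How many retries to allow and how long to wait for the pulse are design choices this sketch leaves open.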