Bootloader code generally needs to be very basic, so there is little room for mistakes.
My biggest bootloader is around 1.2K, the smallest is 278 bytes.
I usually use the area above 64K to store the received HEX. The main code verifies the download (checks the checksum, etc.) and only then calls the bootloader, which just copies from 64K+ down to 0-64K.
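A rough sketch of that split, runnable on a PC, might look like the code below. It is not the actual bootloader: the two arrays stand in for the flash regions, `APP_SIZE` is kept tiny, the additive `checksum16` is only an assumption (the checksum type is not specified here), and all names are made up for illustration. On real hardware the copy loop would be replaced by the device's flash erase and write sequence.

```c
#include <stdint.h>
#include <stddef.h>

/* Tiny size so the sketch runs on a host; a real image would be up to 64K. */
#define APP_SIZE 64u

/* Simulated flash. On a real MCU these would be fixed address ranges:
   the staging area above 64K and the application area at 0-64K. */
uint8_t staging_area[APP_SIZE];
uint8_t app_area[APP_SIZE];

/* Simple additive 16-bit checksum (assumed; any checksum or CRC would do). */
uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Main-code side: all the checking happens here, before the bootloader runs. */
int image_is_valid(const uint8_t *image, size_t len, uint16_t expected)
{
    return checksum16(image, len) == expected;
}

/* Bootloader side: nothing but a copy, so there is very little to get wrong. */
void bootloader_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}

/* Returns 1 if the application area was updated, 0 if the image was rejected. */
int try_update(uint16_t expected)
{
    if (!image_is_valid(staging_area, APP_SIZE, expected)) {
        return 0; /* bad download: the running application is left untouched */
    }
    bootloader_copy(app_area, staging_area, APP_SIZE);
    return 1;
}
```

The point of the split is that a bad download is caught while the old application is still intact, and the part that actually overwrites code stays small enough to review line by line.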
I have also used external I2C and SPI flash to store the hex.
Comms are handled by the main code: sometimes I download the hex over FTP, sometimes over serial or Bluetooth (which looks the same as serial)...
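Whatever the transport, the main code can check each received record before storing it. The sketch below assumes the file is in Intel HEX format (the "HEX" above may be a different format, such as TI-TXT) and that line endings have already been stripped. `hex_nibble` and `hex_record_ok` are hypothetical helper names. In Intel HEX, all bytes of a record, including the trailing checksum byte, sum to zero modulo 256.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Check one Intel HEX record of the form ":LLAAAATT<data>CC".
   Returns 1 if the record is well formed and its checksum is correct, else 0.
   The line must not contain a trailing CR or LF. */
int hex_record_ok(const char *line)
{
    size_t len = strlen(line);

    /* Shortest record is ':' plus 5 bytes (LL, AAAA, TT, CC) = 11 characters,
       and the hex digits after ':' must come in pairs. */
    if (len < 11 || line[0] != ':' || (len - 1) % 2 != 0) {
        return 0;
    }

    size_t nbytes = (len - 1) / 2;
    uint8_t count = 0;
    uint8_t sum = 0;

    for (size_t i = 0; i < nbytes; i++) {
        int hi = hex_nibble(line[1 + 2 * i]);
        int lo = hex_nibble(line[2 + 2 * i]);
        if (hi < 0 || lo < 0) {
            return 0; /* non-hex character in the record */
        }
        uint8_t b = (uint8_t)((hi << 4) | lo);
        if (i == 0) {
            count = b; /* LL: number of data bytes in this record */
        }
        sum = (uint8_t)(sum + b);
    }

    /* Total bytes = LL field + 2 address bytes + type + data + checksum. */
    if (nbytes != (size_t)count + 5) {
        return 0;
    }
    return sum == 0;
}
```

For example, the standard end-of-file record `:00000001FF` passes this check, while `:00000001FE` is rejected. A per-record check like this only catches transfer errors line by line, so a checksum over the whole stored image is still worth doing before the bootloader is called.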