The HSEROUT command returns only once it has stuffed the last character into the USART's 2-byte transmit buffer. That means any string longer than 2 characters will involve some waiting. At the same baud rate it will still return sooner than a software serial routine like SEROUT, SEROUT2 or DEBUG, which keep the processor busy until the bitter end.
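
To put rough numbers on that, here is a small sketch in C (not PBP) that estimates how long each approach blocks. It assumes 10 bits per character (8N1), a transmitter that is idle and empty when the call starts, and no other overhead. The function names hw_block_us and sw_block_us are made up for this illustration.

```c
#include <stdint.h>

/* Rough estimate, in microseconds, of how long a blocking send of n
   characters keeps the CPU busy. Assumes 10 bits per character (8N1). */

/* Hardware USART with a 2-byte buffer: the first two characters are
   accepted at once, each further one waits about one character time. */
uint32_t hw_block_us(uint32_t n, uint32_t baud)
{
    uint32_t char_us = 10UL * 1000000UL / baud;  /* one character time */
    return (n > 2) ? (n - 2) * char_us : 0;
}

/* Bit-banged serial: the CPU clocks out every bit of every character. */
uint32_t sw_block_us(uint32_t n, uint32_t baud)
{
    uint32_t char_us = 10UL * 1000000UL / baud;
    return n * char_us;
}
```

With these assumptions, 10 characters at 9600 baud come out at roughly 8.3 ms of waiting for the hardware USART against roughly 10.4 ms for a software routine. The saving is always about two character times, so it matters most for short strings.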

Since the buffer size is fixed in hardware, there is no way of making it bigger, and assembler will be affected in just the same way. You can of course use interrupts to handle a bigger software buffer in the background.
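
The interrupt-driven software buffer could look something like the sketch below. It is written in C to show the idea only and would need translating to PBP or assembler for a real project. Everything in it is an assumption for illustration: the names tx_put and tx_isr, the 32-byte size, and the two uart_ helper functions, which are stand-ins for writing TXREG and switching the transmit interrupt enable bit on a real PIC.

```c
#include <stdint.h>
#include <stdbool.h>

#define TX_BUF_SIZE 32u              /* power of two, so masking wraps the index */

static volatile uint8_t tx_buf[TX_BUF_SIZE];
static volatile uint8_t tx_head;     /* next free slot, written by main code */
static volatile uint8_t tx_tail;     /* next byte to send, written by the ISR */

/* Hypothetical hardware hooks: stand-ins so the sketch compiles on a PC.
   On a real PIC these would write TXREG and set or clear TXIE. */
static uint8_t last_sent;
static bool    tx_irq_enabled;
static void uart_write_txreg(uint8_t c) { last_sent = c; }
static void uart_tx_irq_enable(bool on) { tx_irq_enabled = on; }

/* Queue one byte. Returns false instead of waiting when the buffer is full. */
bool tx_put(uint8_t c)
{
    uint8_t next = (uint8_t)((tx_head + 1u) & (TX_BUF_SIZE - 1u));
    if (next == tx_tail)
        return false;                /* full: one slot is always left unused */
    tx_buf[tx_head] = c;
    tx_head = next;
    uart_tx_irq_enable(true);        /* let the ISR start draining */
    return true;
}

/* Call from the "transmit register empty" interrupt. */
void tx_isr(void)
{
    if (tx_tail == tx_head) {        /* nothing left: stop the interrupt */
        uart_tx_irq_enable(false);
        return;
    }
    uart_write_txreg(tx_buf[tx_tail]);
    tx_tail = (uint8_t)((tx_tail + 1u) & (TX_BUF_SIZE - 1u));
}
```

The main program only ever calls tx_put, which returns immediately, and the interrupt feeds the hardware one byte at a time. With this layout the buffer holds at most 31 bytes, and the caller has to decide what to do when tx_put reports that it is full.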