Since it's only a 16-bit number, maybe you'd be best off using a lookup table. After all, the integer square root of a 16-bit value can only be 0 to 255, so there are just 256 perfect squares to store...
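Here's a rough sketch of what I mean, assuming you want floor(sqrt(x)) and can spare 512 bytes for the table. The names `squares`, `init_squares` and `isqrt16` are just ones I made up for illustration, and the binary search is only one way to do the lookup.

```c
#include <stdint.h>

/* squares[i] holds i*i; 255*255 = 65025 still fits in 16 bits. */
static uint16_t squares[256];

/* Fill the table once at startup (could also be a const table in ROM). */
static void init_squares(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        squares[i] = (uint16_t)(i * i);
    }
}

/* Return floor(sqrt(x)) by binary-searching the table of squares.
 * At most 8 table reads, no multiplies or divides at lookup time. */
static uint8_t isqrt16(uint16_t x)
{
    uint8_t lo = 0;
    uint8_t hi = 255;

    while (lo < hi) {
        /* Round up so the loop always makes progress. */
        uint8_t mid = (uint8_t)((lo + hi + 1) / 2);
        if (squares[mid] <= x) {
            lo = mid;                    /* mid is still a valid root */
        } else {
            hi = (uint8_t)(mid - 1);     /* mid overshoots */
        }
    }
    return lo;
}
```

Call `init_squares()` once, then `isqrt16(x)` for each value. If you have RAM or ROM to burn, a direct 64K-entry table indexed by x would be faster still, but that's a very different memory budget.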
EDIT: Whoops! Just saw that the title said 16/32 bit sqrt instead of 16 bit only.
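But I'd say the same thing still applies. You could easily get the sqrt of the upper 16 bits of the 32-bit value from the lookup table and shift it left by 8 bits for a coarse result, since sqrt(hi * 65536) = 256 * sqrt(hi). Then, depending on the accuracy required, interpolate or refine using the lower 16 bits.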
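A minimal sketch of the 32-bit case, with some assumptions of mine: `sqrt16_lookup` is a hypothetical stand-in for whatever 16-bit table lookup you end up with (a plain loop here, just so the snippet compiles on its own), and instead of interpolating I fix up the low 8 bits of the result one bit at a time, which gives the exact floor(sqrt(x)). That works because the coarse value never overshoots and is at most 255 too small.

```c
#include <stdint.h>

/* Stand-in for the 16-bit table lookup: returns floor(sqrt(x)).
 * Replace with a real table-based routine in practice. */
static uint16_t sqrt16_lookup(uint16_t x)
{
    uint16_t r = 0;
    while ((uint32_t)(r + 1) * (uint32_t)(r + 1) <= x) {
        r++;
    }
    return r;
}

/* Return floor(sqrt(x)) for a 32-bit x. */
static uint16_t isqrt32(uint32_t x)
{
    /* Coarse estimate from the upper 16 bits:
     * sqrt(hi * 65536) = 256 * sqrt(hi), so shift the root left by 8. */
    uint16_t hi = (uint16_t)(x >> 16);
    uint32_t r = (uint32_t)sqrt16_lookup(hi) << 8;

    /* The true root lies in [r, r + 255], so only the low 8 bits
     * are unknown. Try each bit from the top down and keep it if
     * the square still fits under x. */
    for (uint32_t bit = 0x80; bit != 0; bit >>= 1) {
        uint32_t t = r | bit;
        if (t * t <= x) {   /* t <= 65535, so t*t fits in 32 bits */
            r = t;
        }
    }
    return (uint16_t)r;
}
```

That's 8 multiplies on top of the lookup. If you only need a rough answer, you could stop after fewer bits or skip the loop entirely and live with the coarse value.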