I do not know if there is a 16-bit version of the chip, but I wonder why you need more steps (meaning more MCU-power-hungry code).

Aren't the 256 steps smooth enough?

Ioannis