It's probably already on your mind or in the 'round file', but here goes:
A single TX (a PIC16F628A or whatever, as long as it's got a built-in UART) and a few receivers (also 16F628As, same reasoning), each one driving 8 or so outputs hooked up to MOC3012s (opto-isolated random-phase triac drivers). You still have to sync with the mains zero crossing (if you're dealing with mains power), but that could probably be handled at the TX, since I would think the zero crossing at the TX would happen at pretty much the same time as at all the receivers (unless they were a few miles down the road or something).
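To make the idea a bit more concrete, here's a rough sketch in plain C of what the TX could send over the UART and how each receiver could decide whether a frame is meant for it. The 4-byte frame layout (sync byte, receiver address, 8-bit output mask, XOR checksum), the 0xA5 sync value, and the names frame_build and frame_accept are all made up for illustration; nothing here is specific to the PIC's UART registers, so treat it as a starting point and not a finished protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout (an assumption, not from any datasheet):
 *   byte 0: sync marker
 *   byte 1: receiver address
 *   byte 2: output mask, one bit per triac driver (8 outputs)
 *   byte 3: XOR of bytes 0..2, as a cheap integrity check
 */
#define FRAME_SYNC 0xA5u
#define FRAME_LEN  4u

/* TX side: fill in a frame for one receiver. */
void frame_build(uint8_t addr, uint8_t outputs, uint8_t frame[FRAME_LEN])
{
    assert(frame != NULL);
    frame[0] = FRAME_SYNC;
    frame[1] = addr;
    frame[2] = outputs;
    frame[3] = (uint8_t)(frame[0] ^ frame[1] ^ frame[2]);
}

/* Receiver side: returns 1 and stores the output mask if the frame is
 * intact and addressed to my_addr; returns 0 and leaves *outputs alone
 * otherwise. */
int frame_accept(const uint8_t frame[FRAME_LEN], uint8_t my_addr,
                 uint8_t *outputs)
{
    assert(frame != NULL && outputs != NULL);
    if (frame[0] != FRAME_SYNC) {
        return 0;                       /* not the start of a frame */
    }
    if ((uint8_t)(frame[0] ^ frame[1] ^ frame[2]) != frame[3]) {
        return 0;                       /* corrupted on the wire */
    }
    if (frame[1] != my_addr) {
        return 0;                       /* meant for another receiver */
    }
    *outputs = frame[2];
    return 1;
}
```

On the real hardware the TX would presumably kick off a frame right after it sees a zero crossing, and each receiver would copy the accepted mask to the port driving its MOC3012s. Since those are random-phase parts, when the outputs get switched relative to the zero crossing is up to the firmware, so the timing of the send matters.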