Speeding up a loop?



achilles03
- 23rd April 2020, 18:01
I'm trying to generate multiple audio tones using a DAC/resistor ladder (currently 4-bit / 16 values; I'll scale up eventually), but for now I'm troubleshooting the speed of this loop for a single tone. Originally I was using a sine lookup table, but for multiple tones it appears to be too slow. I'm currently using a roughed-in binary version of Bhaskara I's sine approximation formula, which eliminates any lookups.

However, my current implementation is also slower than I'd like. I will scale up to a 20 MHz OSC at some point, but would like to get this loop under ~25 µs with a 4 MHz OSC if possible.

My question is: what instructions in the loop below are computationally expensive, and what changes could be suggested to speed it up? I'm not sure how much additional overhead I incur with word variable operations as opposed to byte variable operations...?

Thanks in advance!
Dave


define OSC 4

cmcon=7

timebyte var byte
x var byte
range var word
ampvar var byte
ampvar2 var byte
nvar var word
dvar var word
fvar var byte
negflip var byte

pausec1 con 25

TRISB = %00000000

timebyte=0

high porta.1

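loop1:
x=timebyte & 127              ' phase within the half cycle, 0..127
range=128-x
range=range*x                 ' x*(128-x), peaks at 4096 when x=64
nvar=range<<3                 ' numerator: 8*x*(128-x)
dvar=20480-range
dvar=dvar>>2                  ' denominator: (20480-x*(128-x))/4
fvar=nvar/dvar                ' word divide; half-wave magnitude, 0..8
ampvar=8+fvar                 ' positive half cycle, centred on 8
negflip=2*timebyte.7          ' 2 during the negative half cycle, else 0
ampvar2=ampvar-negflip*fvar   ' 8+fvar or 8-fvar
PORTB=ampvar2
timebyte=timebyte+1           ' byte wraps from 255 back to 0
goto loop1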
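
As a rough host-side check (run on a PC, not on the PIC), the integer arithmetic of the loop above can be simulated to see which values end up on PORTB. This is only a sketch: it assumes PBP word math behaves as unsigned 16-bit with truncating division and that byte variables keep the low 8 bits. The function name `loop_output` is made up for illustration.

```python
def loop_output(timebyte):
    """Mimic one pass of loop1 using 16-bit word / 8-bit byte truncation."""
    x = timebyte & 127                       # phase within the half cycle
    rng = ((128 - x) * x) & 0xFFFF           # x*(128-x), at most 4096
    nvar = (rng << 3) & 0xFFFF               # numerator
    dvar = ((20480 - rng) & 0xFFFF) >> 2     # denominator, never below 4096
    fvar = (nvar // dvar) & 0xFF             # truncating integer divide
    ampvar = (8 + fvar) & 0xFF
    negflip = 2 * ((timebyte >> 7) & 1)      # 2 in the negative half cycle
    return (ampvar - negflip * fvar) & 0xFF  # value written to PORTB


samples = [loop_output(t) for t in range(256)]
print(min(samples), max(samples))
```

Under those assumptions the output spans 0 to 16, and 16 is reached only at the crest (timebyte = 64). If the ladder uses just the low four bits of PORTB, that one sample would read as 0, so the amplitude or offset may need trimming by one step.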

tumbleweed
- 23rd April 2020, 18:27
There is no way that's anywhere near as fast as a lookup table.

In your post on the MPELABS forum you had:


lookbyte=timebyte // 32
ampvar=SINB[lookbyte]

The killer there is the '//' (remainder) operator, which needs a software division.

Did you try replacing that with richard's suggestion (ANDing timebyte with 31)?
You need to get rid of all multiplication- and division-related instructions.
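
A quick host-side sketch of why the AND trick is safe here: for an unsigned value and a power-of-two table length, the remainder and the bit mask give the same index. The table length of 24 in the second check is a made-up counterexample, not something from this thread.

```python
# For every possible byte value, "// 32" in PBP terms (remainder) equals "& 31".
same_for_32 = all((t % 32) == (t & 31) for t in range(256))

# The trick only works for power-of-two lengths: with a 24-entry table,
# masking with 23 does NOT give the remainder.
mismatches_for_24 = [t for t in range(256) if (t % 24) != (t & 23)]

print(same_for_32)                 # True
print(len(mismatches_for_24) > 0)  # True
```

So the mask is a drop-in replacement only as long as the table length stays at 32 (or another power of two).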

If you want to get < 25 µs with a 4 MHz clock, you only have 25 instruction cycles for the entire loop (at 4 MHz one instruction cycle takes 1 µs).
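
The budget can be worked out as below. This is a minimal sketch that only assumes the usual PIC rule of one instruction cycle per four oscillator clocks; the helper name `cycle_budget` is hypothetical.

```python
def cycle_budget(fosc_hz, loop_us):
    """Instruction cycles available in loop_us microseconds at clock fosc_hz."""
    cycles_per_second = fosc_hz // 4          # one instruction cycle = 4 clocks
    return cycles_per_second * loop_us // 1_000_000


print(cycle_budget(4_000_000, 25))    # 25
print(cycle_budget(20_000_000, 25))   # 125
```

Keep in mind that instructions which change the program counter (goto, call, return) take two cycles each, so the number of instructions that fit is somewhat lower than the cycle count.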

achilles03
- 23rd April 2020, 19:09
I did implement the "& 31" change, and that did significantly speed it up (output tone increased to ~2400 Hz, which is an order of magnitude increase and the // operation was indeed the main speed killer). I still have that on the back burner as a fallback if I can't get anything faster.

At this point I'm trying to figure out if it's possible to implement multiple "decent" audio tones up to 10 kHz on a 16Fxx architecture with a 20 MHz OSC. I've even considered compromising with a "less clean" sine approximation, such as a simple polynomial for the quarter wave (ex: =x*22/14-47/21249*x^2-7*x^3/(2^16) with x in binary radians has an error of about 1.7%, assuming I can implement it without being killed time-wise with DIV32).

Thanks,
Dave

richard
- 24th April 2020, 00:08
"(ex: =x*22/14-47/21249*x^2-7*x^3/(2^16) with x in binary radians has an error of about 1.7%, assuming I can implement it without being killed time-wise with DIV32)."

On an ESP8266 @ 160 MHz clock or an ESP32, sure. On a PIC16Fxx, never.

tumbleweed
- 24th April 2020, 11:11
"audio tones up to 10 kHz on a 16Fxx architecture with a 20 MHz OSC"

The 16Fxx isn't the best choice for that.

For comparison:
16Fxx @ 20MHz executes instructions at 5 MIPS,
16F1xxxx @ 32MHz -> 8 MIPS,
18F @ 64MHz -> 16 MIPS

Looking back at your original pseudo code with lookup table SINB array in ram:


loop1:
lookbyte=timebyte and 31
ampvar=SINB(lookbyte)
PORTB=ampvar
timebyte=timebyte+1
goto loop1


On a PIC18F @ 64 MHz this executes in 1 µs/loop (1 MHz update rate).
If you put the SINB table in ROM (which is more likely), it slows down to 1.25 µs/loop (800 kHz update rate).

I didn't do those evaluations using PBP, so I don't know if it's much slower than that.
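
To relate those update rates to an audible tone, here is a rough sketch. It assumes the index advances by one per loop and that the table holds one full sine period in 32 steps, as in the pseudo code above; the helper names are made up.

```python
def tone_hz(update_rate_hz, table_steps):
    """Output tone frequency when one table entry is written per loop."""
    return update_rate_hz / table_steps


def cycles_per_loop(mips, loop_us):
    """Instruction cycles one loop takes at the given MIPS rating."""
    return mips * loop_us


print(tone_hz(1_000_000, 32))     # table in RAM: 31250.0 Hz
print(tone_hz(800_000, 32))       # table in ROM: 25000.0 Hz
print(cycles_per_loop(16, 1.0))   # 16.0 cycles per loop
print(cycles_per_loop(16, 1.25))  # 20.0 cycles per loop
```

Stepping through the table by more than one entry per loop would raise the tone at the cost of fewer samples per period.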

richard
- 24th April 2020, 14:46
A 16F648A @ 20 MHz just passes 8 kHz with a 32-step lookup table in asm, driving a 4-bit R-2R ladder on PORTB.
My lookup values may be dodgy, I just guessed them; the result looks awful at any frequency.



#CONFIG __config _INTOSC_OSC_NOCLKOUT & _WDT_ON & _PWRTE_OFF & _MCLRE_ON & _BODEN_ON & _LVP_OFF & _DATA_CP_OFF & _CP_OFF
#ENDCONFIG

DEFINE OSC 20


inx var byte bank0
trisb=%11110000
goto overasm
asm

table
addwf PCL, F
retlw 8
retlw 9
retlw 10
retlw 11
retlw 12
retlw 13
retlw 14
retlw 14
retlw 15
retlw 14
retlw 14
retlw 13
retlw 12
retlw 11
retlw 10
retlw 9
retlw 8
retlw 6
retlw 5
retlw 4
retlw 3
retlw 2
retlw 1
retlw 1
retlw 0
retlw 1
retlw 1
retlw 2
retlw 3
retlw 4
retlw 5
retlw 6
_lu
movf _inx ,w
call table
MOVE?AB PORTB
RETURN
ENDASM
overasm:
inx=0
loopp:
CALL lu
inx=inx+1
inx =inx&31
goto loopp
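
Since the table values above were guessed, one way to get a cleaner set is to compute them on a PC and paste the result into the retlw table. The sketch below is only one possible choice: it assumes a full-scale sine centred on mid-scale and rounded to the nearest step, and the function name `sine_table` is made up.

```python
import math


def sine_table(steps=32, bits=4):
    """One sine period, scaled to 0..(2**bits - 1) and rounded to nearest."""
    full_scale = (1 << bits) - 1          # 15 for a 4-bit ladder
    mid = full_scale / 2                  # 7.5
    return [
        int(mid + mid * math.sin(2 * math.pi * i / steps) + 0.5)
        for i in range(steps)
    ]


table = sine_table()
for value in table:
    print(f"    retlw {value}")           # lines ready to paste into the asm table
```

With the defaults this gives 32 entries that stay within 0..15, so they fit the 4-bit ladder without touching the upper bits of PORTB.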

CuriousOne
- 25th April 2020, 09:00
There is another approach. Check this video.

https://www.youtube.com/watch?v=ophqt_RmiS0

Code in description.

Ioannis
- 26th April 2020, 18:27
This guy has done unbelievable things with low-end PICs, and also a color video game with multi-channel sound and VGA output on a PIC18F2550!
http://pic24.ru/doku.php/en/osa/articles/vga_game

Using an RTOS...

Ioannis

CuriousOne
- 27th April 2020, 06:40
Yeah, in his creations, 16F690 outperforms some guys with RPI :D

Ioannis
- 27th April 2020, 08:54
I dare to say that he is a Case Study! Wish I was proficient in C to run through his code.

Ioannis

richard
- 27th April 2020, 10:20
In my wildest imaginings I never expected such feats on such modest hardware from an RTOS.
Humbled yet again.
An RTOS: something else to learn. Where will it end?

Ioannis
- 27th April 2020, 14:26
Who knows really? Very exciting though!

Ioannis

CuriousOne
- 27th April 2020, 20:44
Well there is a color game with sound even on 16F628A, I've built it, but you need SCART input to use it, no VGA output...

Ioannis
- 27th April 2020, 21:17
I think you mean this one:

https://www.quinapalus.com/picsi.html

Ioannis