Speed optimization (framebuffer scrolling)



HenrikOlsson
- 9th October 2022, 17:18
I'm working on my MAX7219 display code and I'm in optimization mode. Have managed to get from 22ms to 6.5ms for redrawing a screen of 81 matrices (at 2MHz SPI clock) and I don't think I can squeeze more performance out of that particular section of the code at this moment. Next in line is the scrolling of the framebuffer, here's the code for it:


NumberOfDisplays CON 36
FrameBufferSize CON NumberOfDisplays * 10 + 20

FrameBuffer VAR BYTE[FrameBufferSize]
Row VAR BYTE
Col VAR WORD
Offset VAR WORD
FrameBufferByte VAR BYTE
MAX7219_Value VAR BYTE


ScrollLeft:
' Shifts content of the screenbuffer one column to the left. It does not redraw the display.
' Input: None
' Output: None
' This version: 3.5ms for 36 displays @64MHz

    FOR Col = 0 TO NumberOfDisplays + 1     ' 8 columns on left and right side, outside of display area
        FOR Row = 0 TO 9                    ' One invisible row top and bottom of display area to facilitate vertical scrolling
            Offset = Row * (NumberOfDisplays + 2) + Col
            MAX7219_Value = FrameBuffer[Offset] >> 1

            ' As long as the next byte is NOT the last byte in the row:
            IF Col < (NumberOfDisplays + 1) THEN
                FrameBufferByte = FrameBuffer[Offset + 1]
                MAX7219_Value.7 = FrameBufferByte.0
            ENDIF

            FrameBuffer[Offset] = MAX7219_Value
        NEXT
    NEXT
RETURN

Any ideas on how to make this run faster?
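
To experiment with the routine away from the hardware, the loop above can be modelled on a PC. The following is a minimal Python sketch (not PBP), assuming the same row-major layout as the posted code: 10 rows of NumberOfDisplays + 2 bytes each. The names scroll_left and STRIDE are made up for this illustration.

```python
NUMBER_OF_DISPLAYS = 36
ROWS = 10                              # 8 visible rows plus one hidden row top and bottom
STRIDE = NUMBER_OF_DISPLAYS + 2        # bytes per row, including one hidden byte at each end

def scroll_left(fb):
    """Shift every row by one pixel, mirroring the posted PBP loop."""
    for col in range(STRIDE):
        for row in range(ROWS):
            offset = row * STRIDE + col
            value = fb[offset] >> 1
            # Pull bit 0 of the next byte into bit 7, except for the last byte of a row.
            if col < STRIDE - 1 and fb[offset + 1] & 0x01:
                value |= 0x80
            fb[offset] = value

fb = bytearray(STRIDE * ROWS)          # 380 bytes, same as FrameBufferSize
fb[1] = 0x01                           # one lit pixel in bit 0 of byte 1, row 0
scroll_left(fb)
print(hex(fb[0]), hex(fb[1]))          # the pixel has moved into bit 7 of byte 0
```

A model like this is only useful for checking the bit movement; it says nothing about execution time on the PIC.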

tumbleweed
- 9th October 2022, 23:55
Without getting into asm or anything, how about this (untested):


OFFSET_INCR CON NumberOfDisplays + 2
ScrollLeft:
' Shifts content of the screenbuffer one column to the left. It does not redraw the display.
' Input: None
' Output: None
' This version: 1.30ms for 36 displays @64MHz

    ' do for Col < NumberOfDisplays+1
    FOR Col = 0 TO NumberOfDisplays         ' 8 columns on left and right side, outside of display area
        Offset = Col
        FOR Row = 0 TO 9                    ' One invisible row top and bottom of display area to facilitate vertical scrolling
            'Offset = Row * (NumberOfDisplays + 2) + Col
            MAX7219_Value = FrameBuffer[Offset] >> 1

            FrameBufferByte = FrameBuffer[Offset + 1]
            MAX7219_Value.7 = FrameBufferByte.0

            FrameBuffer[Offset] = MAX7219_Value
            Offset = Offset + OFFSET_INCR   ' basically Row * (NumberOfDisplays + 2)
        NEXT
    NEXT

    ' Last column: there is no next byte to borrow a bit from
    Col = NumberOfDisplays + 1              ' 8 columns on left and right side, outside of display area
    Offset = Col
    FOR Row = 0 TO 9                        ' One invisible row top and bottom of display area to facilitate vertical scrolling
        'Offset = Row * (NumberOfDisplays + 2) + Col
        MAX7219_Value = FrameBuffer[Offset] >> 1
        FrameBuffer[Offset] = MAX7219_Value
        Offset = Offset + OFFSET_INCR       ' basically Row * (NumberOfDisplays + 2)
    NEXT
RETURN


That cuts it down to 1.3ms
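
The saving comes from two changes: the per-byte multiply is replaced by a running offset that grows by one row stride each pass, and the IF test is removed from the inner loop by handling the last column separately. As a quick sanity check, the Python sketch below (not PBP; the function names are invented for illustration) confirms that the running offset visits exactly the same indices as the multiply.

```python
NUMBER_OF_DISPLAYS = 36
ROWS = 10
OFFSET_INCR = NUMBER_OF_DISPLAYS + 2   # row stride, computed once

def offsets_by_multiply(col):
    # Original form: one multiply and one add per byte.
    return [row * OFFSET_INCR + col for row in range(ROWS)]

def offsets_by_increment(col):
    # Rewritten form: start at the column and add the stride for each row.
    result = []
    offset = col
    for _ in range(ROWS):
        result.append(offset)
        offset += OFFSET_INCR
    return result

same = all(offsets_by_multiply(c) == offsets_by_increment(c) for c in range(OFFSET_INCR))
print(same)
```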

richard
- 10th October 2022, 00:41
try some asm

untested but should be close
bcnt var byte bank 1
dcnt var byte bank 1
dspbuff var byte[380] ;36 displays + 2 outside of display * 10


rotate_r: ' pic18
asm
banksel _bcnt ;ROW
movlw 10 ;10 rows
movwf _bcnt
clrf FSR0H
movlw low (_dspbuff)
movwf FSR0L
movlw 123 ;BUFFER SIZE low byte, low(380-1) = 123. 36 displays + 2 outside of display * 10-1
ADDWF FSR0L,F
movlw high (_dspbuff)
ADDWFC FSR0H
movlw 1 ;BUFFER SIZE high byte (380)
ADDWF FSR0H
banksel _bcnt ;note place bcnt and dcnt in same bank
MOVLW 38 ;36 displays + 2 outside of display
NROW
movwf _dcnt
bcf STATUS, C
Ncol
rrcf POSTDEC0 ,f ;PER COLUMN
DECFSZ _dcnt ,F
BRA Ncol
BNC NBNC
bsf PLUSW0,7 ;max displays would be 127 to use plusw
NBNC
DECFSZ _bcnt ,F
BRA NROW
banksel 0
endasm














this is my 4 panel code for reference





rotate: ;pic18
asm
banksel _bcnt ;ROW
movlw 8
movwf _bcnt
clrf FSR0H
movlw low (_dspbuff)
movwf FSR0L
movlw 31 ;BUFFER SIZE
ADDWF FSR0L,F
movlw high (_dspbuff)
ADDWFC FSR0H
banksel _bcnt
MOVLW 4
NROW
bcf STATUS, C
rrcf POSTDEC0 ,f ;PER COLUMN
rrcf POSTDEC0 ,f ;PER COLUMN
rrcf POSTDEC0 ,f ;PER COLUMN
rrcf POSTDEC0 ,f ;PER COLUMN
BNC NBNC
bsf PLUSW0,7
NBNC
DECFSZ _bcnt ,F
BRA NROW
endasm
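
Both routines rely on a chain of rotate-through-carry instructions: each rrcf pushes bit 0 of one byte into the carry flag, and the next rrcf pulls that carry into bit 7 of the byte below it in memory. The Python sketch below is a rough model of one row, assuming rrcf behaves as documented for PIC18 (carry into bit 7, bit 0 out to carry). The helper names are made up for this illustration.

```python
def rrcf(value, carry):
    """Model PIC18 RRCF: rotate right through carry. Returns (new_value, new_carry)."""
    return ((carry << 7) | (value >> 1)) & 0xFF, value & 0x01

def rotate_row(fb, first, width):
    """Rotate one row of `width` bytes starting at index `first`, highest address first."""
    carry = 0                                   # bcf STATUS, C
    for i in range(first + width - 1, first - 1, -1):   # POSTDEC0 walks downwards
        fb[i], carry = rrcf(fb[i], carry)
    if carry:                                   # BNC skips this when carry is clear
        fb[first + width - 1] |= 0x80           # bsf PLUSW0,7 wraps the bit into the row's last byte

row = bytearray([0x01, 0x00, 0x00, 0x01])
rotate_row(row, 0, 4)
print([hex(b) for b in row])
```

In the assembly, W still holds the row width after the inner loop, so PLUSW0 addresses the last byte of the row that was just processed. That is why the width is limited to 127: PLUSW treats W as a signed offset.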

HenrikOlsson
- 10th October 2022, 19:32
Richard, thank you. My problem is that I suck at ASM and I don't understand half of what's going on there. It might very well work, in which case I'd be lucky, but if it doesn't, or if I need to modify it, I basically have no idea what I'm doing. I should use it as a learning exercise though, as I do feel the need to better understand ASM.

tumbleweed, wow, that was an easy yet effective modification (and it works perfectly fine) - just what I was hoping for.
Went down from 7.6ms to just above 3ms for an 81-matrix-wide display. More than double the speed, cool!

One thing threw me off on a tangent for a while...


OFFSET_INCR CON NumberOfDisplays + 2
Since NumberOfDisplays is already a CONstant (with the value 81 in this particular case) I replaced

Offset = Offset + OFFSET_INCR
with

Offset = Offset + NumberOfDisplays + 2

expecting the compiler to reduce both versions to

Offset = Offset + 83
It did not... your version, with the constant defined, executes in 3.03ms, the other one in 3.45ms.

It took 15 minutes for the penny to drop and now I need to go through my code and see where else I've made that silly mistake.

Thanks again, both of you guys!

/Henrik.
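
A rough back-of-envelope check of those two timings, sketched in Python. It assumes the addition runs once per buffer byte (83 columns by 10 rows) and that the PIC18 executes one instruction cycle per four oscillator clocks. Why the compiler emits more work for the longer expression is not spelled out in the thread; one plausible reading is that Offset + NumberOfDisplays + 2 is evaluated as two separate runtime additions.

```python
OSC_HZ = 64_000_000
CYCLE_S = 4 / OSC_HZ                   # PIC18 instruction cycle = 4 oscillator clocks
BYTES = (81 + 2) * 10                  # 81 displays plus 2 hidden columns, 10 rows

slow_s, fast_s = 3.45e-3, 3.03e-3      # timings reported in the post
extra_cycles = (slow_s - fast_s) / BYTES / CYCLE_S
print(round(extra_cycles, 1))          # extra instruction cycles per byte
```

The difference works out to roughly 8 instruction cycles per byte, which is about what one extra 16-bit addition with a temporary would be expected to cost.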

richard
- 11th October 2022, 01:13
"I should use it as a learning exercise though as I do feel the need to better understand ASM."

it's worth the effort

this takes just under 100uS per iteration




'****************************************************************
'* Name : UNTITLED.BAS *
'* Author : richard *
'* Notice : Copyright (c) 2022 caveat emptor *
'* : All Rights Reserved *
'* Date : 11/10/2022 *
'* Version : 1.0 *
'* Notes : *
'* : *
'****************************************************************
#CONFIG
CONFIG FOSC = INTIO67
CONFIG PLLCFG = ON
CONFIG PRICLKEN = ON
CONFIG FCMEN = OFF
CONFIG IESO = OFF
CONFIG PWRTEN = ON
CONFIG BOREN = SBORDIS
CONFIG BORV = 190
CONFIG WDTEN = OFF
CONFIG WDTPS = 32768
CONFIG CCP2MX = PORTC1
CONFIG PBADEN = OFF
CONFIG CCP3MX = PORTB5
CONFIG T3CMX = PORTC0
CONFIG HFOFST = ON
CONFIG P2BMX = PORTB5
CONFIG MCLRE = EXTMCLR
CONFIG STVREN = ON
CONFIG LVP = OFF
CONFIG XINST = OFF
CONFIG DEBUG = OFF
CONFIG CP0 = OFF
CONFIG CP1 = OFF
CONFIG CP2 = OFF
CONFIG CP3 = OFF
CONFIG CPB = OFF
CONFIG CPD = OFF
CONFIG WRT0 = OFF
CONFIG WRT1 = OFF
CONFIG WRT2 = OFF
CONFIG WRT3 = OFF
CONFIG WRTC = OFF
CONFIG WRTB = OFF
CONFIG WRTD = OFF
CONFIG EBTR0 = OFF
CONFIG EBTR1 = OFF
CONFIG EBTR2 = OFF
CONFIG EBTR3 = OFF
CONFIG EBTRB = OFF
#ENDCONFIG





DEFINE OSC 64
DEFINE DEBUG_REG PORTB
DEFINE DEBUG_BIT 7
DEFINE DEBUG_BAUD 9600
DEFINE DEBUG_MODE 0
LATB.7=1
bcnt VAR BYTE bank0
dcnt VAR BYTE bank0
BUFF VAR BYTE[32]
iteration VAR word
OSCCON=$70
OSCTUNE.6=1
while ! osccon2.7 :WEND ;wait for pll
ANSELB=0
ANSELC=0
ANSELA=0
dspbuff var byte[380]
clear
trisc = %01111111
TRISB = %01111111
TRISA = %11111111
pause 1000
debug 13,10,"ready",13,10
LATc.7=0
;38x8 bits wide 10 bits high
dspbuff[379]=$c0
dspbuff[37]=$c0
main:
LATc.7=1; trigger oscilloscope
gosub rotate_r
LATc.7=0
debug 13,10,dec iteration
debug 13,10,bin8 dspbuff[379]," " ,bin8 dspbuff[378]," " ,bin8 dspbuff[343] ," ",bin8 dspbuff[342]
debug 13,10,bin8 dspbuff[37] ," " ,bin8 dspbuff[36] ," " ,bin8 dspbuff[1] ," ",bin8 dspbuff[0]
pause 200
iteration=iteration+1
goto main




rotate_r: ' pic18 100uS
asm
banksel _bcnt ;ROW
movlw 10 ;10 rows
movwf _bcnt
movlw high (_dspbuff)
movwf FSR0H
movlw low (_dspbuff)
movwf FSR0L
movlw 123 ;BUFFER SIZE low byte (380-1) 36 displays + 2 outside of display * 10-1
ADDWF FSR0L
movlw 1 ;BUFFER SIZE high byte (380)
ADDWFC FSR0H
MOVLW 38 ;36 displays + 2 outside of display
NROW
movwf _dcnt
bcf STATUS, C
Ncol
rrcf POSTDEC0 ,f ;PER COLUMN
DECFSZ _dcnt ,F
BRA Ncol
BNC NBNC
bsf PLUSW0,7 ;max displays would be 127 to use plusw
NBNC
DECFSZ _bcnt ,F
BRA NROW
banksel 0
return
endasm

HenrikOlsson
- 11th October 2022, 09:07
Thanks Richard,
What device did you run this on?
After being unsuccessful in integrating your version into my code, I tried to get your code to compile for the 57Q43 (I removed some oscillator-specific stuff to match my device), but it won't.
It can't fit bcnt/dcnt into bank 0 so I tried various banks. Bank 5 works but I still get [ASM WARNING: ASM-SCROLL.ASM (411) : Invalid RAM location specified]

Line 411 in the .asm listing is banksel 0 in the NBNC routine.

I know it says warning and not error but the description sounds bad enough :-)

See, this is what I mean....

For reference, here's what I'm trying to compile for the 57Q43:

DEFINE OSC 64
DEFINE DEBUG_REG PORTB
DEFINE DEBUG_BIT 7
DEFINE DEBUG_BAUD 9600
DEFINE DEBUG_MODE 0

LATB.7=1

bcnt VAR BYTE bank0
dcnt VAR BYTE bank0
iteration VAR word

ANSELB=0
ANSELC=0
ANSELA=0
dspbuff var byte[380]
clear
trisc = %01111111
TRISB = %01111111
TRISA = %11111111
pause 1000
debug 13,10,"ready",13,10
LATc.7=0
;38x8 bits wide 10 bits high
dspbuff[379]=$c0
dspbuff[37]=$c0


main:
LATc.7=1; trigger oscilloscope
gosub rotate_r
LATc.7=0
debug 13,10,dec iteration
debug 13,10,bin8 dspbuff[379]," " ,bin8 dspbuff[378]," " ,bin8 dspbuff[343] ," ",bin8 dspbuff[342]
debug 13,10,bin8 dspbuff[37] ," " ,bin8 dspbuff[36] ," " ,bin8 dspbuff[1] ," ",bin8 dspbuff[0]
pause 200
iteration=iteration+1
goto main


rotate_r: ' pic18 100uS
asm
banksel _bcnt ;ROW
movlw 10 ;10 rows
movwf _bcnt
movlw high (_dspbuff)
movwf FSR0H
movlw low (_dspbuff)
movwf FSR0L
movlw 123 ;BUFFER SIZE low byte (380-1) 36 displays + 2 outside of display * 10-1
ADDWF FSR0L
movlw 1 ;BUFFER SIZE high byte (380)
ADDWFC FSR0H
MOVLW 38 ;36 displays + 2 outside of display
NROW
movwf _dcnt
bcf STATUS, C
Ncol
rrcf POSTDEC0 ,f ;PER COLUMN
DECFSZ _dcnt ,F
BRA Ncol
BNC NBNC
bsf PLUSW0,7 ;max displays would be 127 to use plusw
NBNC
DECFSZ _bcnt ,F
BRA NROW
banksel 0
return
endasm

richard
- 11th October 2022, 09:42
18f26k22

i have not tried much with the Q43 yet, but
they have no user ram below bank 5.
i don't know what bank pbp defaults to for them; bank0 makes no sense at all. before them, pbp kinda assumes bank0 before and after calls.
the instruction set is slightly different: movffl or whatever
there may be other weird shit with indirect addressing

i have some q43's i just lack the enthusiasm to learn about them when the tried and trusty k22 serves my needs

richard
- 11th October 2022, 09:48
try them in bank6
and just remove the banksel 0 at the end


i added a rotate left too and a few tweaks



'****************************************************************
'* Name : UNTITLED.BAS *
'* Author : richard *
'* Notice : Copyright (c) 2022 caveat emptor *
'* : All Rights Reserved *
'* Date : 11/10/2022 *
'* Version : 1.0 *
'* Notes : *
'* : pic18f26k22 *
'****************************************************************
#CONFIG
CONFIG FOSC = INTIO67
CONFIG PLLCFG = ON
CONFIG PRICLKEN = ON
CONFIG FCMEN = OFF
CONFIG IESO = OFF
CONFIG PWRTEN = ON
CONFIG BOREN = SBORDIS
CONFIG BORV = 190
CONFIG WDTEN = OFF
CONFIG WDTPS = 32768
CONFIG CCP2MX = PORTC1
CONFIG PBADEN = OFF
CONFIG CCP3MX = PORTB5
CONFIG T3CMX = PORTC0
CONFIG HFOFST = ON
CONFIG P2BMX = PORTB5
CONFIG MCLRE = EXTMCLR
CONFIG STVREN = ON
CONFIG LVP = OFF
CONFIG XINST = OFF
CONFIG DEBUG = OFF
CONFIG CP0 = OFF
CONFIG CP1 = OFF
CONFIG CP2 = OFF
CONFIG CP3 = OFF
CONFIG CPB = OFF
CONFIG CPD = OFF
CONFIG WRT0 = OFF
CONFIG WRT1 = OFF
CONFIG WRT2 = OFF
CONFIG WRT3 = OFF
CONFIG WRTC = OFF
CONFIG WRTB = OFF
CONFIG WRTD = OFF
CONFIG EBTR0 = OFF
CONFIG EBTR1 = OFF
CONFIG EBTR2 = OFF
CONFIG EBTR3 = OFF
CONFIG EBTRB = OFF
#ENDCONFIG



NumberOfDisplays CON 36
NumberOfRows CON 10
DEFINE OSC 64
DEFINE DEBUG_REG PORTB
DEFINE DEBUG_BIT 7
DEFINE DEBUG_BAUD 9600
DEFINE DEBUG_MODE 0
LATB.7=1
bcnt VAR BYTE bank0
dcnt VAR BYTE bank0
BUFF VAR BYTE[32]
iteration VAR word
OSCCON=$70
OSCTUNE.6=1
while ! osccon2.7 :WEND ;wait for pll
ANSELB=0
ANSELC=0
ANSELA=0
dspbuff var byte[380]
clear
trisc = %01111111
TRISB = %01111111
TRISA = %11111111
pause 1000
debug 13,10,"ready",13,10
LATc.7=0
;38x8 bits wide 10 bits high


main:
iteration=0
dspbuff[379]=3
dspbuff[37]=3
WHILE iteration<610
LATc.7=1
gosub rotate_l
LATc.7=0
debug 13,10,dec iteration
debug 13,10,bin8 dspbuff[379]," " ,bin8 dspbuff[378]," " ,bin8 dspbuff[343] ," ",bin8 dspbuff[342]
debug 13,10,bin8 dspbuff[37] ," " ,bin8 dspbuff[36] ," " ,bin8 dspbuff[1] ," ",bin8 dspbuff[0]
pause 50
iteration=iteration+1
WEND
iteration=0
dspbuff[379]=$c0
dspbuff[37]=$c0
WHILE iteration<610
LATc.7=1
gosub rotate_R
LATc.7=0
debug 13,10,dec iteration
debug 13,10,bin8 dspbuff[379]," " ,bin8 dspbuff[378]," " ,bin8 dspbuff[343] ," ",bin8 dspbuff[342]
debug 13,10,bin8 dspbuff[37] ," " ,bin8 dspbuff[36] ," " ,bin8 dspbuff[1] ," ",bin8 dspbuff[0]
pause 50
iteration=iteration+1
WEND
goto main




rotate_r: ' pic18
asm
banksel _bcnt ;ROW
movlw _NumberOfRows ;10 rows
movwf _bcnt
movlw high (_dspbuff) ;BOTTOM OF BUFFER
movwf FSR0H
movlw low (_dspbuff)
movwf FSR0L
movlw low((_NumberOfDisplays + 2)*_NumberOfRows-1) ;TOP OF BUFFER
ADDWF FSR0L
movlw high((_NumberOfDisplays + 2)*_NumberOfRows-1)
ADDWFC FSR0H
MOVLW _NumberOfDisplays + 2;36 displays + 2 outside of display
RROW
movwf _dcnt ;display
bcf STATUS, C
Rcol
rrcf POSTDEC0 ,f ;PER display
DECFSZ _dcnt ,F
BRA Rcol
BNC RBNC
bsf PLUSW0,7 ;max displays would be 127 to use plusw
RBNC
DECFSZ _bcnt ,F
BRA RROW
banksel 0
return
endasm
rotate_l: ' pic18
asm
banksel _bcnt ;ROW
movlw _NumberOfRows
movwf _bcnt
movlw high (_dspbuff) ;BOTTOM OF BUFFER
movwf FSR0H
movlw low (_dspbuff)
movwf FSR0L
LROW
MOVLW _NumberOfDisplays + 2;36 displays + 2 outside of display
movwf _dcnt ;display
comf WREG
incf WREG
bcf STATUS, C
Lcol
rlcf POSTINC0 ,f ;PER display
DECFSZ _dcnt ,F
BRA Lcol
BNC LBNC
bsf PLUSW0,0 ;max displays would be 127 to use plusw
LBNC
DECFSZ _bcnt ,F
BRA LROW
banksel 0
return
endasm

richard
- 11th October 2022, 10:21
with my suggestions it compiles for 18f47q43, whether it works is not tested . a quick browse of the q43 instruction set and access bank looks ok

HenrikOlsson
- 11th October 2022, 12:11
Bank 6 (or 7) doesn't work at all. I don't know what it's doing exactly. It does display the text but it does not scroll it.

Bank 5 (or 25) works - sort of.
I can scroll it left OR right continuously without issues, but when I change direction, whatever is in the non-visible part of the buffer comes back onto the display shifted down by one row - at least I think that's what's happening.

But it's fast, that's for sure.
* Mine was like 7ms or something like that.
* tumbleweed's was more than twice as fast at 3ms.
* Richard's comes in at 0.2ms - amazing.

Mine and tumbleweed's simply scroll the display, so anything that's shifted outside of the buffer is lost, while Richard's rotates it, so they're not exactly the same.

/Henrik.
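
The scroll versus rotate distinction is easy to see on a single row. The Python sketch below is only an illustration with made-up helper names; it uses the same bit direction as the routines in this thread (bit 0 of a byte moves into bit 7 of the byte before it).

```python
def shift_row(row):
    """Scroll: the bit leaving the first byte is lost."""
    out = bytearray(len(row))
    for i, b in enumerate(row):
        nxt = row[i + 1] if i + 1 < len(row) else 0
        out[i] = (b >> 1) | ((nxt & 1) << 7)
    return out

def rotate_row(row):
    """Rotate: same movement, but the bit leaving the first byte re-enters the last byte."""
    out = shift_row(row)
    out[-1] |= (row[0] & 1) << 7
    return out

row = bytearray([0x01, 0x00, 0x00])
print(shift_row(row).hex(), rotate_row(row).hex())
```

With a rotate, anything pushed off one edge of the buffer reappears at the other edge of the same row, so the hidden columns need to be cleared or refilled if a plain scroll is what is wanted.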

tumbleweed
- 11th October 2022, 12:54
On most of the Q family the SFR registers are in banks 0-4, and the start of user ram is in bank 5.
That's where the access bank is (0x500-0x55F). The other part of the access bank is at 0x460-0x4FF, but that's for the fast SFR registers so you can't use that.
So, the "default" setting for 'banksel' should be 5 and not 0.

If you use the MOVFF instruction both the source and destination must be in the first 4K of ram space.
If either are outside that you need to use MOVFFL, which is a three-word instruction so it's more code (and slower) than MOVFF.

The K42/K83 force you to use MOVFFL to access SFR registers outside the access bank since on those the SFR regs are at the very top of ram.
That's one reason to skip the K42 and use the Q43 instead.
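
The figures in this post can be turned into a small lookup for checking where a variable lands. This is a Python sketch built only from the numbers quoted above (access RAM at 0x500-0x55F, MOVFF limited to the first 4K) plus the standard PIC18 bank size of 256 bytes; describe is a hypothetical helper, not part of any toolchain.

```python
ACCESS_RAM = range(0x500, 0x560)       # general-purpose part of the access bank, per the post
MOVFF_LIMIT = 0x1000                   # MOVFF reaches only the first 4 KB of data memory

def describe(addr):
    """Return (bank number, in access RAM?, reachable by MOVFF?) for a data address."""
    return addr >> 8, addr in ACCESS_RAM, addr < MOVFF_LIMIT

print(describe(0x500))                 # first byte of user RAM
print(describe(0x1234))                # an arbitrary address above 4 KB
```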

richard
- 11th October 2022, 23:43
"I can scroll it left OR right continuously without issues but when I change direction whatever is in the non-visible part of the buffer comes back onto the display shifted down by one row - at least I think that's what's happening."
i expected the buffer to be cleared out and reset before a direction change. however, that sort of behavior is unanticipated




* Richards comes in at 0.2ms - amazing.

i get 0.1mS , maybe my oscilloscope is off



one of my clients sent me a test jig for q43 chips yesterday. i need to get into them soon i guess
no more excuses

@tumbleweed thanks for confirming what i thought

richard
- 12th October 2022, 02:45
just for interest i tried my code on the st7920 buffer for rows 8 to 15 for the full 16 byte width of course

https://youtu.be/5CVAvcELO6A

works as expected even when direction reversed midstream

richard
- 12th October 2022, 05:04
or a window to scroll
[i just can't help myself]
https://youtu.be/IYPaKOO-0Pk

HenrikOlsson
- 12th October 2022, 08:37
i get 0.1mS , maybe my oscilloscope is off
Your scope is fine. All my quoted measurements are for my current configuration with 81 displays, so roughly double your measurement.


works as expected even when direction reversed midstream
It clearly does. Do you have a non-visible area in the buffer? I can't see how it would matter, but again, in my system it does weird things with what's outside the actual display.

The scrolling window was cool!
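
The 0.1 ms and 0.2 ms figures are consistent with a simple cycle count of the assembly inner loop. The Python sketch below assumes the standard PIC18 timings (rrcf 1 cycle, decfsz 1 cycle when it does not skip, bra 2 cycles) and ignores the small per-row overhead; estimate_us is a made-up helper name.

```python
OSC_HZ = 64_000_000
CYCLE_S = 4 / OSC_HZ                   # 62.5 ns per instruction cycle at 64 MHz
CYCLES_PER_BYTE = 1 + 1 + 2            # rrcf + decfsz (no skip) + bra

def estimate_us(displays, rows=10):
    """Approximate run time in microseconds for a buffer of (displays + 2) * rows bytes."""
    bytes_total = (displays + 2) * rows
    return bytes_total * CYCLES_PER_BYTE * CYCLE_S * 1e6

print(estimate_us(36), estimate_us(81))
```

That gives about 95 us for 36 displays and about 208 us for 81, in line with both measurements.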

richard
- 12th October 2022, 09:56
"Do you have a non-visible area in the buffer?"
no, i'm using a pointer into the actual frame buffer of the display so i can't hide any data.
its vital that -
the width is in whole bytes ie for a 128x64 display thats 16 bytes
if i had a hidden byte each end then it would be 18
if you are short or long then the data will appear to move up or down at the wrap around
this may be clearer and can be applicable anywhere

note bank dependence has been eliminated


width con 128 ;display buffer width and
height con 64 ; height
bcnt VAR BYTE
dcnt VAR BYTE
WindowWidth CON 5 ;window is 5 bytes wide scrolling window
windowHeight CON 8 ;window is 8 bits high " "
@windowPad = _width/8 - _WindowWidth ;number of bytes to get to start of next row
@dspbuff = _fbr+16*8+5 ;window start address row8 byte 5
; fbr is the display frame buffer
rotate_r: ' pic18
asm
banksel _bcnt ;ROW
movlw _windowHeight
movwf _bcnt
movlw high (dspbuff) ;BOTTOM OF BUFFER
movwf FSR0H
movlw low (dspbuff)
movwf FSR0L
movlw low((_WindowWidth )*_windowHeight-1) ;TOP OF BUFFER
ADDWF FSR0L
movlw high((_WindowWidth )*_windowHeight-1)
ADDWFC FSR0H
MOVLW _WindowWidth
RROW
banksel _dcnt
movwf _dcnt ;display
bcf STATUS, C
Rcol
rrcf POSTDEC0 ,f ;PER display
DECFSZ _dcnt ,F
BRA Rcol
BNC RBNC
bsf PLUSW0,7 ;max would be 127 to use plusw
RBNC
banksel _bcnt
DECFSZ _bcnt ,F
BRA RROW
;banksel 0
return
endasm






rotate_l: ' pic18
asm
banksel _bcnt ;ROW
movlw _windowHeight
movwf _bcnt
lfsr 0, dspbuff
LROW
MOVLW _WindowWidth
banksel _dcnt
movwf _dcnt ;
comf WREG
incf WREG
bcf STATUS, C
Lcol
rlcf POSTINC0 ,f ;
DECFSZ _dcnt ,F
BRA Lcol
BNC LBNC
bsf PLUSW0,0 ;max would be 127 to use plusw
LBNC
movlw windowPad
addwf FSR0L ,f
bnc Lrnc
incf FSR0H ,f
Lrnc
banksel _bcnt
DECFSZ _bcnt ,F
BRA LROW
banksel 0
return
endasm


ps window is rotate_l only . not done rr yet
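
The windowed rotate_l can be modelled off-target to see why the row width and the pad matter. The Python sketch below mirrors the constants in the listing (128x64 display, 5-byte by 8-row window starting at row 8, byte 5). It is an illustrative model only, and rotate_window_left is a made-up name.

```python
WIDTH_PX, HEIGHT_PX = 128, 64
ROW_BYTES = WIDTH_PX // 8              # 16 bytes per display row
WINDOW_WIDTH = 5                       # window width in bytes
WINDOW_HEIGHT = 8                      # window height in rows
WINDOW_PAD = ROW_BYTES - WINDOW_WIDTH  # bytes to skip to reach the window in the next row
WINDOW_START = ROW_BYTES * 8 + 5       # row 8, byte 5

def rotate_window_left(fb):
    ptr = WINDOW_START
    for _ in range(WINDOW_HEIGHT):
        carry = 0                                  # bcf STATUS, C
        for _ in range(WINDOW_WIDTH):              # rlcf POSTINC0
            shifted = (fb[ptr] << 1) | carry
            fb[ptr], carry = shifted & 0xFF, shifted >> 8
            ptr += 1
        if carry:
            fb[ptr - WINDOW_WIDTH] |= 0x01         # wrap into the window's first byte
        ptr += WINDOW_PAD                          # move to the window start in the next row

fb = bytearray(ROW_BYTES * HEIGHT_PX)
fb[WINDOW_START + WINDOW_WIDTH - 1] = 0x80         # top bit of the window's last byte
rotate_window_left(fb)
print(fb[WINDOW_START], fb[WINDOW_START + WINDOW_WIDTH - 1])
```

If WINDOW_PAD is wrong by even one byte, the pointer drifts sideways on each row, which matches the description above of data appearing to move up or down at the wrap-around.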

HenrikOlsson
- 12th October 2022, 11:06
Aha, then that might be it as I have 81 displays so buffer is 83 bytes wide.
But again, it does not move down as it wraps around, only as it changes direction.