How long does this take to execute



dsicon
- 15th August 2016, 18:16
Hello all
'how many clock cycles for the two lines inside the LOOP?

DEFINE OSC 32

' chip is 16F1939, 8MHz xtal 4X PLL = 32MHz clock

OutputPin1 var PORTD.0
InputPin1 var PORTD.1
Sw1 var byte

'here is executable section
do
OutputPin1 = 1
Sw1 = InputPin1
loop
END

' is it as simple as n per line or is there compiler overhead ?
' and everything else being equal would an 18F be the same?

HenrikOlsson
- 15th August 2016, 19:56
It compiles to this:

0001 M L00001
0001 140F M bsf PORTD, 000h
0002 3000 M movlw 0
0003 188F M btfsc PORTD, 001h
0004 3001 M movlw 1
0005 00BA M movwf _Sw1
0006 33FA M bra L00001
So as far as I can see the part IN the actual loop takes either 5 or 6 cycles, then the jump back to the start of the loop is another 2 cycles.
At 32MHz each instruction cycle is 125ns.
It doesn't seem to make any difference if it's for a 16F or 18F but be aware that if you have the WDT enabled it's possible that the compiler "injects" an instruction to clear it depending on where, in a larger program, this snippet ends up.

/Henrik.

dsicon
- 15th August 2016, 20:21
just what i needed to know, thanks and hello again Henrik

so if i had 4 pairs of lines like above inside the loop (different pins) then i would have
6 * 4 = 24 cycles
24 + 2 = 26 cycles
26 * .125 = 3.25uS
sound about right ?
i am not doing anything that requires high precision regarding number, if it was 2x slower i would be ok, if it was 3 - 4x i would need to know to make context adjustments

not incredibly speedy is it ?
that's only 300kHz for 6 pins

just to be clear
so 1/32MHz = 31nS
one instruction cycle = 4 clock periods ?
(this is starting to come back to me, it fades so quickly if you don't do this every day)

this is probably fast enough though and i assume reading a whole port would be about the same as reading a pin ?
in some areas that should speed things up quite a bit

HenrikOlsson
- 15th August 2016, 21:02
Hi,
Yes, one instruction cycle is 4 clock cycles. That's where the Fosc/4 you see all around comes from. At 32MHz each instruction cycle is (1/32M)*4 or 125ns.
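
A quick way to check the Fosc/4 arithmetic on a PC. This is only a sketch in Python (not PBP); the function name is made up for illustration.

```python
def instruction_cycle_ns(fosc_hz):
    """One instruction cycle is 4 oscillator periods (Fosc/4), returned in ns."""
    return 4 / fosc_hz * 1e9

# The two clock speeds discussed in this thread
for fosc in (32_000_000, 64_000_000):
    print(f"{fosc / 1e6:.0f} MHz -> {instruction_cycle_ns(fosc):.1f} ns per instruction cycle")
```
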

Reading a whole port is obviously faster than reading the 8 pins individually if that's what you're asking.

Four "sets" of those two lines compile to

000004 M L00001
000004 8082 M bsf PORTC, 000h
000006 0E00 M movlw 0
000008 B282 M btfsc PORTC, 001h
00000A 0E01 M movlw 1
00000C 6E1A M movwf _Sw1
00000E 8482 M bsf PORTC, 002h
000010 0E00 M movlw 0
000012 B682 M btfsc PORTC, 003h
000014 0E01 M movlw 1
000016 6E1B M movwf _Sw2
000018 8882 M bsf PORTC, 004h
00001A 0E00 M movlw 0
00001C BA82 M btfsc PORTC, 005h
00001E 0E01 M movlw 1
000020 6E1C M movwf _Sw3
000022 8082 M bsf PORTC, 000h
000024 0E00 M movlw 0
000026 B282 M btfsc PORTC, 001h
000028 0E01 M movlw 1
00002A 6E1D M movwf _Sw4
00002C D7EB M bra L00001

I count 20 instructions, not counting the bra at the end. Some of them (the btfsc) may take two cycles to execute but when they do the instruction AFTER the btfsc is skipped so in the end the body still takes 20 cycles, something I didn't take into account in the previous post. 20 * 125ns = 2.5us, plus another 2 cycles (250ns) for the bra back to the start of the loop.
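
The cycle counting can be sketched like this in Python. It assumes the usual PIC timing: every instruction in the listing takes 1 cycle, except that btfsc takes 2 when it skips (the skipped instruction here is single-word) and bra takes 2. The function names are only for illustration.

```python
BRA_CYCLES = 2  # the branch back to the top of the loop

def group_cycles(pin_high):
    """Cycles for one bsf / movlw 0 / btfsc / movlw 1 / movwf group."""
    cycles = 1          # bsf
    cycles += 1         # movlw 0
    if pin_high:
        cycles += 1     # btfsc, bit set: no skip
        cycles += 1     # movlw 1 executes
    else:
        cycles += 2     # btfsc, bit clear: the skip costs one extra cycle,
                        # and movlw 1 is never executed
    cycles += 1         # movwf
    return cycles

def loop_ns(groups, tcy_ns=125.0, pin_high=True):
    """Time for one pass through the loop, including the bra."""
    return (groups * group_cycles(pin_high) + BRA_CYCLES) * tcy_ns

print(group_cycles(True), group_cycles(False))   # same count on both paths
print(loop_ns(1), loop_ns(4))                    # one pair, four pairs
```

Either way the pin reads, a group costs the same number of cycles, so the loop time does not depend on the input state.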

With that said I'm far from an expert in assembly programming so I truly hope someone that is corrects me if I'm wrong.

If you can explain what you're trying to do it may be easier to help you choose the best/fastest option.

/Henrik.

dsicon
- 15th August 2016, 21:13
at the moment this is good, i am just roughing in some rough estimates for various parts of the overall program loop

at this stage whether it is 5 vs 6 shouldn't matter, 20 vs 5 or 6 might

where i am going to need some real help i think is when i get to the RS485 output which needs to run continuously 512 bytes in the overall loop at 250k baud,
the program will need to respond to switches and things and is reading about 30 analogs which produce the serial stream

HenrikOlsson
- 15th August 2016, 21:31
DMX512 by any chance?
If you really need a continuous stream at 250k baud then I think you need to do some careful planning (and I guess that IS what you're doing with these tests so thumbs up!).
At 250k baud you've only got 40us between bytes to do stuff before you need to feed the next byte to the UART (well, you CAN wait up to 80us since there's basically a single byte buffer).
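
To put numbers on that budget, here is a small Python sketch (names are illustrative). It covers both 10 bits per byte (start + 8 data + 1 stop, the 40us figure) and 11 bits per byte (two stop bits, as DMX uses).

```python
BAUD = 250_000

def byte_time_us(bits_per_byte, baud=BAUD):
    """Time on the wire for one byte, in microseconds."""
    return bits_per_byte / baud * 1e6

def cycles_per_byte(fosc_hz, bits_per_byte, baud=BAUD):
    """Instruction cycles available per transmitted byte (instruction rate is Fosc/4)."""
    return (fosc_hz // 4) * bits_per_byte // baud

for fosc in (32_000_000, 64_000_000):
    for bits in (10, 11):
        print(f"{fosc // 1_000_000} MHz, {bits} bits: "
              f"{byte_time_us(bits):.0f} us, {cycles_per_byte(fosc, bits)} cycles")
```
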

dsicon
- 15th August 2016, 21:48
yes, DMX
Jeff did it for me many years ago @4MHz, but it was 100% assembler, i don't think picbasic was around then

and there was much less other stuff to do on that one than this new one which is on the boards now, less channels, less functionality
what about the Instant Interrupts ? would that help here ?
do i need an 18F for that ?

HenrikOlsson
- 15th August 2016, 22:17
The correct 18F would allow you to run at 64MHz while I believe the 16F tops out at 32MHz but I might be wrong, haven't verified. Another benefit is that an 18F series device will allow you to have a single 512byte wide array for your DMX data. This doesn't work in the 16F series (max array length is 96). I'm not saying it can't be done on a 16F series just that I think you need some good reasons for selecting a 16F series chip over an 18F series device.

DT-Ints (Instant Interrupts) are available for both 16F and 18F series so that doesn't matter but for this application I'm not entirely sure they're the correct approach, especially if you're writing the ISR in PBP (which you are since you're using DT-Ints.....). The reason I'm not too sure they're the right approach is the overhead they add due to the system variable save/restore they have to do each time an interrupt occurs. This takes dozens and dozens of instruction cycles and at an interrupt rate of 25kHz it's going to eat away your processing power.

With that said there are ways around that but they're not straightforward and you really need to know what you're doing so instead I'd look at one large loop type program calling tight subroutines with known execution time(s) and keep feeding the UART in between. Like,

* Feed the UART - OK, we've got 40us to do something useful.....
* Start the ADC - it'll take xx us to complete - go do something else....
* Read a pin
* Do something with it
* Write a pin
* Wait for the UART, Feed it.
* ADC done? Get result.
* Start ADC, next channel.

And so on.
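
One way to sanity-check such a plan on paper is to add up the known execution times and see how many byte slots they need. A rough Python sketch follows; the task costs are made-up placeholders, not measurements, and the 44us slot assumes 11 bits per byte at 250k baud.

```python
SLOT_US = 44.0  # time between UART feeds (assumed: 11 bits at 250k baud)

# Hypothetical task costs in microseconds, for illustration only
TASKS = [("start ADC", 2.0), ("read pin", 1.0), ("process", 15.0),
         ("write pin", 1.0), ("fetch ADC result", 3.0)]

def plan_slots(tasks, slot_us=SLOT_US, feed_us=2.0):
    """Greedy plan: run tasks in order, starting a new slot (and feeding the
    UART again) whenever the next task would not fit in the current one."""
    slots, current, used = [], [], feed_us
    for name, cost in tasks:
        if cost + feed_us > slot_us:
            raise ValueError(f"{name} can never fit between two UART feeds")
        if used + cost > slot_us:
            slots.append(current)
            current, used = [], feed_us
        current.append(name)
        used += cost
    if current:
        slots.append(current)
    return slots

for i, names in enumerate(plan_slots(TASKS)):
    print(f"slot {i}: feed UART, then {', '.join(names)}")
```

Any single task that is longer than a slot has to be broken up (or the UART fed from inside it), which is what the ValueError flags.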

An interrupt driven sender would be cleaner and if "all" the ISR is doing is indexing a 512 byte long array then examining the generated code and manually saving ONLY the system variables actually USED by the ISR will make DT-Ints work a lot faster but as I said, it's tricky and a very sensitive and delicate way of doing it from a code maintenance and expansion perspective.

/Henrik.

dsicon
- 15th August 2016, 22:59
incredibly helpful!, thanks so much again Henrik
want a job ? :)

HenrikOlsson
- 15th August 2016, 23:45
Glad I could help.
As for the job, I'm fortunate to have one that pays the bills but it's not as fun as developing PBP code so if you're serious about that then drop me a PM and we can talk :-)

dsicon
- 16th August 2016, 03:32
it almost makes me want to think about external hardware UART that has a 512 or 1k buffer, but i guess getting the data into that would be an issue too
are there such things ?

HenrikOlsson
- 16th August 2016, 07:12
Took a quick look at Digikey and of the chips they've got the largest FIFO buffer available is 256 bytes. I think you need at least 1k to really benefit from an external UART - and I think the interface would have to be parallel or really fast SPI or you'll spend most of the time sending data to the UART just as you would when using the on chip UART.

What would be ideal is a PIC with DMA but that feature doesn't exist in the 16F or 18F series. But really, this should be totally doable with a PIC and on chip UART but it'll take some careful planning.

/Henrik.

HenrikOlsson
- 16th August 2016, 22:35
Hi,
I've played around with this a bit, trying to get a feel for what's possible and what's not. It's actually not as bad as I originally thought - but not that great either.

An 18F25K20 running at 64MHz, interrupting every 44us using DT-Ints slows down the PIC a fair bit. A 100ms pause in the main routine becomes 185ms in reality, so the interrupts eat about 46% of the CPU and the main routine is left with the remaining 54%. At 32MHz it'll probably not work at all since all available CPU cycles will be consumed by the ISR and (mostly) the overhead.
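
The split between main routine and interrupts can be worked out from how much the pause stretches. A Python sketch, assuming the PAUSE would be exactly 100ms with interrupts off and that every interrupt costs about the same:

```python
def isr_load(nominal_ms, measured_ms):
    """Fraction of CPU time taken by interrupts, from how much a PAUSE stretches."""
    return (measured_ms - nominal_ms) / measured_ms

load = isr_load(100, 185)            # share of time spent in ISR + overhead
period_cycles = 44e-6 * 64e6 / 4     # instruction cycles between interrupts at 64MHz
cycles_per_interrupt = load * period_cycles

print(f"interrupt load: {load:.0%}, main routine: {1 - load:.0%}")
print(f"about {cycles_per_interrupt:.0f} of {period_cycles:.0f} cycles per interrupt")
```

At 32MHz the period is only half as many cycles, so the same per-interrupt cost would leave the main routine with almost nothing.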

The good news is that it does work using DT-INTS if you can run at 64MHz. The bad news is of course that any commands used in the main routine that rely on soft timing will be WAY off.

/Henrik.

dsicon
- 17th August 2016, 00:17
well now you've done it Henrik

i was just warming up to the idea of doing everything in a polled loop and carefully squeezing activities in

re DT-Ints when you say "soft timing" do you mean things like counting the number of times through the Main Loop (or sub loop) to create a crude counter / timer ?
i do that a lot for certain imprecise timing requirements, yeah i know it is not good practice and has many down sides but it is good enough for many things

or do you mean things like PAUSE and PAUSEus ?
or both ?

my plan at the moment is to use 18F25K22 (Charles has been most supportive!) and it will do 64MHz and does 5V which the '20 does not (big advantage as DMX transceiver i am told must be 5V due to custom and practice even though there are 3.3V RS485 transceiver chips out there)

as to OSC, the way i read the data sheet i can get 64MHz using 4X PLL either with INTOSC or XTAL at 16MHz
do you think 1% is good enough for RS-485 250k baud comms ?
my instinct was no and at this point am not inclined to get into calibration schemes (how good do they get?)

HenrikOlsson
- 17th August 2016, 07:26
With soft timing I mean anything that relies on counting instruction cycles for precise timing, like PAUSE, PAUSEUS, SERIN, PULSIN, etc. Same thing with the stuff that started this thread, we figured out a certain piece of code executes in x number of cycles but when there's an interrupt constantly "stealing" cycles away that timing is obviously no longer correct - nor constant.

I can't say I understand why the uC must be a 5V one just because the bus is RS485 but the 18F25K22 is a very good choice, if it's got enough pins (otherwise the 45K22) or the 26/46 if you need even more FLASH.

The chosen baudrate doesn't really matter when it comes to accuracy of the x-tal, if the clock is 1% off then the baudrate will be 1% off at 250 baud and at 250kbaud. The internal oscillator is most likely fine but if you're designing a board and you can spare the pins I'd certainly design in the option to use a x-tal. (I'm using the 16F1519 in a project of mine, internal oscillator, on one board of 7 I had to tweak the baudcon register by two counts to get the correct baudrate since the internal oscillator was off. The other boards are spot on.)
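
A Python sketch of why the clock error passes straight through to the baudrate. It assumes the common 8-bit high speed async mode where baud = Fosc / (16 * (n + 1)); other baud generator modes use a divisor of 64 or 4, so check the datasheet for the mode actually in use. Function names are illustrative.

```python
def spbrg_for(fosc_hz, baud, divisor=16):
    """Nearest baud generator value n for baud = Fosc / (divisor * (n + 1))."""
    return round(fosc_hz / (divisor * baud)) - 1

def actual_baud(fosc_hz, n, divisor=16):
    return fosc_hz / (divisor * (n + 1))

def baud_error_pct(fosc_hz, baud, clock_error_pct=0.0, divisor=16):
    """Baud error when n was chosen for the nominal clock but the real clock is off."""
    n = spbrg_for(fosc_hz, baud, divisor)
    real = actual_baud(fosc_hz * (1 + clock_error_pct / 100), n, divisor)
    return (real - baud) / baud * 100

print(spbrg_for(64_000_000, 250_000))            # divides evenly, no rounding error
print(baud_error_pct(64_000_000, 250_000, 1.0))  # 1% clock error in, about 1% baud error out
```
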

You say DMX transceiver, are you going to receive data at 250kbaud as well? That will be "interesting"... :-)

/Henrik.

dsicon
- 18th August 2016, 02:17
"transceiver" is just the term for the driver chip, i only use in one direction
there is an extension of the protocol that allows for a response from the light, i don't know how widely used that is or how much data traffic would be involved typically, my wild guess is little

so i don't plan to support bi-directional and am aware it would be somewhere between challenging and impossible to do well
nonetheless if the pins are available as an architectural thing i plan to allow movement in that direction in the future if it does not complicate the hardware overly

re the benefit of a 5V PIC, i am sure i could use a 3.3v part and a 5V transceiver (driver) chip and that they could play nice together but now the power system is maybe a bit more complicated

initially i planned to let EVERYTHING run on LiPo voltage ranges 3.0- 4.2V
the pots are ratiometric on the A/D, so that works fine
but i have been advised not to run the RS-485 at 3.3V as it is not best practice
so then i would need a DCDC converter for the driver chip
with a 5V PIC i now have less different voltages to distribute, but i guess there will be a converter either way as i think about it (boost from LiPo to 5V)

still 5V simplifies life as the wall wart can be 5V and override the LiPo
getting a slick instant change over battery is going to be a challenge in itself, down time to charge is not acceptable, mechanically i would much rather have a buried one

yeah, i realize 1% is 1% whatever the rate, what i really meant was whether 1% is good enough for professional comms ?
googling around just now it seems that actually +-2% should be ok

there is one more fly in this soup ...
there may be i2c transmissions that need to be squeezed in somewhere, maybe 5 - 12 bytes worth (could be broken in half into two bursts)

HenrikOlsson
- 18th August 2016, 07:14
Hi,
It's certainly easier to have a single rail.

I've actually never used I2C so my practical knowledge is very limited but the manual says:
The timing of the I2C instructions is set so that standard speed devices(100kHz) will be accessible at clock speeds up to 8MHz. Fast mode devices (400kHz) may be used up to 20MHz. If it is desired to access a standard speed device at above 8MHz, the following DEFINE should be added to the program:
DEFINE I2C_SLOW 1
It's a bit unclear what happens with the I2C clock at speeds over 20MHz, will it still be 100/400kHz?
Since it's a synchronous protocol I would think it's not overly sensitive to the clock jitter caused by executing the command while the interrupt is active in the background but I don't know. On the other hand, 12 bytes at 100kHz is about 1ms. I guess you could squeeze that in between two DMX packets without too much impact on overall throughput.
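
The 1ms estimate, as a Python sketch. It assumes 9 clocks per byte (8 data bits + ACK), ignores START/STOP and the address byte, and assumes the bus really runs at 100kHz, which as noted above is not certain at this oscillator speed.

```python
def i2c_time_us(n_bytes, clock_hz=100_000, bits_per_byte=9):
    """Bus time for n_bytes, counting 8 data bits + 1 ACK bit per byte."""
    return n_bytes * bits_per_byte / clock_hz * 1e6

burst = i2c_time_us(12)
print(f"12 bytes: {burst:.0f} us, or about {burst / 44:.1f} DMX byte times of 44 us")
print(f"6 bytes:  {i2c_time_us(6):.0f} us")   # the transfer split into two bursts
```
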

/Henrik.

tumbleweed
- 18th August 2016, 11:55
If you're the DMX transmitter then there's a lot of leeway in the timing. As long as you send a 512-byte packet once a second you're in spec. The time between bytes isn't really an issue either as long as you meet that. You'll probably want it faster than that, but there's usually plenty of time in your main loop to do stuff.

Here's how I've done this in the past:
- use two 512 byte packets... one for the main loop to fill (#1) and one for the ISR to transmit (#2).
- in the main loop do whatever you need to create the DMX packet in buffer #1
- copy the packet in buffer #1 to the second buffer #2 for the ISR to use to transmit
- start the packet transmitting. after generating the BREAK/START timing, the ISR pulls bytes from the DMX transmit buffer #2 until done using the UART TXIF.

The main loop can continue on generating the next packet since it uses buffer #1. Usually, things are so fast that the main loop ends up waiting for the DMX TX ISR to finish sending the previous packet before sending the new one since it takes at least 22ms for the ISR to send the 512 bytes.

Ioannis
- 18th August 2016, 13:09
I have a feeling that this project is on the limit of the 18F series chips with PBP.

Eventually you may need to do a lot in assembly or finally select a 24F chip and C compiler... :(

I really have not seen the spec's of the DMX512. Is it necessary to send data continually?

Ioannis

tumbleweed
- 18th August 2016, 14:20
I have to admit I haven't done this using PBP, but I don't see any reason why an 18F25K22 @ 64MHz using DT-INTS wouldn't work out just fine.
There's plenty of ram and speed available. For that matter, if you're just transmitting DMX you don't even need to use interrupts... they just make it easier to transmit one packet while generating the next one.

DMX calls for a fixed baudrate of 250K with 8 bits of data + 2 stop bits, so that's 11 bits total (START + 8 data + 2 STOP) for 44us/byte. Note that if you're using a pic uart to send data you'll have to generate the second stop bit using a delay of 4us between bytes, but that's not normally a problem.

DMX uses packets of 513 bytes (a START_CODE byte + 512 bytes of data). Normally packets are sent continuously about 44 times/sec (22.7msecs) which is the max full-out time for 513 bytes, but as long as you send one packet every second that meets the spec. As the transmitter, there's no other byte-to-byte time limit so things are pretty flexible.
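
The 22.7ms figure can be reproduced with a little Python. The 92us break is the minimum BREAK time; the 12us mark-after-break is my assumption for the minimum gap before the START_CODE, so check it against the spec.

```python
SLOT_US = 11 / 250_000 * 1e6   # start + 8 data + 2 stop bits at 250k baud

def frame_time_us(channels=512, break_us=92, mab_us=12):
    """Shortest packet: BREAK + mark-after-break + START_CODE + channel bytes."""
    return break_us + mab_us + (1 + channels) * SLOT_US

t = frame_time_us()
refresh_hz = 1e6 / t
print(f"{t / 1000:.2f} ms per packet, {refresh_hz:.1f} packets/sec")
```

Sending fewer channels shortens the packet, e.g. frame_time_us(channels=64).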

If you're the receiver then things are a bit trickier since you HAVE to assume that you could get 513 bytes at the full-on speed of one byte/44us. That's still doable with a 25K22, but you almost have to use RX interrupts to collect a packet. That said, I've done both transmitters and receivers without using interrupts at all. Sometimes they just complicate things more than they help.

tumbleweed
- 18th August 2016, 14:52
Oh, and somethings else I forgot to mention...

DMX packets start with a BREAK signal, which is a low on the TXD pin of between 92-176us.

For the transmitter I found it easier to generate the DMX BREAK signal using an extra IO pin.
Put a 1-5K series resistor between the pic TX output and the 485 transceiver txd input pins, and tie the IO pin to that junction.


PIC TX out ---/\/\/\---+------>RS485 TX IN
IO pin ________________|


Normally, set the IO pin as an input so it's floating. When you want to generate the BREAK signal set it to output low
for between 92us-176us and then make it an input again. You can also generate the BREAK timing by fidgeting around with the uart baudrate
and transmitting a single 00 byte, but I found this to be a lot simpler.

dsicon
- 19th August 2016, 02:14
thanks for that BREAK idea tumbleweed

HenrikOlsson
- 20th August 2016, 00:25
Ioannis wrote: "I have a feeling that this project is on the limit of the 18F series chips with PBP. Eventually you may need to do a lot in assembly or finally select a 24F chip and C compiler..."
If they could do this in the early 90's then we sure as hell should be able to do it now, with PICs running PBP code at 64MHz, DT-Ints or no DT-Ints.....


Ioannis wrote: "I really have not seen the spec's of the DMX512. Is it necessary to send data continually?"
The specification really only says 1 frame per second minimum but that obviously won't work in practice.


tumbleweed wrote: "Note that if you're using a pic uart to send data you'll have to generate the second stop bit using a delay of 4us between bytes, but that's not normally a problem."
Not sure about that, may depend on the PIC used but at least the 25K22 family supports 9bit transmission, the 9th bit normally being used for parity. Simply enable 9bit transmission and keep the 9th bit at a logic '1' and you'll have your second stop bit.
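
To see why that works, compare the bits on the wire. This is a Python sketch that only models the bit sequence (idle high, start bit low, data LSB first); it is not PIC code and the function names are made up.

```python
def frame_8n2(byte):
    """Start bit, 8 data bits LSB first, two stop bits."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1, 1]

def frame_9bit(byte, ninth_bit=1):
    """Start bit, 8 data bits LSB first, 9th bit, one stop bit."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [ninth_bit] + [1]

# With the 9th bit held at 1 the two frames are bit-for-bit identical
same = all(frame_8n2(b) == frame_9bit(b) for b in range(256))
print(same, len(frame_9bit(0x55)))
```
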


tumbleweed wrote: "For the transmitter I found it easier to generate the DMX BREAK signal using an extra IO pin."
Good idea, however I'd just disable the UART for the duration of the BREAK period, this drives the output pin to whatever state the LAT register dictates. Equally simple but saves an output pin.


tumbleweed wrote: "use two 512 byte packets... one for the main loop to fill (#1) and one for the ISR to transmit (#2)."
Using a double buffer may have its merits but since each channel is a single byte I don't really see the need for it. Is there REALLY a need to keep all channels in sync within 25ms or even less?
Perhaps some special type of fixtures/lamps/whatever uses TWO channels combined to get a 16bit value in which case you COULD potentially change one byte of that value while the other one is being sent but then again, if the update rate is >40Hz, will anyone even notice?

Nah, based on the tests I made, running a PIC at 64MHz should allow you to keep up with the theoretical maximum framerate - even WITH DT-Ints.

/Henrik.

tumbleweed
- 20th August 2016, 11:26
HenrikOlsson wrote: "Simply enable 9bit transmission and keep the 9th bit at a logic '1' and you'll have your second stop bit"
Now that's a slick idea! I'll have to remember that one.