Re: (Big) problems achieving fair scheduling of a semaphore ISR producer - thread consumer case


Paul Sokolovsky
 

On Fri, 7 Apr 2017 09:56:30 +0100
Daniel Thompson <daniel.thompson@linaro.org> wrote:

[]

I remember yesterday you said you had disabled all the local
echos... did you reach a point where the TX/RX ratio was less than
1:1?
Ok, master, following patch:
;-)

I've been very careful not to say you're wrong. I *only* said the
example you brought up was not a reproduction of an incorrect thread
schedule.
Well, I'd love to be pointed to where I'm wrong and be shown how to do it
right. For the background, receiving a char in an interrupt or from a
callback, storing it in a buffer and then letting the main application pull
chars from the buffer (or block on it) is a very mundane task, with many
systems working like that. In MicroPython, we have that implemented on
bare metal (well, via STM32Cube/CC3200 HAL libs), in a proprietary
Xtensa cooperative multitasking RTOS (well, backed by bare-metal
handling too), and in STM32 USB-CDC, and all that "just works".
Zephyr is so far the only one (of those tried) which has problems.

As UART is a pretty simple device and you can do only so much about
handling it, and I assume all of that was already done in
Zephyr, I blame the threading/sync primitives, as the much more
differentiating factor. I may be very wrong (and am ready to "eat my hat").
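
For reference, the "mundane task" described above can be sketched as a single-producer/single-consumer ring buffer. This is a minimal illustration in portable C, not the Zephyr console code: the names rx_put, rx_get, rx_dropped and RX_BUF_SIZE are hypothetical, and a real consumer would block on a semaphore signalled by the ISR instead of returning -1 when the buffer is empty.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size; must be a power of two so index wrap-around works. */
#define RX_BUF_SIZE 16u

static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile unsigned rx_head;   /* advanced only by the producer (ISR) */
static volatile unsigned rx_tail;   /* advanced only by the consumer (thread) */
unsigned rx_dropped;                /* chars lost because the buffer was full */

/* Producer side: what a UART RX interrupt handler would call per char. */
bool rx_put(uint8_t c)
{
    if (rx_head - rx_tail == RX_BUF_SIZE) {
        rx_dropped++;               /* full: drop the char, never block in an ISR */
        return false;
    }
    rx_buf[rx_head % RX_BUF_SIZE] = c;
    rx_head++;                      /* publish only after the slot is written */
    return true;
}

/* Consumer side: returns the next char, or -1 if the buffer is empty. */
int rx_get(void)
{
    if (rx_head == rx_tail) {
        return -1;
    }
    uint8_t c = rx_buf[rx_tail % RX_BUF_SIZE];
    rx_tail++;                      /* free the slot only after reading it */
    return c;
}
```

Each index has exactly one writer, so on a single core no lock is needed between the ISR and the thread. On a real target the two index updates may additionally need memory barriers; that is left out of this sketch.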



--- a/samples/subsys/console/getchar/src/main.c
+++ b/samples/subsys/console/getchar/src/main.c
@@ -9,6 +9,6 @@ void main(void)
 	while (1) {
 		uint8_t c = console_getchar();

-		printk("char: [0x%x] %c\n", c, c);
+//		printk("char: [0x%x] %c\n", c, c);
 	}
 }

...
Console buffer overflow - char dropped
Console buffer overflow - char dropped
...
Did you run this on carbon? I've not been able to reproduce on this
board.
Ok, now I did. With the patch above, I can't reproduce the overflow
message. But now let's try to echo the input char, and for a
difference, set CONFIG_CONSOLE_GETCHAR_BUFSIZE=64 so we didn't suspect
too short a buffer.
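
To illustrate why the TX/RX ratio matters more than the buffer size, here is a small idealised model (not Zephyr code, and not a reproduction of the hang). It assumes one tick equals one character time on the wire (about 86.8 us at 115200 baud with 8N1 framing, i.e. 10 bits per char), that one char arrives every tick during a paste, and that the consumer thread is blocked in polled TX for tx_per_rx ticks per char it consumes. The names simulate_paste and char_time_us are hypothetical; a tx_per_rx of 15 roughly corresponds to the "char: [0x61] a\n" line printed by the original sample.

```c
#include <stdio.h>

/* Time on the wire for one char, in microseconds. */
double char_time_us(unsigned baud, unsigned bits_per_char)
{
    return 1e6 * (double)bits_per_char / (double)baud;
}

/*
 * Idealised paste: n_chars arrive back to back, one per tick.
 * The consumer takes one char from the buffer when idle and is then
 * busy (polled TX) for tx_per_rx ticks. Returns the number of chars
 * dropped because the buffer was full on arrival.
 */
unsigned simulate_paste(unsigned n_chars, unsigned buf_size, unsigned tx_per_rx)
{
    unsigned arrived = 0, queued = 0, dropped = 0, busy = 0;

    while (arrived < n_chars || queued > 0 || busy > 0) {
        /* ISR side: one incoming char per tick while the paste lasts. */
        if (arrived < n_chars) {
            arrived++;
            if (queued < buf_size) {
                queued++;
            } else {
                dropped++;
            }
        }
        /* Thread side: start handling the next char once idle. */
        if (busy == 0 && queued > 0) {
            queued--;
            busy = tx_per_rx;
        }
        if (busy > 0) {
            busy--;
        }
    }
    return dropped;
}

void print_model(void)
{
    printf("char time: %.1f us\n", char_time_us(115200, 10));
    printf("1:1 echo, 64-byte buffer: %u dropped\n", simulate_paste(300, 64, 1));
    printf("15:1 printk, 64-byte buffer: %u dropped\n", simulate_paste(300, 64, 15));
}
```

In this model a 1:1 echo never overflows, while at 15:1 a larger buffer only postpones the first dropped char unless it can hold the whole paste. The model says nothing about lockups, which is why the behaviour described next points at the UART handling itself.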

I still can't reproduce the overflow message, but now we're entering the
land of obvious UART problems. So, to keep a 1:1 rx/tx ratio, we translate
input CR using picocom:

picocom -b115200 --omap crcrlf --emap "" /dev/ttyUSB0

Now paste the whole "Ok, now I did ..." para above. The behavior: first
subchunk being pasted, the noticeable pause, then second subchunk
appears, at which point board no longer reacts to input and appears
dead.

Blame "my" code? Well, fire up the classical samples/subsys/shell, and
paste there:

Attempt 1:
shell> Ok, now I id. With
(lockup)

Attempt 2:
shell> Ok, now Idid. With atch abov, I can't eproduce te overflo
mssage.But now le's try toecho the iput char, nd for a
dference, et CONFIG_ONSOLE_GECHAR_BUFSIE=64 so wedidn't susect

Attempt 3:
Could paste the whole thing 5 times, with wildly lost chars, before the
thing finally gave up and locked up.

All that is nothing new. See e.g.
https://jira.zephyrproject.org/browse/ZEP-467 and what patch "fixed"
it. I write "fixed" because, well, the hang indeed seems to be fixed on
arduino_101 \o/, but it all still works, umm, in the expected way:

shell> Now paste the whole "Ok, now I did ..." para above. The behavor: frst
sbchun bein past, th notieablepause thensecon subcunk
apears at wich pnt bord nolonge reacs to nput nd apeaToro msan

(But again, it passed the "UART, ship it" test of being pasted into for a
minute without hangs.)

So, we can probably come to a conclusion: for reliable operation, both UART
rx *and* tx should be buffered and interrupt-driven.

It *does* reproduce on qemu_cortex_m3 but the UART simulation looks a
bit dubious. I can't directly test the input but the simulated UART
is *very* obviously producing characters way faster than the normal
baud rate... it appears to simply be running as fast as it can.
So, we can come to a 2nd conclusion: QEMU emulation can't be trusted much.
That's the saddest part, because I was interested first and foremost
in local QEMU test automation. I mean, the saddest part is that
it's unclear what to do about it. Do you mean that you looked at the
source of the old-old QEMU version used by the Zephyr SDK and saw this
simplistic UART emulation? That "old-old" part is what deterred me from
doing the same; I guess I should try QEMU 2.8 after all. But if it's
not fixed there, I'm at a loss again: should I try all the gazillion
forks around, or try to fix it myself against mainline (I haven't hacked on
qemu for a while)?..



Daniel.


--
Best Regards,
Paul
