Re: (Big) problems achieving fair scheduling of a semaphore ISR producer - thread consumer case


Paul Sokolovsky
 

Hello Daniel,

On Fri, 7 Apr 2017 12:38:34 +0100
Daniel Thompson <daniel.thompson@linaro.org> wrote:

[]

...
Console buffer overflow - char dropped
Console buffer overflow - char dropped
...
Did you run this on carbon? I've not been able to reproduce on this
board.
Ok, now I did. With the patch above, I can't reproduce the overflow
message. But now let's try to echo the input char and, for a
change, set CONFIG_CONSOLE_GETCHAR_BUFSIZE=64 so we don't
suspect too short a buffer.

I still can't reproduce overflow message,
... funny you should say that.

I've been taking a quick glance at the vendor HAL here. It implements
some kind of weird private locking scheme that looks like it doesn't
play nicely with a pre-emptive RTOS (or even with interrupt handlers).
Ack, thanks, I see it.

[]

So, we can probably come to a conclusion: for reliable operation, both
UART rx *and* tx should be buffered and interrupt-driven.
I don't understand how the evidence leads to this conclusion.
Well, it's reasoning from the opposite direction, let's recount:

1. As was quoted, MicroPython has multiple implementations of that
handling for multiple environments, all "just working". Output
buffering is used at all times though.

2. Pure mathematical reasoning: if a char at a given baud rate is
received every X cycles, then busy-polling to send a char will also take
X cycles. But while the hardware receives a char every X cycles on its
own, before we can spend X cycles to send it, we first also need to
spend some other cycles to handle an interrupt, extract the received
char, etc. So, the ratio is never 1:1, but instead 1:1.xx, so we will
steadily fall behind, and eventually lose chars or overflow.

3. Simply for lack of better leads on the problem ("broken UART
drivers and handling" started to sound much more convincing than the
original "scheduling is broken" guess, which couldn't be proven).
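
To put rough numbers on point 2, here is a minimal sketch in C of a toy
timing model. It is not Zephyr code, and `simulate_echo` and all of its
parameters are made up for illustration. It assumes a char arrives every
`char_cycles` cycles, that the consumer spends `overhead_cycles` per char
on interrupt handling and buffer bookkeeping, and that polled TX adds
another `char_cycles` of busy-waiting. Buffered, interrupt-driven TX is
assumed to add nothing to the consumer's time, because the hardware drains
the TX buffer at line rate on its own.

```c
#include <stdint.h>

/*
 * Toy model of a UART echo loop (hypothetical, not Zephyr code).
 * Returns how many received chars are dropped because the RX buffer
 * (rx_buf_size entries) is full when they arrive.
 */
unsigned simulate_echo(unsigned n_chars, unsigned char_cycles,
                       unsigned overhead_cycles, unsigned rx_buf_size,
                       int polled_tx)
{
    /* Consumer time per char: fixed overhead, plus a full char time
     * of busy-waiting on the transmitter if TX is polled. */
    uint64_t service = overhead_cycles + (polled_tx ? char_cycles : 0u);
    uint64_t free_at = 0;   /* time at which the consumer goes idle */
    unsigned queued = 0;    /* chars waiting in the RX buffer */
    unsigned dropped = 0;

    for (unsigned i = 0; i < n_chars; i++) {
        /* The hardware delivers one char every char_cycles. */
        uint64_t now = (uint64_t)(i + 1) * char_cycles;

        /* Let the consumer work through its backlog up to 'now'. */
        while (queued > 0 && free_at <= now) {
            queued--;
            free_at += service;
        }

        if (free_at <= now) {
            /* Consumer is idle: it takes the char right away. */
            free_at = now + service;
        } else if (queued < rx_buf_size) {
            queued++;       /* consumer busy, char waits in the buffer */
        } else {
            dropped++;      /* buffer full: this char is lost */
        }
    }
    return dropped;
}
```

Under these assumptions, simulate_echo(10000, 1000, 50, 64, 1) starts
dropping chars once the 64-char buffer fills up, while the same call with
polled_tx = 0 drops none. A bigger buffer only postpones the overflow in
the polled case, which is the 1:1.xx argument in a nutshell.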

Personally I'm still rather concerned that the driver may not be
robust (although, just as you blaming the scheduler, I haven't
collected much evidence to support this).
"The driver"? It's "the drivers, and for a while". If you think that
the problem is peculiar to stm32, then there was the example of
arduino_101 having had a big problem, and still having some; frdm_k64f
also had (== has) similar problems, etc.

I've done a quick review of the driver and so far I haven't seen
anything that explains the character loss (although it would be good to
neutralize the private locking so we can see the output from the ISR).

I'm afraid, for me, time is up thinking about this for a while.
However, if I was looking at this further, I'd consider reinstating
the old UART driver (HAL is sufficiently complexified that it becomes
hard to reason about) and see if it behaves the same way...
Well, thanks for helping the investigation, I appreciate that, as
usual! Reinstate the old driver - for one-off testing? If so, I'd find
hack-hacking the existing one to be more practical. And anything beyond
that borders on organizational and political issues. Suffice it to say
that others provided feedback on that (p.1 at
https://youtu.be/XUJK2htXxKw?t=1885), I don't want to go there ;-).

[]

--
Best Regards,
Paul

