Re: Fibers Become Unrunnable in Nanokernel

Benjamin Walsh <benjamin.walsh@...>

On Wed, Feb 22, 2017 at 09:39:26PM -0500, Rosen, Michael R wrote:
As far as I can tell, the "timer" is expired and the struct tcs's
for the fibers are not in the runnable list. All other fibers in the
system on ARC seem to be in the runnable list as expected. Also,
from some basic stack analysis, it appears that the unrunnable
fibers are still in the nano_timer_test function. One thing worth
noting is that while most fibers are just doing some math and
storing it in memory, two of them are accessing an SPI and an I2C
device. When these fibers are prevented from accessing the device,
the system seems to run smoothly; otherwise it doesn't. Has anything
like this ever been encountered before?

Note also that moving to Zephyr 1.6 would be a significant effort as
we have implemented a number of custom drivers and other features
that would take a significant time to port.
This does not really solve your problem, but Zephyr 1.6 contains a
legacy layer that provides all the APIs of the old kernels on top of
the new kernel. It's not a NOP to move to 1.6, since you might have
some issues with e.g. stack sizes, or some other idiosyncrasies,
but it might be less painful than you think.

About your issue: the first thing I always suspect with weird
behaviour like this is stack smashing. There is a kconfig option for
ARC that enables stack overflow/underflow detection. Do you have
that option enabled?
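
For readers without hardware stack checking, a software guard can catch the same class of bug. What follows is only a minimal sketch of a generic sentinel check, not the ARC stack-checking feature mentioned above and not a Zephyr API. The names `STACK_SENTINEL`, `GUARD_WORDS`, `stack_guard_init` and `stack_guard_intact` are hypothetical, and the stack is assumed to grow downward so that an overflow hits the lowest addresses first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_SENTINEL 0xA5A5A5A5u /* hypothetical guard pattern */
#define GUARD_WORDS    4           /* hypothetical guard size, in 32-bit words */

/*
 * Write the sentinel into the lowest words of a fiber's stack area.
 * Assumption: the stack grows downward, so these words are the last
 * ones reached before the fiber runs off the end of its stack.
 */
void stack_guard_init(uint32_t *stack_base, size_t words)
{
    assert(stack_base != NULL && words >= GUARD_WORDS);
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        stack_base[i] = STACK_SENTINEL;
    }
}

/* Return 1 if every guard word is untouched, 0 if any was overwritten. */
int stack_guard_intact(const uint32_t *stack_base)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        if (stack_base[i] != STACK_SENTINEL) {
            return 0;
        }
    }
    return 1;
}
```

One would call `stack_guard_init()` on the stack buffer before starting the fiber and poll `stack_guard_intact()` from some periodic context. A check like this only detects an overflow after the fact, and it misses writes that jump past the guard words.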
Just to update you and the mailing list; I think the issue is one you
solved for 1.6. However, it's not tracked on JIRA or in the release
notes, so I didn't realize such a critical bug was not fixed in 1.5.
The commit in question is 5986ec040b. As this is a very specific
timing bug, we are still validating our code to be 100% sure it's
fixed, but it's looking good so far.
Sorry, I did not have much time to look at this. But yes, this looks
indeed like a problem even in the pre-unified kernel (pre-1.6).
