Re: Fibers Become Unrunnable in Nanokernel

Benjamin Walsh <benjamin.walsh@...>

Hi Michael,

I am working off of Zephyr 1.5 in the nanokernel configuration on a
custom Intel Curie board. We have a number of fibers running on both
cores, but seem to be having trouble with fibers on the ARC core. We
haven't been able to pin down why, but occasionally (or sometimes
rapidly), a number of fibers seem to simply stop running. Diving into
the issue in GDB reveals that while all the fibers have roughly the
form below, some fibers don't appear to be sleeping NOR are they on
the runnable list:

void fiber(int a, int b) {
    struct nano_timer timer;

    nano_timer_init(&timer, NULL);

    while (1) {
        nano_timer_start(&timer, MSEC(FIBER_PERIOD_MS));
        ... // Do work
        nano_timer_test(&timer, TICKS_UNLIMITED);
    }
}

As far as I can tell, the "timer" is expired and the struct tcs's for
the fibers are not in the runnable list. All other fibers in the
system on ARC seem to be in the runnable list as expected. Also, from
some basic stack analysis, it appears that the unrunnable fibers are
still in the nano_timer_test function. One thing worth noting is that
most fibers are just doing some math and storing it in memory, but two
of them are accessing a SPI and I2C device. When these fibers
are prevented from accessing the device, the system seems to run
smoothly; otherwise it doesn't. Has anything like this ever been
encountered before?

Note also that moving to Zephyr 1.6 would be a significant effort as we
have implemented a number of custom drivers and other features that
would take significant time to port.

This does not really solve your problem, but Zephyr 1.6 contains a
legacy layer that provides all the APIs of the old kernels on top of the
new kernel. It's not a NOP to move to 1.6, since you might have some
issues with e.g. stack sizes, or some other idiosyncrasies, but it might
be less painful than you think.

About your issue: the first thing I always suspect with weird behaviour
like this is stack smashing. There is a kconfig option for ARC that
enables stack overflow/underflow detection. Do you have that option
enabled?
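
If the hardware check is not available in your configuration, a software
high-water-mark check can give a similar hint. What follows is only a
rough sketch, not kernel code: the names stack_paint, stack_unused and
STACK_FILL_BYTE are made up for illustration, and it assumes a stack
that grows downward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel value; any byte unlikely to appear by chance works. */
#define STACK_FILL_BYTE 0xAA

/* Fill the whole stack buffer with the sentinel before the fiber starts. */
static void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL_BYTE, size);
}

/*
 * Count the sentinel bytes still intact at the low end of the buffer.
 * With a downward-growing stack, a result of 0 means the fiber has
 * used, or overrun, the entire region.
 */
static size_t stack_unused(const uint8_t *stack, size_t size)
{
    size_t n = 0;

    assert(stack != NULL);
    while (n < size && stack[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

Paint each stack buffer before handing it to the fiber start call, then
inspect stack_unused() from GDB or a low-priority fiber once the problem
shows up. If the kernel keeps its struct tcs at the low end of the same
buffer (I believe the nanokernel does, but please check your tree), pass
only the region above it; an overflow in that layout would land on the
tcs itself, which would fit the symptoms you describe.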

Benjamin Walsh, SMTS
WR VxWorks Virtualization Profile
Zephyr kernel maintainer
