I think I figured this out:
- I was trying to wake up the same fiber I was running in
- I was always calling fiber_fiber_wakeup() instead of the variant that matches the calling context
After fixing these two issues, the fiber wake-up seems to work fine now.
I would suggest that if possible, the wake up code could check that we are not trying to wake ourselves.
Cheers, Jukka
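The self-wake check suggested above could look roughly like the sketch below. It is not Zephyr code: struct sketch_fiber, sketch_current and sketch_fiber_wakeup() are hypothetical names invented for illustration, and the real nanokernel structures and wake-up path are not reproduced here. It only shows the idea of rejecting a request to wake the currently running fiber instead of touching any queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fiber control block (NOT Zephyr's struct tcs). */
struct sketch_fiber {
    int prio;
    bool sleeping;
};

/* Hypothetical pointer to the fiber that is currently running. */
static struct sketch_fiber *sketch_current;

/*
 * Wake a sleeping fiber.
 * Returns 0 on success, -1 if the request is rejected: NULL handle,
 * an attempt to wake the running fiber, or a target that is not sleeping.
 */
static int sketch_fiber_wakeup(struct sketch_fiber *target)
{
    if (target == NULL || target == sketch_current) {
        return -1; /* refuse to wake ourselves */
    }
    if (!target->sleeping) {
        return -1; /* nothing to cancel */
    }
    target->sleeping = false; /* a real kernel would also requeue the fiber */
    return 0;
}
```

Returning an error rather than silently ignoring the call is a design choice for the sketch; an assertion in debug builds would be another option.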
On Wed, 2016-02-24 at 13:54 +0200, Jukka Rissanen wrote: Re-sending this using proper mail address.
I managed to get the following backtrace when the system is hanging:
_nano_fiber_ready (tcs=tcs@entry=0xa800f020 <tx_fiber_stack>) at /home/jukka/src/zephyr/kernel/nanokernel/nano_fiber.c:53
53 while (pQ->link && (tcs->prio >= pQ->link->prio)) {
(gdb) bt
#0 _nano_fiber_ready (tcs=tcs@entry=0xa800f020 <tx_fiber_stack>) at /home/jukka/src/zephyr/kernel/nanokernel/nano_fiber.c:53
#1 0x40033ea9 in _nano_wait_q_remove_no_check (wait_q=<optimized out>) at /home/jukka/src/zephyr/kernel/nanokernel/include/wait_q.h:57
#2 _fifo_put_non_preemptible (fifo=<optimized out>, data=0xa8006520 <tx_buffers>) at /home/jukka/src/zephyr/kernel/nanokernel/nano_fifo.c:118
#3 0x4003c77b in net_driver_15_4_send (buf=0xa8006520 <tx_buffers>) at /home/jukka/src/zephyr/net/ip/net_driver_15_4.c:81
#4 0x400350f3 in net_tcpip_output (buf=0xa8006520 <tx_buffers>, lladdr=<optimized out>) at /home/jukka/src/zephyr/net/ip/net_core.c:801
#5 0x400369e9 in tcpip_ipv6_output (buf=buf@entry=0xa8006520 <tx_buffers>) at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:785
#6 0x4003a15f in uip_nd6_rs_output (buf=0xa8006520 <tx_buffers>) at /home/jukka/src/zephyr/net/ip/contiki/ipv6/uip-nd6.c:870
#7 0x40038d43 in uip_ds6_send_rs (buf=buf@entry=0x0) at /home/jukka/src/zephyr/net/ip/contiki/ipv6/uip-ds6.c:708
#8 0x40036b76 in eventhandler (buf=0x0, data=<optimized out>, ev=118 'v') at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:471
#9 process_thread_tcpip_process (process_pt=process_pt@entry=0xa80092b8 <tcpip_process+12>, ev=ev@entry=136 '\210', data=0xa800cda0 <uip_ds6_timer_rs>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:885
#10 0x40036c32 in call_process (p=0xa80092ac <tcpip_process>, ev=<optimized out>, data=<optimized out>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:198
#11 0x40036e90 in process_post_synch (p=<optimized out>, ev=<optimized out>, data=<optimized out>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:375
#12 0x400370d7 in process_thread_etimer_process (process_pt=process_pt@entry=0xa80092c8 <etimer_process+12>, ev=ev@entry=130 '\202', data=<optimized out>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/etimer.c:117
#13 0x40036c32 in call_process (p=0xa80092bc <etimer_process>, ev=<optimized out>, data=<optimized out>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:198
#14 0x40036e90 in process_post_synch (p=p@entry=0xa80092bc <etimer_process>, ev=ev@entry=130 '\202', data=data@entry=0x0, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:375
#15 0x400370fb in etimer_request_poll () at /home/jukka/src/zephyr/net/ip/contiki/os/sys/etimer.c:130
#16 0x40034ac8 in net_timer_fiber () at /home/jukka/src/zephyr/net/ip/net_core.c:674
#17 0x400340e0 in _thread_entry (pEntry=0x40034a7c <net_timer_fiber>, parameter1=<optimized out>, parameter2=<optimized out>, parameter3=0x0)
(gdb) p pQ->link
$6 = (struct tcs *) 0xa800e620 <timer_fiber_stack>
(gdb) p tcs->prio
$7 = 7
(gdb) p pQ->link->prio
$8 = 7
Any ideas what is going on?
Cheers, Jukka
On Wed, 2016-02-24 at 12:03 +0200, Jukka Rissanen wrote:
Hi Peter & Ben,
I am seeing a couple of issues when trying to wake up a fiber.
1) If my fiber tries to wake itself up (yes, there was a bug in my program), there seem to be weird issues (hangs). This could just be a symptom of 2).
2) after making sure that I am not trying to wake the running fiber, I am seeing weird issues. After running several rounds of sleeps and wakes, the OS just hangs when calling fiber_sleep(). I wonder how to debug this, any suggestions?
Cheers, Jukka
On Mon, 2016-02-22 at 18:07 +0000, Mitsis, Peter wrote:
For what it is worth, I am currently tackling this item.
Peter
-----Original Message-----
From: Jukka Rissanen [mailto:jukka.rissanen(a)linux.intel.com]
Sent: February-22-16 2:01 AM
To: Walsh, Benjamin
Cc: devel(a)lists.zephyrproject.org
Subject: [devel] Re: Re: RFC: make _fiber_start() return a handle on the fiber
Hi Ben,
On Fri, 2016-02-19 at 10:36 -0500, Benjamin Walsh wrote:
When we start a fiber via the _fiber_start() API family, we don't get back a handle on the created fiber. The fiber identifier is actually the start of the fiber's stack. This hasn't been a problem until now since no API requires a handle on the fiber, except one, fiber_delayed_start_cancel(): that API is part of a pair, where the other API, fiber_delayed_start() starts the fiber and returns a handle.
However, Jukka asked me if an API could be created that cancels a fiber_sleep() call, something like fiber_wakeup(). The implementation of such an API is very simple, but it requires a handle on the fiber we want to wake up. This in turn requires the _fiber_start() family to return a handle on the fiber that gets started.
The signature of _fiber_start() et al. would then change from a void return type to a void * return type.
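To make the proposed change concrete, here is a small self-contained sketch of a start routine that returns a void * handle which a wake-up call can later consume. Everything in it is an assumption made for illustration: sketch_tcs, sketch_fiber_mem, sketch_fiber_start() and sketch_wakeup_by_handle() are hypothetical names, and the parameter list is not the real _fiber_start() signature. The only thing taken from the RFC is that the handle is the start of the fiber's stack area and that the return type becomes void *.

```c
#include <stddef.h>

typedef void (*sketch_entry_t)(int arg1, int arg2);

/* Hypothetical control block (NOT Zephyr's struct tcs). */
struct sketch_tcs {
    sketch_entry_t entry;
    int prio;
    int sleeping; /* 1 while the fiber is in a sleep call */
};

/* Hypothetical stack area: the control block sits at its very start. */
struct sketch_fiber_mem {
    struct sketch_tcs tcs;
    char stack[256];
};

/* Trivial entry point used only as an example argument. */
static void sketch_idle_entry(int arg1, int arg2)
{
    (void)arg1;
    (void)arg2;
}

/* Returns a handle on the started fiber instead of void. */
static void *sketch_fiber_start(struct sketch_fiber_mem *mem,
                                sketch_entry_t entry, int prio)
{
    if (mem == NULL || entry == NULL) {
        return NULL;
    }
    mem->tcs.entry = entry;
    mem->tcs.prio = prio;
    mem->tcs.sleeping = 0;
    return &mem->tcs; /* same address as the start of the stack area */
}

/* Cancels a pending sleep using the handle; returns 0 if a sleep was cancelled. */
static int sketch_wakeup_by_handle(void *handle)
{
    struct sketch_tcs *tcs = handle;

    if (tcs == NULL || !tcs->sleeping) {
        return -1;
    }
    tcs->sleeping = 0;
    return 0;
}
```

A caller would keep the value returned by the start routine and pass it to the wake-up routine later; callers that ignore the return value keep working unchanged.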
Objections, comments, etc.?
No comments -> no objections -> perhaps we can continue with this route then?
Looks like it. You want to do the implementation yourself, Jukka?
I am a bit busy right now so I am fine if you do it.
Cheers, Jukka