RFC: make _fiber_start() return a handle on the fiber
Benjamin Walsh <benjamin.walsh@...>
On Wed, Feb 24, 2016 at 03:04:41PM +0200, Jukka Rissanen wrote:
> I think I figured this out:

Yeah, that would do it: the task has to yield to the fiber somehow, since it will never pend (in a nanokernel context).

> After fixing these two issues, the fiber wake up seems to work now

Hmm, I would counter this with the argument that passing the current fiber context is the same as passing any other invalid value. There are absolutely no checks being done to see if the handle is a pointer to a valid fiber.

However, I think there is an issue if the caller tries to wake up a fiber that was potentially sleeping but that has woken up, and is not running yet. In that case, that will corrupt the fiber ready queue since the fiber will be pointing to itself for the next fiber to run. If a fiber is not waiting, it should not be added to the ready queue (since it's already in the ready queue). The interesting thing with this fix is that it would catch a fiber trying to wake itself up, since it is not sleeping.
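To make the corruption concrete, here is a minimal sketch of a priority-ordered ready queue and the proposed "only wake a waiting fiber" check. It is not the kernel code: `struct tcs` is reduced to three fields, the `waiting` flag and the names `ready_insert()` and `wakeup_checked()` are hypothetical, and only the `while` loop is taken from the backtrace posted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's thread control structure. */
struct tcs {
    struct tcs *link;   /* next fiber in the ready queue */
    int prio;           /* lower value means higher priority */
    bool waiting;       /* hypothetical flag: true while the fiber sleeps or pends */
};

/*
 * Insert tcs into the priority-ordered list that starts at the dummy node
 * *head. There is no check that tcs is not already in the list: inserting
 * a fiber that is already queued makes it point to itself.
 */
static void ready_insert(struct tcs *head, struct tcs *tcs)
{
    struct tcs *pQ = head;

    assert(head != NULL && tcs != NULL);

    /* Same walk as the loop at nano_fiber.c:53 in the backtrace. */
    while (pQ->link && (tcs->prio >= pQ->link->prio)) {
        pQ = pQ->link;
    }
    tcs->link = pQ->link;
    pQ->link = tcs;
}

/*
 * Guarded wake-up: only a fiber that is actually waiting is made ready.
 * Returns true if the fiber was woken, false if the request was ignored.
 */
static bool wakeup_checked(struct tcs *head, struct tcs *tcs)
{
    if (!tcs->waiting) {
        return false;   /* already ready or running: leave the queue alone */
    }
    tcs->waiting = false;
    ready_insert(head, tcs);
    return true;
}
```

With a single fiber of priority 7 already queued, calling `ready_insert()` on it a second time leaves `link` pointing at the fiber itself. Any later insert of a fiber with priority 7 or lower precedence then never leaves the `while` loop, which would be consistent with the hang reported in the backtrace. `wakeup_checked()` rejects the second call, and for the same reason rejects a fiber trying to wake itself.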
|
|
Jukka Rissanen
I think I figured this out:

- I was trying to wake up the same fiber I was running
- I was always calling fiber_fiber_wakeup() instead of the proper context-dependent version

After fixing these two issues, the fiber wake up seems to work fine now. I would suggest that, if possible, the wake up code check that we are not trying to wake ourselves.

Cheers,
Jukka
On Wed, 2016-02-24 at 13:54 +0200, Jukka Rissanen wrote:
Re-sending this using proper mail address.
|
|
Jukka Rissanen
Re-sending this using proper mail address.
I managed to get following backtrace when the system is hanging:

_nano_fiber_ready (tcs=tcs@entry=0xa800f020 <tx_fiber_stack>)
    at /home/jukka/src/zephyr/kernel/nanokernel/nano_fiber.c:53
53              while (pQ->link && (tcs->prio >= pQ->link->prio)) {
(gdb) bt
#0  _nano_fiber_ready (tcs=tcs@entry=0xa800f020 <tx_fiber_stack>)
    at /home/jukka/src/zephyr/kernel/nanokernel/nano_fiber.c:53
#1  0x40033ea9 in _nano_wait_q_remove_no_check (wait_q=<optimized out>)
    at /home/jukka/src/zephyr/kernel/nanokernel/include/wait_q.h:57
#2  _fifo_put_non_preemptible (fifo=<optimized out>, data=0xa8006520 <tx_buffers>)
    at /home/jukka/src/zephyr/kernel/nanokernel/nano_fifo.c:118
#3  0x4003c77b in net_driver_15_4_send (buf=0xa8006520 <tx_buffers>)
    at /home/jukka/src/zephyr/net/ip/net_driver_15_4.c:81
#4  0x400350f3 in net_tcpip_output (buf=0xa8006520 <tx_buffers>, lladdr=<optimized out>)
    at /home/jukka/src/zephyr/net/ip/net_core.c:801
#5  0x400369e9 in tcpip_ipv6_output (buf=buf@entry=0xa8006520 <tx_buffers>)
    at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:785
#6  0x4003a15f in uip_nd6_rs_output (buf=0xa8006520 <tx_buffers>)
    at /home/jukka/src/zephyr/net/ip/contiki/ipv6/uip-nd6.c:870
#7  0x40038d43 in uip_ds6_send_rs (buf=buf@entry=0x0)
    at /home/jukka/src/zephyr/net/ip/contiki/ipv6/uip-ds6.c:708
#8  0x40036b76 in eventhandler (buf=0x0, data=<optimized out>, ev=118 'v')
    at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:471
#9  process_thread_tcpip_process (process_pt=process_pt@entry=0xa80092b8 <tcpip_process+12>,
    ev=ev@entry=136 '\210', data=0xa800cda0 <uip_ds6_timer_rs>, buf=0x0)
    at /home/jukka/src/zephyr/net/ip/contiki/ip/tcpip.c:885
#10 0x40036c32 in call_process (p=0xa80092ac <tcpip_process>, ev=<optimized out>, data=<optimized out>,
    buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:198
#11 0x40036e90 in process_post_synch (p=<optimized out>, ev=<optimized out>, data=<optimized out>, buf=0x0)
    at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:375
#12 0x400370d7 in process_thread_etimer_process (
    process_pt=process_pt@entry=0xa80092c8 <etimer_process+12>, ev=ev@entry=130 '\202',
    data=<optimized out>, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/etimer.c:117
#13 0x40036c32 in call_process (p=0xa80092bc <etimer_process>, ev=<optimized out>, data=<optimized out>,
    buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:198
#14 0x40036e90 in process_post_synch (p=p@entry=0xa80092bc <etimer_process>, ev=ev@entry=130 '\202',
    data=data@entry=0x0, buf=0x0) at /home/jukka/src/zephyr/net/ip/contiki/os/sys/process.c:375
#15 0x400370fb in etimer_request_poll () at /home/jukka/src/zephyr/net/ip/contiki/os/sys/etimer.c:130
#16 0x40034ac8 in net_timer_fiber () at /home/jukka/src/zephyr/net/ip/net_core.c:674
#17 0x400340e0 in _thread_entry (pEntry=0x40034a7c <net_timer_fiber>, parameter1=<optimized out>,
    parameter2=<optimized out>, parameter3=0x0)
(gdb) p pQ->link
$6 = (struct tcs *) 0xa800e620 <timer_fiber_stack>
(gdb) p tcs->prio
$7 = 7
(gdb) p pQ->link->prio
$8 = 7

Any ideas what is going on?

Cheers,
Jukka
On Wed, 2016-02-24 at 12:03 +0200, Jukka Rissanen wrote:
Hi Peter & Ben,
|
|
Rissanen, Jukka <jukka.rissanen@...>
|
|
Jukka Rissanen
Hi Peter & Ben,
I am seeing a couple of issues when trying to wake up a fiber.

1) If my fiber tries to wake itself up (yes, there was a bug in my program), there seem to be weird issues (hangs). This could just be a symptom of 2).

2) After making sure that I am not trying to wake the running fiber, I am still seeing weird issues. After running several rounds of sleeps and wakes, the OS just hangs when calling fiber_sleep().

I wonder how to debug this, any suggestions?

Cheers,
Jukka
On Mon, 2016-02-22 at 18:07 +0000, Mitsis, Peter wrote:
For what it is worth, I am currently tackling this item.
|
|
Liu, Sharron <sharron.liu@...>
Returning a handle may be useful, since it lets the app invoke other fiber-related APIs on the same fiber.

But regarding the case described below, I have a question: can "nano_sem_give" be used instead of requesting fiber_wakeup()? From the Zephyr documentation I learnt that both "semaphore wait" and "sleep" block the execution of the fiber and trigger the scheduler to dequeue another executable fiber.

Thanks,
Sharron
-----Original Message-----
From: Benjamin Walsh [mailto:benjamin.walsh(a)windriver.com]
Sent: Saturday, February 20, 2016 1:17 AM
To: Nashif, Anas <anas.nashif(a)intel.com>
Cc: devel(a)lists.zephyrproject.org; Rissanen, Jukka <jukka.rissanen(a)intel.com>
Subject: [devel] Re: RFC: make _fiber_start() return a handle on the fiber

Sure.

> You are right, this will not break existing code. Nevertheless, we
> We're just returning a thread ID when we start a fiber now, you can
> When we start a fiber via the _fiber_start() API family, we don't
> Sounds good, but we need to do it in a way that keeps APIs
|
|
Mitsis, Peter <Peter.Mitsis@...>
For what it is worth, I am currently tackling this item.
Peter
-----Original Message-----
|
|
Jukka Rissanen
Hi Ben,
On Fri, 2016-02-19 at 10:36 -0500, Benjamin Walsh wrote:
> Looks like it. You want to do the implementation yourself Jukka ?
> When we start a fiber via the _fiber_start() API family, we don't
> No comments -> no objections -> perhaps we can continue with this

I am a bit busy right now so I am fine if you do it.

Cheers,
Jukka
|
|
Benjamin Walsh <benjamin.walsh@...>
Sure.

> You are right, this will not break existing code. Nevertheless, we
> We're just returning a thread ID when we start a fiber now, you can
> When we start a fiber via the _fiber_start() API family, we don't
> Sounds good, but we need to do it in a way that keeps APIs
|
|
Nashif, Anas
On 19/02/2016, 08:40, "Benjamin Walsh" <benjamin.walsh(a)windriver.com> wrote:
> On Fri, Feb 19, 2016 at 11:07:33AM -0500, Nashif, Anas wrote:
> > On Feb 16, 2016, at 11:35, Benjamin Walsh <benjamin.walsh(a)windriver.com> wrote:
> > Sounds good, but we need to do it in a way that keeps APIs compatible I guess.
> We're just returning a thread ID when we start a fiber now, you can

You are right, this will not break existing code. Nevertheless, we should track such API changes and document them in release notes.

Anas
|
|
Benjamin Walsh <benjamin.walsh@...>
On Fri, Feb 19, 2016 at 11:07:33AM -0500, Nashif, Anas wrote:
> On Feb 16, 2016, at 11:35, Benjamin Walsh <benjamin.walsh(a)windriver.com> wrote:
> Sounds good, but we need to do it in a way that keeps APIs compatible I guess.

We're just returning a thread ID when we start a fiber now, you can ignore it. I don't see an issue here...
|
|
Benjamin Walsh <benjamin.walsh@...>
On Fri, Feb 19, 2016 at 08:12:04AM -0800, Dirk Brandewie wrote:
Ugh, you're right, it's a nano_thread_id_t.
|
|
Dirk Brandewie <dirk.j.brandewie@...>
On 02/16/2016 11:34 AM, Benjamin Walsh wrote:
> Folks,

Makes sense but why not tell the compiler the truth about the type?

> Objections, comments, etc ?
|
|
Nashif, Anas
On Feb 16, 2016, at 11:35, Benjamin Walsh <benjamin.walsh(a)windriver.com> wrote:

Sounds good, but we need to do it in a way that keeps APIs compatible I guess.

Anas

> Cheers,
|
|
Benjamin Walsh <benjamin.walsh@...>
Looks like it. You want to do the implementation yourself Jukka ?

> When we start a fiber via the _fiber_start() API family, we don't
> No comments -> no objections -> perhaps we can continue with this route
|
|
Jukka Rissanen
Hi,
On Tue, 2016-02-16 at 14:34 -0500, Benjamin Walsh wrote:
> Folks,

No comments -> no objections -> perhaps we can continue with this route then?

Cheers,
Jukka
|
|
Benjamin Walsh <benjamin.walsh@...>
Folks,
When we start a fiber via the _fiber_start() API family, we don't get back a handle on the created fiber. The fiber identifier is actually the start of the fiber's stack. This hasn't been a problem until now since no API requires a handle on the fiber, except one, fiber_delayed_start_cancel(): that API is part of a pair, where the other API, fiber_delayed_start(), starts the fiber and returns a handle.

However, Jukka asked me if an API could be created that cancels a fiber_sleep() call, something like fiber_wakeup(). The implementation of such an API is very simple, but it requires a handle on the fiber we want to wake up. This in turn requires the _fiber_start() family to return a handle on the fiber that gets started.

The signature of _fiber_start() et al. would then change from a void return type to a void * return type.

Objections, comments, etc ?

Cheers,
Ben

--
Benjamin Walsh, SMTS
Wind River Rocket
Zephyr kernel maintainer
www.windriver.com
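As a rough illustration of the proposal, the sketch below shows a start call that returns a handle and a wake-up call that consumes it. Everything here is hypothetical: the `sketch_` names, the fixed-size table and the `sleeping` flag are stand-ins for illustration, not the kernel's API, and no scheduling is done. The handle is given its own pointer type rather than `void *`, along the lines suggested later in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-fiber bookkeeping; the real kernel keeps far more state. */
struct sketch_fiber {
    void (*entry)(void);  /* fiber entry point (never called in this sketch) */
    bool sleeping;        /* true between sketch_fiber_sleep() and wake-up */
};

/* Typed handle instead of a bare void pointer. */
typedef struct sketch_fiber *sketch_thread_id_t;

#define SKETCH_MAX_FIBERS 4

static struct sketch_fiber sketch_fibers[SKETCH_MAX_FIBERS];
static size_t sketch_fiber_count;

/* Start a fiber and return a handle on it, or NULL if the table is full. */
static sketch_thread_id_t sketch_fiber_start(void (*entry)(void))
{
    struct sketch_fiber *f;

    if (sketch_fiber_count == SKETCH_MAX_FIBERS) {
        return NULL;
    }
    f = &sketch_fibers[sketch_fiber_count++];
    f->entry = entry;
    f->sleeping = false;
    return f;
}

/* Mark the fiber as sleeping (stands in for a fiber_sleep() call). */
static void sketch_fiber_sleep(sketch_thread_id_t id)
{
    assert(id != NULL);
    id->sleeping = true;
}

/* Cancel a pending sleep. Returns false if the fiber was not sleeping. */
static bool sketch_fiber_wakeup(sketch_thread_id_t id)
{
    assert(id != NULL);
    if (!id->sleeping) {
        return false;
    }
    id->sleeping = false;
    return true;
}
```

A caller that does not need the handle can simply ignore the return value of `sketch_fiber_start()`, which is why a change from a `void` return type does not break existing callers.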
|
|