Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY. 2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above. 3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled" field to zero. 4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events that the thread has not handled so far (if any)?
|
|
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
toggle quoted messageShow quoted text
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY. 2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above. 3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled" field to zero. 4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events that the thread has not handled so far (if any)?
|
|
toggle quoted messageShow quoted text
From: devel@... <devel@...>
On Behalf Of Venkatesh Sukumaran via lists.zephyrproject.org
Sent: 08 November 2021 08:08
To: devel@...
Subject: Re: [Zephyr-devel] Issue with k_poll
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY.
2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above.
3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled" field
to zero.
4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events
that the thread has not handled so far (if any)?
|
|
That's right: k_poll() is level triggered, it's not a queue. The number of wakeups is not guaranteed to be equal to the number of events signalled. The idea is that you go to sleep waiting on any of a number of events/things to occur and are woken up when
at least one of them does. But then you need to do the accounting yourself to figure out who was signaled (and how many times) and what should happen.
In this particular use case it sounds like you should be polling on a semaphore and not a signal. Semaphores do provide proper counting of how many give/take calls are made.
Andy
toggle quoted messageShow quoted text
From: Cufi, Carles <Carles.Cufi@...>
Sent: Monday, November 8, 2021 1:12 AM
To: venkatesh.sukumaran@... <venkatesh.sukumaran@...>; devel@... <devel@...>; Ross, Andrew J <andrew.j.ross@...>; Leung, Daniel <daniel.leung@...>
Subject: RE: [Zephyr-devel] Issue with k_poll
+ Andy, Daniel
From: devel@... <devel@...>
On Behalf Of Venkatesh Sukumaran via lists.zephyrproject.org
Sent: 08 November 2021 08:08
To: devel@...
Subject: Re: [Zephyr-devel] Issue with k_poll
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY.
2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above.
3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled"
field to zero.
4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events
that the thread has not handled so far (if any)?
|
|
I understand that the number of wakeups is not equal to the number of events signalled but the problem here is that the pending thread can *miss* a signal. The "signalled" field is being reset to zero by the pending thread after servicing the signal handler and at that time, the same signal could be set again.
toggle quoted messageShow quoted text
That's right: k_poll() is level triggered, it's not a queue. The number of wakeups is not guaranteed to be equal to the number of events signalled. The idea is that you go to sleep waiting on any of a number of events/things to occur and are woken up when
at least one of them does. But then you need to do the accounting yourself to figure out who was signaled (and how many times) and what should happen.
In this particular use case it sounds like you should be polling on a semaphore and not a signal. Semaphores do provide proper counting of how many give/take calls are made.
Andy
+ Andy, Daniel
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY.
2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above.
3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled"
field to zero.
4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events
that the thread has not handled so far (if any)?
|
|
Can you submit that as a github bug? I don't know that I see it, but if you're sure that it can happen it's something we need to fix.
Andy
toggle quoted messageShow quoted text
From: Venkatesh Sukumaran <venkatesh.sukumaran@...>
Sent: Monday, November 8, 2021 9:39 AM
To: Ross, Andrew J <andrew.j.ross@...>
Cc: Cufi, Carles <Carles.Cufi@...>; devel@... <devel@...>; Leung, Daniel <daniel.leung@...>
Subject: Re: [Zephyr-devel] Issue with k_poll
I understand that the number of wakeups is not equal to the number of events signalled but the problem here is that the pending thread can *miss* a signal. The "signalled" field is being reset to zero by the pending thread after servicing the
signal handler and at that time, the same signal could be set again.
That's right: k_poll() is level triggered, it's not a queue. The number of wakeups is not guaranteed to be equal to the number of events signalled. The idea is that you go to sleep waiting on any of a number of events/things to occur and are woken up when
at least one of them does. But then you need to do the accounting yourself to figure out who was signaled (and how many times) and what should happen.
In this particular use case it sounds like you should be polling on a semaphore and not a signal. Semaphores do provide proper counting of how many give/take calls are made.
Andy
+ Andy, Daniel
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY.
2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above.
3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled" field to zero.
4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events that the thread
has not handled so far (if any)?
|
|
toggle quoted messageShow quoted text
Can you submit that as a github bug? I don't know that I see it, but if you're sure that it can happen it's something we need to fix.
Andy
I understand that the number of wakeups is not equal to the number of events signalled but the problem here is that the pending thread can *miss* a signal. The "signalled" field is being reset to zero by the pending thread after servicing the
signal handler and at that time, the same signal could be set again.
That's right: k_poll() is level triggered, it's not a queue. The number of wakeups is not guaranteed to be equal to the number of events signalled. The idea is that you go to sleep waiting on any of a number of events/things to occur and are woken up when
at least one of them does. But then you need to do the accounting yourself to figure out who was signaled (and how many times) and what should happen.
In this particular use case it sounds like you should be polling on a semaphore and not a signal. Semaphores do provide proper counting of how many give/take calls are made.
Andy
+ Andy, Daniel
Anyone familiar with k_poll architecture here? Let me know if you need any other info.
Hi Zephyr developers,
I was looking at the k_poll mechanism in Zephyr and found that it can potentially miss events from the producer in SMP when the following happens:
1) Set up a bunch of events with K_POLL_TYPE_SIGNAL and K_POLL_MODE_NOTIFY_ONLY.
2) A producer thread calls k_poll_signal_raise() to set the "signalled" field to 1 for event X in the list above.
3) The destination/consumer thread is already in the middle of processing of events and finds the same event X "raised" (from a previous signal_raise() call maybe), runs the handler for the event to completion and sets the "signalled" field to zero.
4) Destination thread goes back to k_poll to pend on events.
Now depending how 2) and 3) are ordered - step 3) could set "signalled" back to zero and lose the update from the producer thread in step 2).
Am I missing something here? Shouldn't we take a snapshot of the events/signals when exiting k_poll in the critical section and pass that to the destination thread? Once, a thread calls k_poll again then "OR" the pending events with events that the thread
has not handled so far (if any)?
|
|