RFC: 5/5 Provide interfaces for Power Management Applications Policies
Thomas, Ramesh
Problem Statement:
Add OS infrastructure that enables application-based power management
policies. The infrastructure must be architecture independent, support
both the microkernel and the nanokernel, and expose an interface that
clearly identifies the policy-based action taken.
Why is this a problem:
-----------------------------
Currently the kernel has hooks that notify a power management
application of suspend (_sys_soc_suspend) and resume (_sys_soc_resume)
operations, but they operate only within the microkernel and only on
X86 architectures.
This leaves the interface inconsistent across architectures and makes it
hard to implement and extend a power management application on more than
one of them. It also forces the use of the microkernel, while some of
our supported platforms work with only a nanokernel (e.g. Quark D2000).
During Deep Sleep, devices lose power and state, and there is no
consistent method for a power management application (PMA) to make
devices save and restore their state. This breaks core functions of the
application that depend on devices being powered on, configured, and
ready to operate.
What should be done:
-----------------------------
The Zephyr kernel is not looking to implement power management policies.
Instead, the kernel shall provide interfaces for a system integrator to
utilize and create their own Power Management Application (PMA) that can
enforce any policies needed.
This will be accomplished by providing an OS power-management
infrastructure that has an architecture independent interface. The
Zephyr kernel will provide notification methods in both the Microkernel
and Nanokernel for the PMA when the OS is about to enter and exit system
idle. Currently the Zephyr kernel supports this notification only on
X86, through the _sys_soc_suspend() and _sys_soc_resume() functions.
The first steps would be to extend these functions to the ARM and ARC
architectures, to support both the Nanokernel and the Microkernel, and
to provide more detailed return codes. The kernel will continue to
determine how much time to spend idle (as a number of ticks) and will
pass this information along to the PMA.
When the kernel is about to go idle, the kernel itself will disable
interrupts. The kernel will then call the _sys_soc_suspend() function,
notifying the PMA of the operation. Included in the notification is
the maximum number of ticks that the system can be idle for. The
PMA will then determine what policies can be executed within the
allotted time frame.
Currently the kernel expects _sys_soc_suspend() to return one of the
following values:
- PM_NOT_HANDLED - The PMA did not put the CPU into a low power state
(LPS) or the SOC into Deep Sleep.
- PM_HANDLED - The PMA was able to put the CPU into LPS or the SOC into
Deep Sleep.
The proposal is to replace the _sys_soc_suspend() return values with
ones that clearly indicate which policy action the PMA took:
- PM_SOC_LPS - Indicating that the PMA was successful at pushing the CPU
into a low power state.
- PM_SOC_DS - Indicating that the PMA was successful at pushing the SOC
into the Deep Sleep state.
- PM_SOC_TICKLESS - Indicating that the PMA has accomplished any device
suspend operations. This does not include any CPU or SOC power
operations.
- PM_SOC_NOT_HANDLED - Indicating the PMA was not able to accomplish any
action in the time allotted by the kernel.
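As an illustration only, the four proposed return values could be written
as a C enumeration. The numeric values, the type name pm_soc_result, and
the helper kernel_must_idle() are assumptions made for this sketch and are
not part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>

/* Proposed _sys_soc_suspend() return values. The RFC only names the
 * codes; the numeric values chosen here are assumptions.
 */
enum pm_soc_result {
	PM_SOC_NOT_HANDLED = 0, /* PMA took no action in the allotted time */
	PM_SOC_TICKLESS,        /* devices suspended, no CPU/SOC power change */
	PM_SOC_LPS,             /* CPU was put into a low power state */
	PM_SOC_DS,              /* SOC was put into Deep Sleep */
};

/* Hypothetical kernel-side helper: returns true when the kernel still
 * has to run its own idle wait after _sys_soc_suspend() returns.
 */
static bool kernel_must_idle(enum pm_soc_result result)
{
	switch (result) {
	case PM_SOC_LPS:
	case PM_SOC_DS:
		/* The PMA already idled the CPU or the SOC. */
		return false;
	case PM_SOC_TICKLESS:
	case PM_SOC_NOT_HANDLED:
		/* No CPU/SOC power operation happened, so the kernel
		 * falls back to its own idle loop.
		 */
		return true;
	default:
		/* Unknown code: treated as a PMA bug in this sketch. */
		assert(false);
		return true;
	}
}
```

With distinct codes the kernel can tell "devices suspended but CPU still
running" (PM_SOC_TICKLESS) apart from "nothing done" (PM_SOC_NOT_HANDLED),
which the old PM_HANDLED / PM_NOT_HANDLED pair could not express.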
As policy decisions are the realm of the PMA, the kernel will now
provide a method for the PMA to get the current list of devices enabled
for the application. This will allow the PMA to walk the device list
and determine any policy decisions based upon the available tick count,
call the device’s suspend() routine, and deal with any possible failures
in the process.
All of these operations have an expressed latency requirement of NONE.
Writing a PMA:
---------------------
A Power Management Application that enforces policies will implement
the API described below.
Upon startup, the PMA will already be registered as the handler for
_sys_soc_suspend() and _sys_soc_resume() through compile time linking.
The first act of the PMA will be to retrieve the known list of devices
through the device_list_get() function. Because the PMA is part of the
application, it is expected to start after all system devices have been
initialized. Thus the list of devices is not expected to change once
the application has begun.
The device_list_get() function will return the start and end pointers to
the currently enabled devices. It is up to the PMA to walk this list and
to determine the best mechanism for storing and processing it. It is
up to the system integrator to verify the amount of time each device
requires for a power cycle, and to ensure this time fits within the
time allotted by the kernel. This time value is highly dependent upon
each specific device used in the final platform and SOC.
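A minimal sketch of how a PMA might walk the device list at startup is
shown below. The device_list_get() signature, the layout of struct device,
the convention that "end" points one past the last device, and the mock
device table are all assumptions; the RFC only states that start and end
pointers are returned and that each device has a suspend() routine.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's device structure. The field names
 * are hypothetical; only the suspend() routine is named by the RFC.
 */
struct device {
	const char *name;
	int (*suspend)(struct device *dev); /* 0 on success (assumed) */
};

static int suspend_ok(struct device *dev)
{
	(void)dev;
	return 0;
}

/* Mock device table standing in for the kernel's list of enabled
 * devices.
 */
static struct device mock_devices[] = {
	{ "uart0", suspend_ok },
	{ "spi0", suspend_ok },
	{ "gpio0", suspend_ok },
};

/* Assumed signature; "end" points one past the last device. */
static void device_list_get(struct device **start, struct device **end)
{
	*start = mock_devices;
	*end = mock_devices + sizeof(mock_devices) / sizeof(mock_devices[0]);
}

/* Walk the list once at PMA startup and count the devices that
 * provide a suspend() routine.
 */
static size_t pma_count_suspendable(void)
{
	struct device *start;
	struct device *end;
	size_t count = 0;

	device_list_get(&start, &end);
	assert(start <= end);

	for (struct device *dev = start; dev != end; dev++) {
		if (dev->suspend != NULL) {
			count++;
		}
	}
	return count;
}
```

Because the list is not expected to change after the application starts,
a PMA could run such a walk once and cache whatever per-device data its
policies need.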
When entering the PMA through the _sys_soc_suspend() function, the PMA
can select between multiple scenarios.
Case 1:
*The time allotted is too short for any power management.*
In this case, the PMA will leave interrupts disabled and return the
code PM_SOC_NOT_HANDLED. This will allow the Zephyr kernel to continue
on the idle loop selected at compile time.
Case 2:
*The time allotted is enough for some devices to be suspended.*
a) If everything suspends correctly, the PMA will:
   1) Scan through the devices that meet the criteria
   2) Call each device’s suspend() function
   3) If the time allotted is enough to put the CPU into a LPS the PMA
      will:
      i) Push the CPU to the LPS re-enabling interrupts at the same
         time.
      ii) Return PM_SOC_LPS
   4) If the time allotted is not enough for CPU or SOC operations, the
      PMA will:
      i) Return PM_SOC_TICKLESS
b) If a device fails to suspend, the PMA will:
   1) If the device is not essential to the suspend process, as
      determined by the system integrator, the PMA can choose to ignore
      the failure
   2) If the device is essential to the suspend process, as determined
      by the system integrator, the PMA shall return PM_SOC_NOT_HANDLED.
Case 3:
*The time allotted is enough for all devices to be suspended.*
a) If everything suspends correctly, the PMA will:
   1) Call each device’s suspend() function
   2) If the time allotted is enough to put the CPU into a LPS the PMA
      will:
      i) Push the CPU to the LPS re-enabling interrupts at the same
         time.
      ii) Return PM_SOC_LPS
   3) If the time allotted is enough to put the SOC into Deep Sleep, the
      PMA will:
      i) Push the SOC to Deep Sleep
      ii) Return PM_SOC_DS
      iii) Re-enable interrupts
   4) If the time allotted is not enough for CPU or SOC operations, the
      PMA will:
      i) Return PM_SOC_TICKLESS
b) If a device fails to suspend, the PMA will:
   1) If the device is not essential to the suspend process, as
      determined by the system integrator, the PMA can choose to ignore
      the failure
   2) If the device is essential to the suspend process, as determined
      by the system integrator, the PMA shall return PM_SOC_NOT_HANDLED.
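The three cases above can be sketched as a single policy function. This is
a simplified illustration, not a reference implementation: the tick
thresholds, struct pma_device, the example device table, the stubbed
cpu_enter_lps() and soc_enter_deep_sleep() hooks, and the int32_t
parameter type are all assumptions. The "criteria" of Case 2 are modelled
here as a per-device minimum idle time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum pm_soc_result {
	PM_SOC_NOT_HANDLED,
	PM_SOC_TICKLESS,
	PM_SOC_LPS,
	PM_SOC_DS,
};

/* Hypothetical thresholds in ticks; real values are platform specific. */
#define PMA_MIN_TICKS 2
#define PMA_LPS_TICKS 10
#define PMA_DS_TICKS  100

/* Hypothetical per-device policy record kept by the PMA. */
struct pma_device {
	int32_t min_ticks;    /* idle time that justifies suspending it */
	bool essential;       /* a failed suspend aborts the policy */
	int (*suspend)(void); /* returns 0 on success (assumed) */
};

static int suspend_ok(void) { return 0; }
static int suspend_fail(void) { return -1; }

/* Example table; a real PMA would build it from device_list_get(). */
static struct pma_device pma_devices[] = {
	{ 2, false, suspend_ok },
	{ 5, true, suspend_ok },
	{ 50, false, suspend_ok },
};
#define PMA_NUM_DEVICES (sizeof(pma_devices) / sizeof(pma_devices[0]))

/* Stubs for the platform-specific power operations. */
static void cpu_enter_lps(void) { /* would also re-enable interrupts */ }
static void soc_enter_deep_sleep(void) { }

enum pm_soc_result _sys_soc_suspend(int32_t ticks)
{
	bool all_suspended = true;

	/* Case 1: too short for any power management. */
	if (ticks < PMA_MIN_TICKS) {
		return PM_SOC_NOT_HANDLED;
	}

	/* Cases 2 and 3: suspend every device that meets the criteria. */
	for (size_t i = 0; i < PMA_NUM_DEVICES; i++) {
		struct pma_device *dev = &pma_devices[i];

		assert(dev->suspend != NULL);
		if (ticks < dev->min_ticks) {
			all_suspended = false; /* skipped: Case 2 */
			continue;
		}
		if (dev->suspend() != 0) {
			if (dev->essential) {
				return PM_SOC_NOT_HANDLED;
			}
			all_suspended = false; /* non-essential: ignore */
		}
	}

	/* Deep Sleep only when every device was suspended (Case 3). */
	if (all_suspended && ticks >= PMA_DS_TICKS) {
		soc_enter_deep_sleep();
		return PM_SOC_DS;
	}
	if (ticks >= PMA_LPS_TICKS) {
		cpu_enter_lps();
		return PM_SOC_LPS;
	}
	return PM_SOC_TICKLESS;
}
```

With this example table, 3 ticks yields PM_SOC_TICKLESS, 20 ticks yields
PM_SOC_LPS, and 200 ticks yields PM_SOC_DS. Pointing the essential
device's suspend hook at suspend_fail makes a 20-tick request return
PM_SOC_NOT_HANDLED.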
PMA: Power Management Application
ISR: Interrupt Service Routine
Proposed PMA Tickless Idle Entry:
+---------+   +-----+     +---------+                   +-----+
| Events  |   | ISR |     | Kernel  |                   | PMA |
+---------+   +-----+     +---------+                   +-----+
     |           |             | ----------\               |
     |           |             |-| Compute |               |
     |           |             | | idle    |               |
     |           |             | | ticks   |               |
     |           |             | |---------|               |
     |           |             | -----------\              |
     |           |             |-| Schedule |              |
     |           |             | | next     |              |
     |           |             | | event    |              |
     |           |             | |----------|              |
     |           |             |                           |
     |           |             | _sys_soc_suspend(ticks)   |
     |           |             |-------------------------->|
     |           |             |              -----------\ |
     |           |             |              | Select   |-|
     |           |             |              | policy   | |
     |           |             |              | based on | |
     |           |             |              | ticks    | |
     |           |             |              |----------| |
     |           |             |              -----------\ |
     |           |             |              | Execute  |-|
     |           |             |              | tickless | |
     |           |             |              | policy   | |
     |           |             |              |----------| |
     |           |             |                           |
     |           |             | PM_SOC_TICKLESS           |
     |           |             |<--------------------------|
     |           |             |                           |
     |           |             | CPU                       |
     |           |             | idle wait                 |
     |           |             |----------                 |
     |           |             |         |                 |
     |           |             |<---------                 |
     |           |             |                           |
Proposed PMA Tickless Idle Exit:
+---------+ +-----+                +---------+              +-----+
| Events  | | ISR |                | Kernel  |              | PMA |
+---------+ +-----+                +---------+              +-----+
     |         |                        |                      |
     | intr    |                        |                      |
     |-------->|                        |                      |
     |         |                        |                      |
     |         | process interrupt      |                      |
     |         |----------------------->|                      |
     |         |                        |                      |
     |         |                        | _sys_soc_resume()    |
     |         |                        |--------------------->|
     |         |                        |         -----------\ |
     |         |                        |         | Execute  |-|
     |         |                        |         | tickless | |
     |         |                        |         | exit     | |
     |         |                        |         | policy   | |
     |         |                        |         |----------| |
     |         |                        |                      |
     |         |                        | return to kernel     |
     |         |                        |<---------------------|
     |         |                        |                      |
     |         | return to ISR          |                      |
     |         |<-----------------------|                      |
     |         | -------------\         |                      |
     |         |-| re-compute |         |                      |
     |         | | Tickless   |         |                      |
     |         | | timeouts   |         |                      |
     |         | |------------|         |                      |
     |         | -----------\           |                      |
     |         |-| schedule |           |                      |
     |         | | next     |           |                      |
     |         | | task     |           |                      |
     |         | |----------|           |                      |
     |         |                        |                      |
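One possible shape for the "tickless exit policy" step in the exit
sequence is sketched below: the PMA records each device it suspends and
resumes them in reverse order when _sys_soc_resume() is called. The RFC
does not specify this behaviour; the per-device resume hooks, the
bookkeeping array, pma_note_suspended(), and the reverse-order choice are
all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define PMA_MAX_DEVICES 8

/* Hypothetical per-device resume hook type. */
typedef void (*pma_resume_fn)(void);

/* Resume hooks of the devices suspended on idle entry, in the order
 * they were suspended.
 */
static pma_resume_fn suspended[PMA_MAX_DEVICES];
static size_t num_suspended;

/* Small log so the resume order can be observed in this sketch. */
static int resume_log[PMA_MAX_DEVICES];
static size_t resume_log_len;

static void resume_uart(void) { resume_log[resume_log_len++] = 1; }
static void resume_spi(void) { resume_log[resume_log_len++] = 2; }

/* Called by the suspend path for each device it suspends. */
static void pma_note_suspended(pma_resume_fn resume)
{
	assert(resume != NULL);
	assert(num_suspended < PMA_MAX_DEVICES);
	suspended[num_suspended++] = resume;
}

/* Tickless exit policy: resume devices in reverse order of
 * suspension, leaving the bookkeeping empty for the next idle entry.
 */
void _sys_soc_resume(void)
{
	while (num_suspended > 0) {
		num_suspended--;
		suspended[num_suspended]();
	}
}
```

Resuming in reverse order is a common convention when later devices
depend on earlier ones (for example a sensor behind an SPI bus), but the
system integrator is free to choose a different exit policy.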