Re: [RFC] Power Management Infrastructure

Thomas, Ramesh

On 7/13/2016 11:40 PM, Rahamim, Hezi wrote:
Hi Dimitriy,

The get_state is there only for symmetry and good practice.
As mentioned below, the power manager service will probably not use it, as it is not efficient to call get_state on every device just to learn all the device states.
The more important part of the RFC is adding the set_state function and the device policies.
That made me think why we originally came up with 2 functions when one
was enough. Probably we thought the same way - to keep symmetry. Problem
is that we will keep getting more needs and we will either add more
functions to device_pm_ops or will have to change existing ones.

How about having one function that can be used for all possible device
PM purposes using a control code? Something like the following:

int device_pm_control(device, flags);


DEVICE_POWER_STATE = {device PM states}
SYSTEM_POWER_STATE = {system power policies}

(We can add additional parameters if flags param is overloaded)

returns value based on CONTROL_CODE
e.g. returns device power state if CONTROL_CODE == GET_POWER_STATE

(We probably don't need device_pm_ops if we have only one function.)
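
A minimal sketch of how such a single entry point might look, assuming the control code and the device power state are packed into the flags word. The macro names, the bit layout, the state values and struct pm_device are all hypothetical and are not part of the RFC or of Zephyr:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control codes, kept in the low byte of flags. */
#define PM_CTRL_CODE_MASK        0x00ffu
#define PM_CTRL_SET_POWER_STATE  0x0001u
#define PM_CTRL_GET_POWER_STATE  0x0002u

/* Hypothetical device power state field, kept in the second byte of flags. */
#define PM_DEV_STATE_SHIFT       8
#define PM_DEV_STATE_MASK        0xff00u

enum pm_dev_state { PM_DEV_ACTIVE = 0, PM_DEV_SUSPEND = 1, PM_DEV_OFF = 2 };

/* Stand-in for a device object; only the PM state is modelled here. */
struct pm_device {
    const char *name;
    enum pm_dev_state state;
};

/* Pack a control code and a device power state into one flags word. */
uint32_t pm_flags(uint32_t code, enum pm_dev_state state)
{
    return code | ((uint32_t)state << PM_DEV_STATE_SHIFT);
}

/*
 * Single PM entry point. The meaning of the return value depends on the
 * control code: 0 on a successful set, the current state on a get,
 * -1 on an unknown control code or an out-of-range state.
 */
int device_pm_control(struct pm_device *dev, uint32_t flags)
{
    uint32_t state = (flags & PM_DEV_STATE_MASK) >> PM_DEV_STATE_SHIFT;

    assert(dev != NULL);

    switch (flags & PM_CTRL_CODE_MASK) {
    case PM_CTRL_SET_POWER_STATE:
        if (state > PM_DEV_OFF) {
            return -1;
        }
        dev->state = (enum pm_dev_state)state;
        return 0;
    case PM_CTRL_GET_POWER_STATE:
        return (int)dev->state;
    default:
        return -1;
    }
}
```

New control codes could then be added without changing the function signature, which is the point of the suggestion. The cost is that the return value has to be interpreted per control code.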

***I also have some questions inline below***

Thanks for the comment,

-----Original Message-----
From: Dmitriy Korovkin [mailto:dmitriy.korovkin(a)]
Sent: Thursday, July 14, 2016 00:41
To: devel(a)
Subject: [devel] Re: Re: Re: [RFC] Power Management Infrastructure

Hi Hezi,
I think the RFC needs to be extended with a description of the idea of controlling the power state of each device (if I got you correctly).
Without it, the need for
int (*get_state)(struct device *device, int device_pm_policy); looks very unclear.
If all you need is to provide more than two states, then set_state() looks quite enough.


Dmitriy Korovkin

On 16-07-13 12:11 PM, Rahamim, Hezi wrote:
Hi Ramesh,

Please see my comments below.

Thanks for the comments,

-----Original Message-----
From: Thomas, Ramesh [mailto:ramesh.thomas(a)]
Sent: Wednesday, July 13, 2016 09:41
To: devel(a)
Subject: [devel] Re: [RFC] Power Management Infrastructure

On 7/12/2016 2:03 AM, Rahamim, Hezi wrote:
Hi all,

Current state


In the current Zephyr implementation the driver power hooks
distinguish only

between two driver states (suspend and resume). Drivers may have more
than two
Currently, suspend and resume are not actually states but notifications of a state transition. A second argument to those functions specifies the current policy for which the transition is happening.

states (i.e. D-states) and can traverse between those states. The
state change

today is limited only from active to suspend while there can be cases
of other

transitions requested by the application.

Please look at the below suggested device power states E.g.
transition between


Moreover, the current device state cannot be queried by an
application or

a Power Manager service.

Below are the current Zephyr PM hooks:

struct device_pm_ops {
    int (*suspend)(struct device *device, int pm_policy);
    int (*resume)(struct device *device, int pm_policy);
};


Proposed changes


First proposed change is to have a set state and get state driver

instead of the suspend resume functions:

struct device_pm_ops {

int (*set_state)(struct device *device, int

int (*get_state)(struct device *device, int


The set_state function will behave according to the state transition
of a specific

driver. E.g. transition from active state to suspend state in a UART
device will

save device states and gate the clock.
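
The UART transition described above can be sketched roughly as follows. This is only an illustration under assumptions: the register block is modelled as a plain struct rather than real MMIO, and uart_hw, uart_dev, uart_set_state, uart_get_state and the DEV_PM_* values are hypothetical names, not the RFC's or Zephyr's API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device power states for this sketch. */
enum dev_pm_state { DEV_PM_ACTIVE, DEV_PM_SUSPEND };

/* Stand-in for the UART hardware; a real driver would use MMIO registers. */
struct uart_hw {
    uint32_t baud_div;
    uint32_t line_ctrl;
    bool clock_enabled;
};

/* Driver instance: hardware model, saved context and current PM state. */
struct uart_dev {
    struct uart_hw hw;
    uint32_t saved_baud_div;
    uint32_t saved_line_ctrl;
    enum dev_pm_state state;
};

/* Move the device to the requested state; returns 0 on success, -1 otherwise. */
int uart_set_state(struct uart_dev *dev, enum dev_pm_state next)
{
    assert(dev != NULL);

    if (next == dev->state) {
        return 0; /* nothing to do */
    }

    switch (next) {
    case DEV_PM_SUSPEND:
        /* Save the context the hardware may lose, then gate the clock. */
        dev->saved_baud_div = dev->hw.baud_div;
        dev->saved_line_ctrl = dev->hw.line_ctrl;
        dev->hw.clock_enabled = false;
        break;
    case DEV_PM_ACTIVE:
        /* Ungate the clock first, then restore the saved context. */
        dev->hw.clock_enabled = true;
        dev->hw.baud_div = dev->saved_baud_div;
        dev->hw.line_ctrl = dev->saved_line_ctrl;
        break;
    default:
        return -1;
    }

    dev->state = next;
    return 0;
}

/* Report the state last set through uart_set_state(). */
int uart_get_state(const struct uart_dev *dev)
{
    assert(dev != NULL);
    return (int)dev->state;
}
```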
The proposal, as I understand, is to use the pm hooks to actively
control the power states instead of using them as just notifications
of the SOC's power transitions. Considering this, we had one power
policy called "device_suspend_only". That is open to be broken down
into more device specific power states.

[HR] You are correct, we intend to use the pm driver hooks to actively
control the device power states. We will use Zephyr's current
power policies to indicate the system power state. As you mentioned,
when devices are not in the active state the system can still be in the "device_suspend_only" state.
Do you see any issues with the apps/drivers keeping the PM service
updated of the device's current state in real time? What about race
conditions? Complexity of communication framework?

The get_state function will enable the Power Manager service to know
the state

of each driver thus enable it to configure the SoC power behavior.
The set_state function looks ok. It is the same as the current suspend
but with the name change. The purpose of the name change is to add a
corresponding get_state. The RFC does not give much detail on the
use of the get_state function.

I assume there is a need for the PM service to build a device tree,
with power hierarchy. It would be helpful if you could explain how
this function fits in your larger design of the PM service's power
policy management of the devices.

[HR] I will give an example:
A user application decides to suspend the I2C and the SPI devices. The
application will then call the corresponding set_state functions of these devices.
The set_state functions will store the relevant device
state and gate the device clock. In the next idle time, _sys_soc_suspend will be called.
This will trigger the power manager service, which will decide what
should be done to optimize the power (clock gate a branch or change the system power state).
The decision of the power manager service will be based on the devices'
states, which can also be obtained by using the get_state functions.

Since the PM service is expected to have communication established
with all components in the system, wouldn't it know what state each
device is set to? Querying each device and building a tree every time
there is an opportunity to suspend, may take some time causing delay in suspend.

[HR] You are correct, using the get_state function will lead to a less
optimal Power manager service and it will need to use a more optimized method.
However, it is a good practice to have this function as the
application may want to query the device state.
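
One possible "more optimized method" is for the PM service to cache each device's state when it changes, so the idle path never has to query the drivers. The sketch below is an assumption, not something proposed in this thread: pm_registry and its functions are hypothetical names, and locking against concurrent callers is deliberately left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device power states for this sketch. */
enum dev_pm_state { DEV_PM_ACTIVE, DEV_PM_SUSPEND, DEV_PM_STATE_COUNT };

#define PM_MAX_DEVICES 8

/* Cached view of all device states, plus a running count per state. */
struct pm_registry {
    enum dev_pm_state state[PM_MAX_DEVICES];
    size_t count[DEV_PM_STATE_COUNT];
    size_t num_devices;
};

/* Start with every registered device in the active state. */
void pm_registry_init(struct pm_registry *r, size_t num_devices)
{
    assert(r != NULL && num_devices <= PM_MAX_DEVICES);

    r->num_devices = num_devices;
    for (size_t s = 0; s < DEV_PM_STATE_COUNT; s++) {
        r->count[s] = 0;
    }
    for (size_t i = 0; i < num_devices; i++) {
        r->state[i] = DEV_PM_ACTIVE;
    }
    r->count[DEV_PM_ACTIVE] = num_devices;
}

/* Called whenever a device changes state; returns -1 on bad arguments. */
int pm_registry_notify(struct pm_registry *r, size_t dev_id,
                       enum dev_pm_state new_state)
{
    assert(r != NULL);

    if (dev_id >= r->num_devices ||
        (unsigned int)new_state >= DEV_PM_STATE_COUNT) {
        return -1;
    }
    r->count[r->state[dev_id]]--;
    r->state[dev_id] = new_state;
    r->count[new_state]++;
    return 0;
}

/* Constant-time check the idle path can use instead of polling drivers. */
bool pm_registry_all_suspended(const struct pm_registry *r)
{
    assert(r != NULL);
    return r->count[DEV_PM_SUSPEND] == r->num_devices;
}
```

With a cache like this, get_state remains available to applications, while the suspend decision costs a single comparison.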

Second proposed change is to add the below device power states:

Note: Many devices do not have all four power states defined.



Normal operation of the device. All device context is retained.



Device context is preserved by the HW and need not be restored by the driver.

The device does not allow the Power Manager service to power it down.



Most device context is lost by the hardware.

Device drivers must save and restore or reinitialize any context lost

by the hardware.

The device can be powered down.

The device allows a wake signal to return it to the active state.



Power has been fully removed from the device. The device context is lost
when this state is entered, so the OS software will reinitialize the device
when powering it back on.

The device may not wake itself, as the SoC needs to reinitialize the device.
The description of the power states here sounds like they are
notifications. It sounds like some other component is setting the
power state and notifies using these values and the drivers do
save/restore or other operations based on the notification. Are the
drivers expected to gate clocks, turn off peripherals etc. in these notifications?

[HR] These device power states serve two purposes:
1. The drivers are expected to perform all the power/clock changes they
can for the selected power state, as long as those changes do not influence
other drivers.
2. The power manager service will use the drivers' states to decide which
system power policy to go to (it can also stay in
SYS_PM_DEVICE_SUSPEND_ONLY but optimize the clock gating scheme).
Would these become part of a specification that all device drivers would
need to implement? In this scheme, the PM responsibilities are shared
between PM Service, various apps and drivers. So some plan needs to be
in place to ensure all of them cooperate as expected.

The initial part of the RFC does mention the application can set the
power state of the device and that is what the proposed set_state
function also suggests.

Do they serve both purposes? Maybe an example of how the device's
power state is actively changed, and who does it and when, with respect
to these notifications, would help.

[HR] Here is an example:
There are three peripherals in a certain SOC: UART, I2C and SPI.
Both I2C and SPI are fed from the same PLL and the UART from a second one.
At the beginning the three peripherals are at DEVICE_PM_ACTIVE_STATE.
The user application decides that the I2C and the SPI should go to suspend.
It then calls the set_state function of these devices with DEVICE_PM_SUSPEND_STATE.
When idle comes, the PM service is called and sees that it can close the SPI and I2C PLL.
However, it cannot move to SYS_PM_DEEP_SLEEP as the UART is still active.
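
The decision in this example can be sketched as below. The state and policy names are taken from this thread, but their numeric values, the PLL_A/PLL_B identifiers, struct periph and pm_service_on_idle are assumptions made for illustration only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Names from the thread; the numeric values are assumed. */
enum { DEVICE_PM_ACTIVE_STATE, DEVICE_PM_SUSPEND_STATE };
enum { SYS_PM_DEVICE_SUSPEND_ONLY, SYS_PM_DEEP_SLEEP };

/* Hypothetical clock sources: PLL_A feeds I2C and SPI, PLL_B feeds the UART. */
enum { PLL_A, PLL_B, PLL_COUNT };

struct periph {
    const char *name;
    int pll;   /* PLL feeding this peripheral */
    int state; /* current device power state */
};

/*
 * Called at idle. Marks each PLL as gateable when no active device depends
 * on it, and returns the deepest system policy the device states allow.
 */
int pm_service_on_idle(const struct periph *devs, size_t n,
                       bool pll_can_gate[PLL_COUNT])
{
    bool any_active = false;

    assert(devs != NULL && pll_can_gate != NULL);

    for (int p = 0; p < PLL_COUNT; p++) {
        pll_can_gate[p] = true;
    }
    for (size_t i = 0; i < n; i++) {
        assert(devs[i].pll >= 0 && devs[i].pll < PLL_COUNT);
        if (devs[i].state == DEVICE_PM_ACTIVE_STATE) {
            pll_can_gate[devs[i].pll] = false;
            any_active = true;
        }
    }
    return any_active ? SYS_PM_DEVICE_SUSPEND_ONLY : SYS_PM_DEEP_SLEEP;
}
```

With the UART active and I2C and SPI suspended, this reports that only the I2C/SPI PLL can be gated and that the system stays in SYS_PM_DEVICE_SUSPEND_ONLY.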
Will the PM service also put devices to suspend state, or only the apps
will do it? Looks like the PM Service will never enter Deep Sleep if any
device is on - any exceptions?

In the above example, the system had to go to idle for the PLL to get
turned off. If you had a central scheme to turn off clocks then the PLL
could have been turned off when both i2c and spi got turned off. Just an

Comments/concerns welcome.



A member of the Intel Corporation group of companies

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
