Re: [RFC] Power Management Infrastructure

Dmitriy Korovkin

Hi Hezi,
I think the RFC needs to be extended with a description of the idea of
controlling the power state of each device (if I understood you correctly).
Without it, the need for
int (*get_state)(struct device *device, int device_pm_policy);
looks very unclear.
If all you need is to provide more than two states, then set_state()
looks like enough.


Dmitriy Korovkin

On 16-07-13 12:11 PM, Rahamim, Hezi wrote:
Hi Ramesh,

Please see my comments below.

Thanks for the comments,

-----Original Message-----
From: Thomas, Ramesh [mailto:ramesh.thomas(a)]
Sent: Wednesday, July 13, 2016 09:41
To: devel(a)
Subject: [devel] Re: [RFC] Power Management Infrastructure

On 7/12/2016 2:03 AM, Rahamim, Hezi wrote:
Hi all,

Current state


In the current Zephyr implementation the driver power hooks
distinguish only

between two driver states (suspend and resume). Drivers may have more
than two
Currently, suspend and resume are not actually states but notifications of a state transition. There is a second argument to those functions that specifies the current policy for which the transition is happening.

states (i.e. D-states) and can traverse between those states. The state

today is limited only from active to suspend while there can be cases of

transitions requested by the application.

Please look at the below suggested device power states E.g. transition


Moreover, the current device state cannot be queried by an application or

a Power Manager service.

Below are the current Zephyr PM hooks:

struct device_pm_ops {
        int (*suspend)(struct device *device, int pm_policy);
        int (*resume)(struct device *device, int pm_policy);
};


Proposed changes


The first proposed change is to have set_state and get_state driver functions
instead of the suspend and resume functions:

struct device_pm_ops {
        int (*set_state)(struct device *device, int device_pm_policy);
        int (*get_state)(struct device *device, int device_pm_policy);
};


The set_state function will behave according to the state transition of
a specific driver. For example, a transition from the active state to the
suspend state in a UART device will save the device state and gate the clock.
The proposal, as I understand, is to use the pm hooks to actively
control the power states instead of using them as just notifications of
the SOC's power transitions. Considering this, we had one power policy
called "device_suspend_only". That is open to be broken down into more
device specific power states.

[HR] You are correct, we intend to use the PM driver hooks to actively control
the device power states. We will use Zephyr's current power policies to
indicate the system power state. As you mentioned, when devices are not
in the active state, the system can still be in the "device_suspend_only" state.
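
To make the UART case above concrete, here is a minimal sketch of a driver
implementing the proposed hooks. It is not Zephyr code: the uart_data
structure, its fields, and the numeric state values are assumptions made
for illustration, and the hardware register is simulated by a plain
variable. Only the hook signatures and the two state names come from this
thread. How get_state should use its second argument is not specified in
the RFC, so the sketch ignores it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State names as used in this thread; the numeric values are assumed. */
enum {
	DEVICE_PM_ACTIVE_STATE,
	DEVICE_PM_SUSPEND_STATE,
};

/* Reduced stand-in for the real device structure. */
struct device {
	void *driver_data;
};

/* Proposed hooks, with the signatures quoted in the RFC. */
struct device_pm_ops {
	int (*set_state)(struct device *device, int device_pm_policy);
	int (*get_state)(struct device *device, int device_pm_policy);
};

/* Hypothetical UART driver data; baud_reg simulates a hardware register. */
struct uart_data {
	uint32_t baud_reg;
	uint32_t saved_baud;
	int clock_enabled;
	int state;
};

int uart_set_state(struct device *device, int device_pm_policy)
{
	struct uart_data *d = device->driver_data;

	assert(d != NULL);
	if (device_pm_policy == d->state) {
		return 0; /* already in the requested state */
	}

	switch (device_pm_policy) {
	case DEVICE_PM_SUSPEND_STATE:
		d->saved_baud = d->baud_reg; /* save device context */
		d->baud_reg = 0;             /* simulate context loss */
		d->clock_enabled = 0;        /* gate the clock */
		break;
	case DEVICE_PM_ACTIVE_STATE:
		d->clock_enabled = 1;        /* ungate the clock first */
		d->baud_reg = d->saved_baud; /* restore device context */
		break;
	default:
		return -1; /* state not supported by this device */
	}

	d->state = device_pm_policy;
	return 0;
}

int uart_get_state(struct device *device, int device_pm_policy)
{
	struct uart_data *d = device->driver_data;

	(void)device_pm_policy; /* meaning not defined in the RFC; unused */
	return d->state;
}

const struct device_pm_ops uart_pm_ops = {
	.set_state = uart_set_state,
	.get_state = uart_get_state,
};
```

An application would call uart_pm_ops.set_state(&dev, DEVICE_PM_SUSPEND_STATE)
to suspend the device, and the same hook with DEVICE_PM_ACTIVE_STATE to bring
it back with its saved context restored.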

The get_state function will enable the Power Manager service to know the
state of each driver, thus enabling it to configure the SoC power behavior.
The set_state function looks ok. It is the same as the current suspend
but with the name change. The purpose of the name change is to add a
corresponding get_state. The RFC does not give much detail on the use
of the get_state function.

I assume there is a need for the PM service to build a device tree, with
power hierarchy. It would be helpful if you could explain how this
function fits in your larger design of the PM service's power policy
management of the devices.

[HR] I will give an example:
A user application decides to suspend the I2C and the SPI devices. The application
will then call the corresponding set_state functions of these devices.
The set_state functions will store the relevant device state and
gate the device clock. In the next idle time, _sys_soc_suspend will be called.
This will trigger the power manager service, which will decide what should be done
to optimize the power (clock gate a branch or change the system power state).
The decision of the power manager service will be based on the device states,
which can also be obtained by using the get_state functions.

Since the PM service is expected to have communication established with
all components in the system, wouldn't it know what state each device is
set to? Querying each device and building a tree every time there is an
opportunity to suspend, may take some time causing delay in suspend.

[HR] You are correct, using the get_state function will lead to a less optimal
Power manager service and it will need to use a more optimized method.
However, it is a good practice to have this function as the application
may want to query the device state.
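
One possible shape for the "more optimized method" mentioned above is for
the PM service to cache device states as they change, so the idle path
does not have to query every driver. The sketch below is only an
illustration of that idea: pm_note_state(), pm_device_is_active(),
pm_all_devices_idle(), the device index scheme, and the 32-device limit
are all hypothetical and are not part of the RFC or of Zephyr.

```c
#include <assert.h>
#include <stdint.h>

/* State names as used in this thread; the numeric values are assumed. */
enum {
	DEVICE_PM_ACTIVE_STATE,
	DEVICE_PM_SUSPEND_STATE,
};

#define PM_MAX_DEVICES 32U /* hypothetical limit: one bit per device */

/* Bit n is set while device n is in the active state. */
static uint32_t pm_active_mask;

/*
 * Hypothetical hook: called whenever a device's set_state succeeds, so
 * the PM service never needs to poll the drivers.
 */
void pm_note_state(unsigned int dev_id, int state)
{
	assert(dev_id < PM_MAX_DEVICES);

	if (state == DEVICE_PM_ACTIVE_STATE) {
		pm_active_mask |= UINT32_C(1) << dev_id;
	} else {
		pm_active_mask &= ~(UINT32_C(1) << dev_id);
	}
}

/* Constant-time query for a single device. */
int pm_device_is_active(unsigned int dev_id)
{
	assert(dev_id < PM_MAX_DEVICES);
	return (pm_active_mask >> dev_id) & 1U;
}

/* Constant-time check usable from the idle path. */
int pm_all_devices_idle(void)
{
	return pm_active_mask == 0;
}
```

With this arrangement get_state stays available to applications, while the
suspend decision costs a single comparison regardless of the device count.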

Second proposed change is to add the below device power states:

Note: Many devices do not have all four power states defined.



Normal operation of the device. All device context is retained.



Device context is preserved by the HW and need not be restored by the

The device does not allow the Power Manager service to power it down.



Most device context is lost by the hardware.

Device drivers must save and restore or reinitialize any context lost

by the hardware.

The device can be powered down.

The device allows a wake signal to return it to the active state.



Power has been fully removed from the device. The device context is lost

when this state is entered, so the OS software will reinitialize the device

when powering it back on.

The device may not wake itself, as the SoC needs to reinitialize the device.
The description of the power states here sounds like they are
notifications. It sounds like some other component is setting the power
state and notifies using these values and the drivers do save/restore or
other operations based on the notification. Are the drivers expected to
gate clocks, turn off peripherals etc. in these notifications?

[HR] These device power states serve two purposes:
1. Each driver is expected to perform all the power/clock changes
it can for the selected power state, as long as they do not influence
other drivers.
2. The power manager service will use the driver states to decide which
system power policy to go to (it can also stay in SYS_PM_DEVICE_SUSPEND_ONLY
but optimize the clock gating scheme).

The initial part of the RFC does mention the application can set the
power state of the device and that is what the proposed set_state
function also suggests.

Do they serve both purposes? Maybe an example of how the device's
power state is actively changed, and of who does it and when with respect
to these notifications, would help.

[HR] Here is an example:
There are three peripherals in a certain SOC: UART, I2C and SPI.
Both I2C and SPI are fed from the same PLL and the UART from a second one.
At the beginning the three peripherals are at DEVICE_PM_ACTIVE_STATE.
The user application decides that the I2C and the SPI should go to suspend.
It then calls the set_state function of these devices with DEVICE_PM_SUSPEND_STATE.
When idle comes, the PM service is called and sees that it can turn off the PLL shared by the SPI and I2C.
However, it cannot move to SYS_PM_DEEP_SLEEP as the UART is still active.
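
The decision in this example can be sketched as follows. This is a
hedged illustration, not the PM service design: the pm_dev table, the
PLL identifiers, and the two helper functions are hypothetical, and the
rule "deep sleep only when no device is active" is an assumption drawn
from the UART remark above.

```c
#include <assert.h>
#include <stddef.h>

/* State names as used in this thread; the numeric values are assumed. */
enum {
	DEVICE_PM_ACTIVE_STATE,
	DEVICE_PM_SUSPEND_STATE,
};

/* Hypothetical clock sources: PLL_0 feeds I2C and SPI, PLL_1 feeds UART. */
enum pll_id {
	PLL_0,
	PLL_1,
};

/* Hypothetical per-device record kept by the PM service. */
struct pm_dev {
	const char *name;
	enum pll_id pll;
	int state;
};

/* Returns 1 if no device fed by 'pll' is active, so the PLL can be gated. */
int pll_can_be_gated(const struct pm_dev *devs, size_t count, enum pll_id pll)
{
	for (size_t i = 0; i < count; i++) {
		if (devs[i].pll == pll &&
		    devs[i].state == DEVICE_PM_ACTIVE_STATE) {
			return 0;
		}
	}
	return 1;
}

/* Returns 1 only if no device at all is active (assumed deep sleep rule). */
int deep_sleep_allowed(const struct pm_dev *devs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (devs[i].state == DEVICE_PM_ACTIVE_STATE) {
			return 0;
		}
	}
	return 1;
}
```

With UART active on PLL_1 and both I2C and SPI suspended on PLL_0,
pll_can_be_gated() reports that PLL_0 can be gated while
deep_sleep_allowed() still returns 0, which matches the outcome described
in the example.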

Comments/concerns welcome.



A member of the Intel Corporation group of companies

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.