linux-kernel - Re: [PATCH v5 4/4] platform/x86/intel/pmc: core: Report duration of time in HW sleep state


Message-ID: <3f7ec97b-5fa6-9c54-b4b2-58ebf4f88449@amd.com>
Date:   Mon, 3 Apr 2023 13:07:43 -0500
From:   "Limonciello, Mario" <mario.limonciello@....com>
To:     "Box, David E" <david.e.box@...el.com>,
        "rafael@...nel.org" <rafael@...nel.org>
Cc:     "jstultz@...gle.com" <jstultz@...gle.com>,
        "Shyam-sundar.S-k@....com" <Shyam-sundar.S-k@....com>,
        "markgross@...nel.org" <markgross@...nel.org>,
        "rrangel@...omium.org" <rrangel@...omium.org>,
        "Jain, Rajat" <rajatja@...gle.com>,
        "irenic.rajneesh@...il.com" <irenic.rajneesh@...il.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "hdegoede@...hat.com" <hdegoede@...hat.com>,
        "svenva@...omium.org" <svenva@...omium.org>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "platform-driver-x86@...r.kernel.org" 
        <platform-driver-x86@...r.kernel.org>
Subject: Re: [PATCH v5 4/4] platform/x86/intel/pmc: core: Report duration of
 time in HW sleep state

On 4/3/2023 13:00, Box, David E wrote:
> On Fri, 2023-03-31 at 20:05 +0200, Rafael J. Wysocki wrote:
>> On Thu, Mar 30, 2023 at 9:45 PM Mario Limonciello
>> <mario.limonciello@....com> wrote:
>>>
>>> intel_pmc_core displays a warning when the module parameter
>>> `warn_on_s0ix_failures` is set and a suspend didn't get to a HW sleep
>>> state.
>>>
>>> Report this to the standard kernel reporting infrastructure so that
>>> userspace software can query after the suspend cycle is done.
>>>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>> ---
>>> v4->v5:
>>>   * Reword commit message
>>> ---
>>>   drivers/platform/x86/intel/pmc/core.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/platform/x86/intel/pmc/core.c
>>> b/drivers/platform/x86/intel/pmc/core.c
>>> index e2f171fac094..980af32dd48a 100644
>>> --- a/drivers/platform/x86/intel/pmc/core.c
>>> +++ b/drivers/platform/x86/intel/pmc/core.c
>>> @@ -1203,6 +1203,8 @@ static inline bool pmc_core_is_s0ix_failed(struct
>>> pmc_dev *pmcdev)
>>>          if (pmc_core_dev_state_get(pmcdev, &s0ix_counter))
>>>                  return false;
>>>
>>> +       pm_set_hw_sleep_time(s0ix_counter - pmcdev->s0ix_counter);
>>> +
>>
>> Maybe check if this is really accumulating?  In case of a counter
>> overflow, for instance?
> 
> Overflow is likely on some systems. The counter is only 32-bit and at our
> smallest granularity of 30.5us per tick it could overflow after a day and a half
> of s0ix time, though most of our systems have a higher granularity that puts
> them around 6 days.
> 
> This brings up an issue that the attribute cannot be trusted if the system is
> suspended for longer than the maximum hardware counter time. Should be noted in
> the Documentation.
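
The wrap times quoted above can be sanity-checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and the 122 us tick is an assumed value chosen to match the "around 6 days" figure, not something stated in the thread.

```c
#include <stdint.h>

#define NS_PER_HOUR UINT64_C(3600000000000)

/*
 * Nanoseconds a free-running 32-bit counter can cover before it wraps,
 * given the length of one tick in nanoseconds (hypothetical helper).
 */
static uint64_t wrap_period_ns(uint64_t tick_ns)
{
	return (UINT64_C(1) << 32) * tick_ns;
}

/* Same span expressed in whole hours, rounded down. */
static uint64_t wrap_period_hours(uint64_t tick_ns)
{
	return wrap_period_ns(tick_ns) / NS_PER_HOUR;
}
```

With a 30.5 us tick (30500 ns) this gives 36 whole hours, which matches "a day and a half". With the assumed 122 us tick (122000 ns) it gives 145 whole hours, a little over 6 days.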

I think it would be rather confusing for userspace to have to account for 
this; it's better to abstract it in the kernel.

How can you discover the granularity a system can support?
How would you know overflow actually happened?  Is there a bit somewhere 
else that could tell you?

In terms of ABI, how about this: when we know overflow occurred and 
userspace reads the sysfs file, we return -EOVERFLOW instead of a 
potentially bad value?
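
To make the -EOVERFLOW idea concrete, here is a rough userspace-compilable sketch. It is not the driver code: hw_sleep_ticks() and its parameters are hypothetical, and it assumes the total suspend duration is available from an independent clock, expressed in counter ticks. Under that assumption, unsigned 32-bit subtraction already gives the right delta across a single wrap, and the value is only ambiguous when the suspend itself lasted a full counter period or longer.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the HW sleep residency in ticks from two
 * 32-bit counter snapshots.
 *
 * before        - counter value read at suspend
 * after         - counter value read at resume
 * suspend_ticks - total suspend duration in counter ticks, taken from an
 *                 independent clock (an assumption of this sketch)
 * out           - receives the residency delta on success
 *
 * Returns 0 on success, or -EOVERFLOW when the suspend lasted at least
 * one full counter period, so the counter may have wrapped all the way
 * around and the delta cannot be trusted.
 */
static int hw_sleep_ticks(uint32_t before, uint32_t after,
			  uint64_t suspend_ticks, uint64_t *out)
{
	if (suspend_ticks >= (UINT64_C(1) << 32))
		return -EOVERFLOW;

	/* Modulo-2^32 subtraction stays correct across a single wrap. */
	*out = (uint32_t)(after - before);
	return 0;
}
```

A sysfs show handler could then pass the error straight through to the reader instead of printing a number, which keeps the wrap handling out of userspace.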