linux-kernel - Re: [RFC/RFT][PATCH v0.1] ACPI: OSL: Use usleep_range() in acpi_os


Message-ID: <CAJvTdKm4Fermz1zgTWohEGSoGpoB3CJL2FF-u6y9FAEBwBbcnQ@mail.gmail.com>
Date: Wed, 4 Dec 2024 16:41:43 -0500
From: Len Brown <lenb@...nel.org>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>
Cc: Linux ACPI <linux-acpi@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	Linux PM <linux-pm@...r.kernel.org>, Len Brown <len.brown@...el.com>, 
	Arjan van de Ven <arjan@...ux.intel.com>, Pierre Gondois <pierre.gondois@....com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Hans de Goede <hdegoede@...hat.com>, 
	Mario Limonciello <mario.limonciello@....com>, "Gautham R. Shenoy" <gautham.shenoy@....com>
Subject: Re: [RFC/RFT][PATCH v0.1] ACPI: OSL: Use usleep_range() in acpi_os_sleep()

On Thu, Nov 21, 2024 at 8:15 AM Rafael J. Wysocki <rjw@...ysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>
> As stated by Len in [1], the extra delay added by msleep() to the
> sleep time value passed to it can be significant, roughly between
> 1.5 ms on systems with HZ = 1000 and as much as 15 ms on systems with
> HZ = 100, which is hardly acceptable, at least for small sleep time
> values.

Maybe the problem statement is clearer with a concrete example:

msleep(5) on the default HZ=250 on a modern PC takes about 11.9 ms.
This results in over 800 ms of spurious system resume delay
on systems such as the Dell XPS-13-9300, which use ASL Sleep(5ms)
in a tight loop.

(yes, this additional cost used to be over 1200 ms before the v6.12
msleep rounding fix)
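
The 11.9 ms figure is close to what a simple jiffy-rounding model predicts.
A minimal user-space sketch, assuming msleep() rounds the request up to
whole jiffies and that one further jiffy passes before the timer fires
(the helper names are illustrative, not kernel API):

```c
#include <assert.h>

/* Round a millisecond request up to whole jiffies (one jiffy = 1000/HZ ms). */
static unsigned long model_msecs_to_jiffies(unsigned long ms, unsigned long hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}

/* Modeled msleep() duration in ms: rounded-up jiffies plus one extra jiffy. */
static double model_msleep_ms(unsigned long ms, unsigned long hz)
{
	unsigned long ticks = model_msecs_to_jiffies(ms, hz) + 1;

	return ticks * 1000.0 / hz;
}
```

Under this model, model_msleep_ms(5, 250) gives 12 ms, in line with the
measurement above, and model_msleep_ms(5, 100) gives 20 ms, i.e. 15 ms of
extra delay on HZ = 100.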

> -       msleep(ms);
> +       u64 usec = ms * USEC_PER_MSEC, delta_us = 50;

> +       if (ms > 5)
> +               delta_us = (USEC_PER_MSEC / 100) * ms;
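
The arithmetic in the quoted hunk can be tried out in user space. This is
only a sketch, assuming the resulting range is [usec, usec + delta_us];
the struct and function names are made up for illustration, and only the
two computed values come from the patch fragment above:

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_MSEC 1000ULL

struct sleep_window {
	uint64_t min_us;	/* requested sleep time */
	uint64_t max_us;	/* requested sleep time plus allowed slack */
};

static struct sleep_window model_sleep_window(uint64_t ms)
{
	uint64_t usec, delta_us = 50;
	struct sleep_window w;

	/* Guard against overflow of usec + delta_us (at most 1010 * ms). */
	assert(ms <= UINT64_MAX / (USEC_PER_MSEC + USEC_PER_MSEC / 100));

	usec = ms * USEC_PER_MSEC;

	/* Above 5 ms the slack grows to 1% of the sleep time. */
	if (ms > 5)
		delta_us = (USEC_PER_MSEC / 100) * ms;

	w.min_us = usec;
	w.max_us = usec + delta_us;
	return w;
}
```

For example, model_sleep_window(5) yields 5000..5050 usec, while
model_sleep_window(100) yields 100000..101000 usec.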

I measured 100 resume cycles on the Dell XPS 13 9300 on 4 kernels.
Here is the measured fastest kernel resume time in msec for each:

1. 1921.292 v6.12 msleep (baseline)
2. 1115.579 v6.12 delta_us = (USEC_PER_MSEC / 100) * ms (this patch)
3. 1113.396 v6.12 delta_us = 50
4. 1107.835 v6.12 delta_us = 0

(I didn't average the 100 runs, because random very long device
hiccups throw off the average)

So any of #2, #3 and #4 are a huge step forward from what is shipping today!
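
To put numbers on that, here is a small sketch that subtracts each
measured time from the msleep baseline. The values are copied from the
list above; the array and function names are illustrative only:

```c
#include <assert.h>

/* Fastest resume times in ms, copied from the list above (#1..#4). */
static const double resume_ms[4] = { 1921.292, 1115.579, 1113.396, 1107.835 };

/* Resume time saved by option n (2..4) relative to the msleep baseline (#1). */
static double saved_vs_baseline_ms(int n)
{
	assert(n >= 2 && n <= 4);
	return resume_ms[0] - resume_ms[n - 1];
}
```

With these figures the three usleep_range() variants save roughly 805.7,
807.9 and 813.5 ms respectively, and differ from each other by less
than 8 ms.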

So considering #2 vs #3 vs #4....

I agree that it is a problem for the timer sub-system to work to
maintain a 1 ns granularity that it can't actually deliver.

I think it is fine for the timer sub-system to allow calls to opt into
timer slack -- some callers may actually know what number to use.

However, I don't think that the timer sub-system should force callers to
guess how much slack is appropriate.  I think that a caller with 0 slack
should be internally rounded up by the timer sub-system to the
granularity that it can actually deliver with the timer that is
currently in use on that system.
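
As a rough illustration of that idea (not existing kernel code): the
helper below is hypothetical, and the granularity value passed to it is
an assumption standing in for whatever the active timer hardware can
really resolve.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the slack actually applied when a caller asks for
 * requested_ns on a timer that can only resolve granularity_ns.
 * A request of 0 is rounded up to the deliverable granularity; larger
 * requests are left alone.
 */
static uint64_t effective_slack_ns(uint64_t requested_ns, uint64_t granularity_ns)
{
	assert(granularity_ns > 0);
	return requested_ns < granularity_ns ? granularity_ns : requested_ns;
}
```

With an assumed granularity of 1000 ns, a caller passing 0 would get
1000 ns of slack, while a caller passing 50000 ns would keep 50000 ns.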

Note also that slack of 0 doesn't mean that no coalescing can happen.
A slack=0 timer can land within the slack of another timer, and the other
timer will be pulled forward to coalesce.
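
A toy model of that case, assuming each timer may fire anywhere between
its requested expiry and that expiry plus its slack. This is a simplified
sketch with made-up names, not the hrtimer implementation:

```c
#include <assert.h>
#include <stdint.h>

/* A timer may fire anywhere in [soft_ns, soft_ns + slack_ns]. */
struct model_timer {
	uint64_t soft_ns;	/* earliest acceptable expiry */
	uint64_t slack_ns;	/* how much later the caller tolerates */
};

/* Latest time at which the timer must fire. */
static uint64_t hard_ns(struct model_timer t)
{
	assert(t.slack_ns <= UINT64_MAX - t.soft_ns);
	return t.soft_ns + t.slack_ns;
}

/*
 * Time at which timer b fires, given that a wakeup for timer a happens
 * at a's hard expiry. If that wakeup lands inside b's window, b is
 * handled at the same wakeup; otherwise b waits for its own deadline.
 */
static uint64_t fire_time_ns(struct model_timer a, struct model_timer b)
{
	uint64_t wake = hard_ns(a);

	if (wake >= b.soft_ns && wake <= hard_ns(b))
		return wake;		/* b is pulled forward to a's wakeup */
	return hard_ns(b);		/* no overlap: b fires on its own */
}
```

For example, a slack=0 timer at 1000000 ns and a second timer with
soft_ns = 980000 and slack_ns = 50000 share one wakeup at 1000000 ns,
instead of the second timer waiting until 1030000 ns.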

The 50 usec default for user timer slack is certainly a magic number born
of tests of interesting workloads on interesting systems on a certain date.
It may not be the right number for other workloads, or other systems
with other timers on other dates.

My opinion...

I don't see a justification for increasing timer slack with increasing duration.
User-space timers don't pay this additional delay, so why should the ASL
programmer?

Also, the graduated increasing slack with duration is a guess no more
valid than the guess of a flat 50 usec.

A flat 50 or a flat 0 have the virtue of being simple -- they will be simpler
to understand and maintain in the future.

But I can live with any of these options, since they are all a big step forward.

thanks,
Len Brown, Intel Open Source Technology Center