lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCopf___L_3cARqK8WDHSJWFYYikxvANHh2CRMjDRag9yw@mail.gmail.com>
Date:   Fri, 10 Feb 2023 17:04:28 -0800
From:   John Stultz <jstultz@...gle.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Michael Nazzareno Trimarchi <michael@...rulasolutions.com>,
        Michael <michael@...isi.de>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        linux-rtc@...r.kernel.org, Stephen Boyd <sboyd@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: Problem when function alarmtimer_suspend returns 0 if time delta
 is zero

On Thu, Feb 9, 2023 at 7:40 AM Thomas Gleixner <tglx@...utronix.de> wrote:
> On Thu, Feb 09 2023 at 12:19, Michael Nazzareno Trimarchi wrote:
> > On Wed, Feb 8, 2023 at 7:06 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >> I wrote that patch against the back then mainline code. No idea if it's
> >> still applying, but the underlying issue is still the same AFAICT.
> >>
> >> It needs some polishing and a proper changelog.
> >>
> > Ok, I will try to update it on some mainline kernel in my environment
> > and test it back. I need
> > a little information if it's possible. Consider that I have no
> > experience in this area. I understand how
> > code was designed in general but the part around the freezer and all
> > those code you remove, what was the logic behind in the removed code?
>
> What I can oracle out of that well commented gem is:
>
>   A userspace task invokes clock_nanosleep(CLOCK_*_ALARM, ...), which
>   arms an alarm timer. The expiry of an alarmtimer causes the system to
>   come out of suspend.
>
>   As the task invokes schedule() it can also be brought back from
>   schedule() via a signal. If that happens then the task cancels the
>   alarmtimer and returns to handle the signal. While doing that it can
>   be frozen, which means the alarm and therefore the wake from suspend
>   is lost.
>
>   To prevent that the code tries to save the earliest expiring alarm if
>   the task is marked freezing() and the suspend code takes that into
>   account.
>
> John, did I summarize that correctly?
>
> The change I made remove that magic and marks the task freezable when it
> goes to schedule, which prevents the signal wakeup. That ensures that
> the alarm stays armed during freeze/suspend.
>
> That obviously needs some testing and scrunity by the folks which use
> this mechanism. IIRC that's used by android, but I might be wrong.

So, thanks for dredging this old thread up, I'm sorry I didn't see it
the first time it came around.

Not having a clear memory of why we do the (min == 0) early return, I
went digging in, and found it was in the original git commit, so I
went looking to the archives.

It's completely not present in the first version of the patch:
  https://lore.kernel.org/lkml/1288809079-14663-8-git-send-email-john.stultz@linaro.org/

But it did appear in the second version:
  https://lore.kernel.org/lkml/1290136329-18291-6-git-send-email-john.stultz@linaro.org/

And from there it's a clear case of wanting to avoid setting the RTC
if there were just no timers to expire.

But, indeed this is a bug, as initializing min to ktime_set(0, 0) as
the "invalid" case isn't a good plan, as it might be possible that
suspend is run right as an alarmtimer expires, and you get a real zero
delta value (as has been reported).

Instead it seemed I should have used KTIME_MAX as the "invalid" case
(as Thomas' patch uses).

However, before the patch was merged, it changed further to handle the
freezer waking a current sleeper:
  https://lore.kernel.org/lkml/1294280159-2513-13-git-send-email-john.stultz@linaro.org/

Which was then used to initialize the min value (still erroneously
using 0 as an "invalid" value) in the case that the freezer woke a
task sleeping which would cause the alarm to be lost (as Thomas
summarized).

Thomas' patch fixes the erronious 0-as-invalid initialization issue
using KTIME_MAX but also simplifies the logic getting rid of the
freezer handling.

I don't have as much familiarity with the freezer handling change, so
while it looks sane, I can't say I would likely catch an issue doing a
visual review.

Michael: If you are still intending to send the patch out, please do,
otherwise I've already forward ported it so I can do some testing with
it. I'm happy to put together a commit message and send it out if
that's easier for you.

And, just for correct Reported-by: tags: Michael Trimarchi, are you
the same Michael (michael@...isi.de) that originally reported this
issue?

thanks
-john

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ