Message-ID: <CAOf5uwnyNE=o=BE_-+oHGRfNvPEtw5jtTuBj45uOCiKNihFBrQ@mail.gmail.com>
Date:   Thu, 2 Mar 2023 10:34:46 +0100
From:   Michael Nazzareno Trimarchi <michael@...rulasolutions.com>
To:     John Stultz <jstultz@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephen Boyd <sboyd@...nel.org>, Arnd Bergmann <arnd@...db.de>,
        Michael <michael@...isi.de>, kernel-team@...roid.com,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [RFC][PATCH 2/2] time: alarmtimer: Use TASK_FREEZABLE to cleanup
 freezer handling

Hi John

On Thu, Mar 2, 2023 at 1:48 AM John Stultz <jstultz@...gle.com> wrote:
>
> On Wed, Mar 1, 2023 at 2:11 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > On Mon, Feb 27 2023 at 20:06, John Stultz wrote:
> > > On Mon, Feb 27, 2023 at 4:03 PM John Stultz <jstultz@...gle.com> wrote:
> > >> > On Mon, Feb 20 2023 at 19:11, Michael Nazzareno Trimarchi wrote:
> > >> > +static int alarmtimer_pm_notifier_fn(struct notifier_block *bl, unsigned long state,
> > >> > +                                    void *unused)
> > >> > +{
> > >> > +       switch (state) {
> > >> > +       case PM_HIBERNATION_PREPARE:
> > >> > +       case PM_POST_HIBERNATION:
> > >> > +               atomic_set(&alarmtimer_wakeup, 0);
> > >> > +               break;
> > >> > +       }
> > >> > +       return NOTIFY_DONE;
> > >>
> > >> But here, we're setting the alarmtimer_wakeup count to zero if we get
> > >> PM_HIBERNATION_PREPARE or PM_POST_HIBERNATION notifications?
> > >> And Michael noted we need to add PM_SUSPEND_PREPARE and
> > >> PM_POST_SUSPEND there for this to seemingly work.
> >
> > Yup. I missed those when sending out that hack.
> >
> > > So Thomas's notifier method of zeroing at the beginning of suspend and
> > > tracking any wakeups after that point makes more sense now. It still
> > > feels a bit messy, but I'm not sure there's something better.
> >
> > I'm not enthused about it either.
>
> That said, it does work. :) In my testing, your approach has been
> reliable, so it has that going for it.
>
> > > My only thought is this feels a little bit like it's mirroring what the
> > > pm_wakeup_event() logic is supposed to do. Should we be adding a
> > > pm_wakeup_event() to alarmtimer_fired() to try to prevent suspend from
> > > occurring for 500ms or so after an alarmtimer has fired so there is
> > > enough time for it to be re-armed if needed?
> >
> > The question is whether this can be called unconditionally and how that
> > interacts with the suspend logic. Rafael?
>
> I took a brief stab at this, and one thing is the test needs to use
> the /sys/power/wakeup_count dance before suspending.
> However, I still had some cases where the recurring alarmtimer got
> lost, so I need to dig a bit more to understand what was going wrong
> there.
>
> In the meantime, I'm ok with Thomas' approach, but we probably need
> some comment documentation that suggests it might be reworked in a
> cleaner way?
>
> thanks
> -john
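
For reference, the /sys/power/wakeup_count dance John mentions can be
sketched from user space roughly as below. This is only a minimal sketch:
the function and helper names are made up for illustration, and the paths
are passed in as parameters (on a real system they would be
/sys/power/wakeup_count and /sys/power/state) so the sequence can be tried
against ordinary files first.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read a small text file into buf (NUL-terminated). */
static int read_file(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;
	int err;

	if (fd < 0)
		return -errno;
	n = read(fd, buf, len - 1);
	err = n < 0 ? -errno : 0;
	close(fd);
	if (err)
		return err;
	buf[n] = '\0';
	return 0;
}

/* Hypothetical helper: write a string to an existing file. */
static int write_file(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int err;

	if (fd < 0)
		return -errno;
	n = write(fd, val, strlen(val));
	err = n < 0 ? -errno : 0;
	close(fd);
	return err;
}

/*
 * The "dance":
 *   1. read the current wakeup count,
 *   2. write the same value back; on the real sysfs file this write is
 *      rejected if a wakeup event was registered in the meantime,
 *   3. only if step 2 succeeded, request the sleep state (e.g. "mem").
 * Returns 0 on success or a negative errno; a failure in step 2 means the
 * caller should start over instead of suspending.
 */
int suspend_with_wakeup_count(const char *count_path,
			      const char *state_path,
			      const char *state)
{
	char count[32];
	int err;

	err = read_file(count_path, count, sizeof(count));
	if (err)
		return err;

	err = write_file(count_path, count);
	if (err)
		return err;

	return write_file(state_path, state);
}
```

A test would call suspend_with_wakeup_count("/sys/power/wakeup_count",
"/sys/power/state", "mem") as root and retry when the write-back step
fails.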

For now I have pushed a commit with the following message to our internal
devices:

time: alarmtimer: Fix wakeup lost between freeze(alarmtask) and alarmtimer_suspend()

    An alarm timer can fire between the freezing of the alarm task and
    alarmtimer_suspend(), as shown in the output below:

    > [   89.674127] PM: suspend entry (deep)
    > [   89.714916] Filesystems sync: 0.037 seconds
    > [   89.733594] Freezing user space processes
    > [   89.740680] Freezing user space processes completed (elapsed 0.002 seconds)

    User space tasks are frozen now.

    > [   89.748593] OOM killer disabled.
    > [   89.752257] Freezing remaining freezable tasks
    > [   89.756807] alarmtimer_fired: called
    > [   89.756831] alarmtimer_dequeue: called <---- HERE

    Here the underlying hrtimer fires before devices are suspended, so
    send_sigqueue() cannot wake up the task because task->state ==
    TASK_FROZEN. The signal therefore won't be handled and the timer won't
    be rearmed until the task is thawed.

    The alarmtimer_suspend() path won't see a pending timer anymore because
    it is dequeued.

    So precisely the time between freeze(alarmtask) and alarmtimer_suspend()
    is a gaping hole which guarantees lost wakeups.

    That hole has been there forever.

    The old horrible freezer hackery was supposed to plug that
    hole, but that gem is not solving anything as far as I understand what
    it is doing.

Part of this text is taken from Thomas's explanation earlier in the
discussion.
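
To make the counter idea concrete, here is a small user-space model of the
notifier approach discussed above, with the two suspend events added. It is
not the kernel patch: the enum values are stand-ins for the kernel
constants, the model_* names are invented for this sketch, and returning
-EBUSY from the suspend step when an alarm fired after the PREPARE
notification is my assumption about how the counter would be consumed.

```c
#include <errno.h>
#include <stdatomic.h>

/* Stand-ins for the kernel's PM notifier events; values are illustrative. */
enum model_pm_event {
	PM_HIBERNATION_PREPARE = 1,
	PM_POST_HIBERNATION,
	PM_SUSPEND_PREPARE,
	PM_POST_SUSPEND,
};

#define NOTIFY_DONE 0

/* Counts alarm expiries seen since the last PM notification. */
static atomic_int alarmtimer_wakeup;

/* Mirrors the quoted notifier, extended with the two suspend events. */
int model_pm_notifier(unsigned long state)
{
	switch (state) {
	case PM_HIBERNATION_PREPARE:
	case PM_POST_HIBERNATION:
	case PM_SUSPEND_PREPARE:
	case PM_POST_SUSPEND:
		atomic_store(&alarmtimer_wakeup, 0);
		break;
	}
	return NOTIFY_DONE;
}

/* Model of the expiry path: remember that an alarm fired. */
void model_alarmtimer_fired(void)
{
	atomic_fetch_add(&alarmtimer_wakeup, 1);
}

/*
 * Model of the suspend step (assumption): if an alarm fired after the
 * PREPARE notification, its timer is already dequeued and the frozen
 * task cannot rearm it, so refuse to suspend instead of losing it.
 */
int model_alarmtimer_suspend(void)
{
	if (atomic_load(&alarmtimer_wakeup) > 0)
		return -EBUSY;
	return 0;
}
```

With this model, PREPARE followed by an expiry makes the suspend step fail,
while PREPARE with no expiry lets it proceed, which is the window described
in the commit message.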

Michael
