[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0iWZqAcbqdTuKbo35Gk6vNS0h9De20GDNZeZvqfhQiSWA@mail.gmail.com>
Date: Thu, 2 Mar 2023 15:32:49 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Michael Nazzareno Trimarchi <michael@...rulasolutions.com>,
John Stultz <jstultz@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
Stephen Boyd <sboyd@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Michael <michael@...isi.de>, kernel-team@...roid.com,
Peter Zijlstra <peterz@...radead.org>,
"Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [RFC][PATCH 2/2] time: alarmtimer: Use TASK_FREEZABLE to cleanup
freezer handling
On Mon, Feb 20, 2023 at 10:19 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Michael!
>
> On Mon, Feb 20 2023 at 19:11, Michael Nazzareno Trimarchi wrote:
> > On Mon, Feb 20, 2023 at 6:48 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >> [ 27.349352] alarmtimer_enqueue()
> >>
> >> U: Before SUSPEND
> >>
> >> [ 31.353228] PM: suspend entry (s2idle)
> >> [ 31.388857] Filesystems sync: 0.033 seconds
> >> [ 31.418427] Freezing user space processes
> >> [ 31.422406] Freezing user space processes completed (elapsed 0.002 seconds)
> >> [ 31.425435] OOM killer disabled.
> >> [ 31.426833] Freezing remaining freezable tasks
> >> [ 31.429838] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> >> [ 31.432922] printk: Suspending console(s) (use no_console_suspend to debug)
> >> [ 31.435912] alarmtimer alarmtimer.0.auto: PM: dpm_run_callback(): platform_pm_suspend+0x0/0x50 returns -16
> >> [ 31.435954] alarmtimer alarmtimer.0.auto: PM: failed to suspend: error -16
> >>
> >> That means the RTC interrupt was raised before the system was able to suspend.
> >
> > if (ktime_to_ns(min) < 2 * NSEC_PER_SEC) {
> > pm_wakeup_event(dev, 2 * MSEC_PER_SEC);
> > return -EBUSY;
> > }
> >
> > I think that above happens to you. So it means that you are too close
> > to this limit, can be?
>
> Maybe.
>
> > Yes but the alarm for me was set to be fired just before freezing. Is
> > this a valid scenario?
>
> Sure why not?
>
> > Let's say that I set an alarm to be fired just before the userspace
> > freeze happens. If I'm close to it then then process will not process
> > the signal to enquene again the alarm in the list and then during
> > alarm suspend the list will be empty for the above.
>
> Halfways, but slowly your explanations start to make sense. Here is the
> dmesg output you provided:
>
> > [ 89.674127] PM: suspend entry (deep)
> > [ 89.714916] Filesystems sync: 0.037 seconds
> > [ 89.733594] Freezing user space processes
> > [ 89.740680] Freezing user space processes completed (elapsed 0.002 seconds)
>
> User space tasks are frozen now.
>
> > [ 89.748593] OOM killer disabled.
> > [ 89.752257] Freezing remaining freezable tasks
> > [ 89.756807] alarmtimer_fired: called
> > [ 89.756831] alarmtimer_dequeue: called <---- HERE
>
> Here fires the underlying hrtimer before devices are suspended, so the
> sig_sendqueue() cannot wake up the task because task->state ==
> TASK_FROZEN, which means the signal wont be handled and the timer wont
> be rearmed until the task is thawed.
>
> And as you correctly observed the alarmtimer_suspend() path won't see a
> pending timer anymore because it is dequeued.
>
> So precisely the time between freeze(alarmtask) and alarmtimer_suspend()
> is a gaping hole which guarantees lost wakeups.
>
> That's completely unrelated to Johns patches. That hole has been there
> forever.
>
> I assume that this horrible freezer hackery was supposed to plug that
> hole, but that gem is not solving anything as far as I understand what
> it is doing. I'm still failing to understand what it is supposed to
> solve, but that's a different story.
>
> Aside of that I can clearly reproduce the issue, now that I understand
> what you are trying to tell, on plain 6.2 without and with John's
> cleanup.
>
> Something like the below plugs the hole reliably.
>
> Thanks,
>
> tglx
> ---
> --- a/kernel/time/alarmtimer.c
> +++ b/kernel/time/alarmtimer.c
> @@ -26,6 +26,7 @@
> #include <linux/freezer.h>
> #include <linux/compat.h>
> #include <linux/module.h>
> +#include <linux/suspend.h>
> #include <linux/time_namespace.h>
>
> #include "posix-timers.h"
> @@ -176,6 +177,7 @@ static void alarmtimer_dequeue(struct al
> alarm->state &= ~ALARMTIMER_STATE_ENQUEUED;
> }
>
> +static atomic_t alarmtimer_wakeup;
>
> /**
> * alarmtimer_fired - Handles alarm hrtimer being fired.
> @@ -194,6 +196,8 @@ static enum hrtimer_restart alarmtimer_f
> int ret = HRTIMER_NORESTART;
> int restart = ALARMTIMER_NORESTART;
>
> + atomic_inc(&alarmtimer_wakeup);
> +
This appears to be still somewhat racy, because the notifier can run
at this point AFAICS.
> spin_lock_irqsave(&base->lock, flags);
> alarmtimer_dequeue(base, alarm);
> spin_unlock_irqrestore(&base->lock, flags);
> @@ -244,6 +248,16 @@ static int alarmtimer_suspend(struct dev
> if (!rtc)
> return 0;
>
> + /*
> + * Handle wakeups which happened between the start of suspend and
> + * now as those wakeups might have tried to wake up a frozen task
> + * which means they are not longer in the alarm timer list.
> + */
> + if (atomic_read(&alarmtimer_wakeup)) {
> + pm_wakeup_event(dev, 0);
> + return -EBUSY;
> + }
> +
> /* Find the soonest timer to expire*/
> for (i = 0; i < ALARM_NUMTYPE; i++) {
> struct alarm_base *base = &alarm_bases[i];
> @@ -296,6 +310,31 @@ static int alarmtimer_resume(struct devi
> return 0;
> }
>
> +static int alarmtimer_pm_notifier_fn(struct notifier_block *bl, unsigned long state,
> + void *unused)
> +{
> + switch (state) {
> + case PM_HIBERNATION_PREPARE:
> + case PM_POST_HIBERNATION:
> + atomic_set(&alarmtimer_wakeup, 0);
> + break;
> + }
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block alarmtimer_pm_notifier = {
> + .notifier_call = alarmtimer_pm_notifier_fn,
> +};
> +
> +static inline int alarmtimer_register_pm_notifier(void)
> +{
> + return register_pm_notifier(&alarmtimer_pm_notifier);
> +}
> +
> +static inline void alarmtimer_unregister_pm_notifier(void)
> +{
> + unregister_pm_notifier(&alarmtimer_pm_notifier);
> +}
> #else
> static int alarmtimer_suspend(struct device *dev)
> {
> @@ -306,6 +345,15 @@ static int alarmtimer_resume(struct devi
> {
> return 0;
> }
> +
> +static inline int alarmtimer_register_pm_notifier(void)
> +{
> + return 0;
> +}
> +
> +static inline void alarmtimer_unregister_pm_notifier(void)
> +{
> +}
> #endif
>
> static void
> @@ -904,11 +952,17 @@ static int __init alarmtimer_init(void)
> if (error)
> return error;
>
> - error = platform_driver_register(&alarmtimer_driver);
> + error = alarmtimer_register_pm_notifier();
> if (error)
> goto out_if;
>
> + error = platform_driver_register(&alarmtimer_driver);
> + if (error)
> + goto out_pm;
> +
> return 0;
> +out_pm:
> + alarmtimer_unregister_pm_notifier();
> out_if:
> alarmtimer_rtc_interface_remove();
> return error;
Powered by blists - more mailing lists