[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160714005524.GA517@swordfish>
Date: Thu, 14 Jul 2016 09:55:24 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To: Viresh Kumar <viresh.kumar@...aro.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Jan Kara <jack@...e.cz>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
rjw@...ysocki.net, Tejun Heo <tj@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
vlevenetz@...sol.com, vaibhav.hiremath@...aro.org,
alex.elder@...aro.org, johan@...nel.org, akpm@...ux-foundation.org,
rostedt@...dmis.org, linux-pm@...r.kernel.org,
Petr Mladek <pmladek@...e.com>
Subject: Re: [Query] Preemption (hogging) of the work handler
Hello,
On (07/13/16 08:39), Viresh Kumar wrote:
[..]
> Maybe not, as this can still lead to the original bug we were all
> chasing. This may hog some other CPU if we are doing excessive
> printing in suspend :(
excessive printing is just part of the problem here. if we cab cond_resched()
in console_unlock() (IOW, we execute console_unlock() with preemption and
interrupts enabled) then everything must be ok, and *from printing POV* there
is no difference whether it's printk_kthread or anything else in this case.
the difference jumps in when original console_unlock() is executed with
preemption/irq disabled, then offloading it to schedulable printk_kthread is
the right thing.
> suspend_console() is called quite early, so for example in my case we
> do lots of printing during suspend (not from the suspend thread, but
> an IRQ handled by the USB subsystem, which removes a bus with help of
> some other thread probably).
a silly question -- can we suspend consoles later?
part of suspend/hibernation is cpu_down(), which lands in console_cpu_notify(),
that does synchronous printing for every CPU taken down:
static int console_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu)
{
switch (action) {
case CPU_ONLINE:
case CPU_DEAD:
case CPU_DOWN_FAILED:
case CPU_UP_CANCELED:
console_lock();
console_unlock();
^^^^^^^^^^^^^^
}
return NOTIFY_OK;
}
console_unlock() is synchronous (I posted a very early draft patch that makes
it asynchronous, but that's a future work). so if there is a ton of printk()-s,
then console_unlock() will print it, 100% guaranteed. even if printk_kthread
is doing the printing job at the moment, cpu down path will wait for it to
stop, lock the console semaphore, and got to console_unlock() printing loop.
in printk that you have posted, that will happen not only for CPU_DEAD,
but for CPU_DYING as well (possibly, there is a /* invoked with preemption
disabled, so defer */ comment, so may be you never endup doing direct
printk there, but then you schedule a console_unlock() work).
> That is why my Hacky patch tried to do it after devices are removed
> and irqs are disabled, but before syscore users are suspended (and
> timekeeping is one of them). And so it fixes it for me completely.
>
> IOW, we should switch back to synchronous printing after disabling
> interrupts on the last running CPU.
>
> And I of course agree with Rafael that we would need something similar
> in Hibernation code path as well, if we choose to fix it my way.
suspend/hibernation/kexec - all covered by this patch.
-ss
Powered by blists - more mailing lists