[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFH1YnN4_BSP8OywYpLfBHnRRThpk27PEmfYe4baaOkO6FQaNA@mail.gmail.com>
Date: Mon, 6 Jan 2020 10:44:42 +0800
From: Zhenzhong Duan <zhenzhong.duan@...il.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Arnd Bergmann <arnd@...db.de>
Subject: Re: [PATCH RESEND] ttyprintk: fix a potential sleeping in interrupt
context issue
On Fri, Jan 3, 2020 at 4:43 PM Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
>
> On Fri, Jan 03, 2020 at 11:45:41AM +0800, Zhenzhong Duan wrote:
> > Google syzbot reports:
> > BUG: sleeping function called from invalid context at
> > kernel/locking/mutex.c:938
> > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
> > 1 lock held by swapper/1/0:
> > ...
> > Call Trace:
> > <IRQ>
> > dump_stack+0x197/0x210
> > ___might_sleep.cold+0x1fb/0x23e
> > __might_sleep+0x95/0x190
> > __mutex_lock+0xc5/0x13c0
> > mutex_lock_nested+0x16/0x20
> > tpk_write+0x5d/0x340
> > resync_tnc+0x1b6/0x320
> > call_timer_fn+0x1ac/0x780
> > run_timer_softirq+0x6c3/0x1790
> > __do_softirq+0x262/0x98c
> > irq_exit+0x19b/0x1e0
> > smp_apic_timer_interrupt+0x1a3/0x610
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> >
> > Fix it by using spinlock in process context instead of mutex and having
> > interrupt disabled in critical section.
> >
> > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@...il.com>
> > Cc: Arnd Bergmann <arnd@...db.de>
> > Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > ---
> > drivers/char/ttyprintk.c | 15 +++++++++------
> > 1 file changed, 9 insertions(+), 6 deletions(-)
>
> Why was this resent? What differs from the first version that required
> it to be resent?
>
> Always give us a clue here please :)
Sorry, I should have done that.
patch-bot told me my last version is malformed(tabs converted to
spaces) which may be due to I used gmail web browser to send patch.
Now I have direct access to smtp.gmail.com and use 'git send-email',
so that's not an issue now. No functional changes compared to last
version.
>
>
> >
> > diff --git a/drivers/char/ttyprintk.c b/drivers/char/ttyprintk.c
> > index 4f24e46ebe7c..56db949a7b70 100644
> > --- a/drivers/char/ttyprintk.c
> > +++ b/drivers/char/ttyprintk.c
> > @@ -15,10 +15,11 @@
> > #include <linux/serial.h>
> > #include <linux/tty.h>
> > #include <linux/module.h>
> > +#include <linux/spinlock.h>
> >
> > struct ttyprintk_port {
> > struct tty_port port;
> > - struct mutex port_write_mutex;
> > + spinlock_t spinlock;
> > };
> >
> > static struct ttyprintk_port tpk_port;
> > @@ -99,11 +100,12 @@ static int tpk_open(struct tty_struct *tty, struct file *filp)
> > static void tpk_close(struct tty_struct *tty, struct file *filp)
> > {
> > struct ttyprintk_port *tpkp = tty->driver_data;
> > + unsigned long flags;
> >
> > - mutex_lock(&tpkp->port_write_mutex);
> > + spin_lock_irqsave(&tpkp->spinlock, flags);
> > /* flush tpk_printk buffer */
> > tpk_printk(NULL, 0);
>
> Are you sure you can call this with a spinlock held?
I think so.
>
> Doesn't your trace above show the opposite?
That's why I use spin_lock_irqsave() variants rather than spin_lock()
The issue here is tpk_write()/tpk_close() could be interrupted when
holding a mutex, then in timer handler tpk_write() is called again
trying to acquire same mutex, lead to dead lock.
With spin_lock_irqsave(), interrupt is disabled in process context, so
no such issue.
>
> What is wrong with sleeping during the mutex you currently have? How is
> syzbot reporting this error, is there a reproducer somewhere?
See https://syzkaller.appspot.com/bug?extid=2eeef62ee31f9460ad65
I didn't reproduce the dead lock locally, not even for the warning
syzbot reported, but syzbot does.
Thanks
Zhenzhong
Powered by blists - more mailing lists