Message-ID: <87k06un1hi.fsf@email.froward.int.ebiederm.org>
Date: Fri, 26 Aug 2022 16:23:05 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Ye Weihua <yeweihua4@...wei.com>
Cc: <keescook@...omium.org>, <oleg@...hat.com>, <tglx@...utronix.de>,
<chang.seok.bae@...el.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] signal: fix deadlock caused by calling printk() under
sighand->siglock
Ye Weihua <yeweihua4@...wei.com> writes:
> __send_signal_locked() invokes __sigqueue_alloc() which may invoke a
> normal printk() to print a failure message. This can cause a deadlock in
> the scenario reported by syz-bot below (test in 5.10):
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&sighand->siglock);
>                                lock(&tty->read_wait);
>                                lock(&sighand->siglock);
>   lock(console_owner);
>
> This patch specifies __GFP_NOWARN to __sigqueue_alloc(), so that printk
> will not be called, and this deadlock problem can be avoided.
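
The quoted scenario is a lock-ordering cycle: one path takes
console_owner while holding siglock, and another takes siglock while
holding tty->read_wait. Below is a minimal userspace C sketch of how
such a cycle can be detected from recorded "held, then acquired" pairs.
It is not kernel code and not how lockdep is implemented. The enum
names, record_dependency() and reported_scenario_has_cycle() are
hypothetical, and the single edge from console_owner to read_wait is an
assumption standing in for the longer chain that the abbreviated report
does not show.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks in the report. */
enum lock_class { SIGLOCK, READ_WAIT, CONSOLE_OWNER, NR_CLASSES };

/* dep[a][b] is true once some path acquired b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record "acquired while holding held". Returns false, and records
 * nothing, if the new edge would close a cycle (a possible deadlock).
 */
static bool record_dependency(enum lock_class held, enum lock_class acquired)
{
    bool seen[NR_CLASSES] = { false };

    if (reachable(acquired, held, seen))
        return false;
    dep[held][acquired] = true;
    return true;
}

/* Replays the reported ordering; returns true if it forms a cycle. */
static bool reported_scenario_has_cycle(void)
{
    memset(dep, 0, sizeof(dep));
    /* Assumed: the console path eventually reaches tty->read_wait. */
    record_dependency(CONSOLE_OWNER, READ_WAIT);
    /* CPU1: siglock taken while holding tty->read_wait. */
    record_dependency(READ_WAIT, SIGLOCK);
    /* CPU0: printk() under siglock takes console_owner. */
    return !record_dependency(SIGLOCK, CONSOLE_OWNER);
}
```

Under these assumptions reported_scenario_has_cycle() returns true: the
last edge is rejected because siglock is already reachable from
console_owner. Removing any one edge breaks the cycle, which is why both
"never printk under siglock" and "change the other path" are candidate
fixes.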

While the patch below will in theory fix the reported deadlock, I don't
think it is a good choice of fix. As a rule we want to allow printk to
be callable in as many places as possible, so that it can be used for
debugging. There are enough places that take siglock that outlawing
printk under siglock will make the kernel unstable.

I tried to read the current kernel and verify this deadlock, to see if I
could suggest a better location to change the code to fix the deadlock,
say modifying task_work_add to not take siglock. However, the current
task_work_add does not take siglock. I encountered a few other
significant function differences as well. One significant difference is
that io_poll_double_wake no longer exists.

I think the amba-pl011.c driver might be even more different.

Can you reproduce this on current kernels?

Reading the code I think this is already fixed.

Perhaps you want to read the code of the affected subsystems and pick
some appropriate changes to backport to 5.10?

Eric