linux-kernel - Re: [PATCH] signal: fix deadlock caused by calling printk() under sighand->siglock

Message-ID: <87k06un1hi.fsf@email.froward.int.ebiederm.org>
Date:   Fri, 26 Aug 2022 16:23:05 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Ye Weihua <yeweihua4@...wei.com>
Cc:     <keescook@...omium.org>, <oleg@...hat.com>, <tglx@...utronix.de>,
        <chang.seok.bae@...el.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] signal: fix deadlock caused by calling printk() under
 sighand->siglock

Ye Weihua <yeweihua4@...wei.com> writes:

> __send_signal_locked() invokes __sigqueue_alloc() which may invoke a
> normal printk() to print failure message. This can cause a deadlock in
> the scenario reported by syz-bot below (test in 5.10):
>
> 	CPU0				CPU1
> 	----				----
> 	lock(&sighand->siglock);
> 					lock(&tty->read_wait);
> 					lock(&sighand->siglock);
> 	lock(console_owner);
>
> This patch specifies __GFP_NOWARN to __sigqueue_alloc(), so that printk
> will not be called, and this deadlock problem can be avoided.

While the patch below will in theory fix the reported deadlock, I don't
think it is a good choice of fix.  As a rule we want to allow printk to
be callable in as many places as possible, so that it can be used for
debugging.  There are enough places that take siglock that outlawing
printk under siglock would make the kernel much harder to debug.
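
To illustrate the lock ordering in the quoted report, here is a minimal
userspace sketch in Python, not kernel code.  The two orderings marked
REPORTED come from the scenario above.  The console_owner ->
tty->read_wait edge is an assumption: the report implies some such
dependency through the console/tty driver path, but does not show it.
The find_cycle helper is hypothetical and only models what a lock-order
checker looks for.

```python
# Userspace model only. Lock names come from the quoted report; the
# ASSUMED edge is a guess at the dependency the report leaves out.
from collections import defaultdict

REPORTED = [
    ("sighand->siglock", "console_owner"),   # CPU0: printk under siglock
    ("tty->read_wait", "sighand->siglock"),  # CPU1: siglock under read_wait
]
ASSUMED = [
    ("console_owner", "tty->read_wait"),     # assumption: console/tty path
]


def find_cycle(edges):
    """Return one lock-order cycle as a list of lock names, or None.

    Each edge (held, wanted) means 'wanted' was acquired while 'held'
    was already held.
    """
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    visiting = []  # locks on the current DFS path, in order
    done = set()   # locks fully explored without finding a cycle

    def dfs(lock):
        if lock in visiting:
            return visiting[visiting.index(lock):] + [lock]
        if lock in done:
            return None
        visiting.append(lock)
        for nxt in graph[lock]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(lock)
        return None

    for lock in list(graph):
        cycle = dfs(lock)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(REPORTED + ASSUMED))
    # Dropping the siglock -> console_owner edge models the effect of
    # not calling printk under siglock on this one path.
    print(find_cycle(REPORTED[1:] + ASSUMED))
```

In this model the cycle only closes when all three orderings exist, so
removing any one edge breaks it.  The patch removes the
siglock -> console_owner edge for this one allocation path, and my
concern is with doing it at that edge.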

I tried to read the current kernel and verify this deadlock to see if I
could suggest a better place to change the code to fix it, say,
modifying task_work_add to not take siglock.  However, the current
task_work_add does not take siglock.  I encountered a few other
significant function differences as well.  One is that
io_poll_double_wake no longer exists.

I think the amba-pl011.c driver may differ even more.

Can you reproduce this on current kernels?

Reading the code I think this is already fixed.

Perhaps you want to read the code of the affected subsystems and pick
some appropriate changes to backport to 5.10?

Eric