Date:   Fri, 26 Aug 2022 16:23:05 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Ye Weihua <yeweihua4@...wei.com>
Cc:     <keescook@...omium.org>, <oleg@...hat.com>, <tglx@...utronix.de>,
        <chang.seok.bae@...el.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] signal: fix deadlock caused by calling printk() under
 sighand->siglock

Ye Weihua <yeweihua4@...wei.com> writes:

> __send_signal_locked() invokes __sigqueue_alloc() which may invoke a
> normal printk() to print a failure message. This can cause a deadlock in
> the scenario reported by syz-bot below (tested on 5.10):
>
> 	CPU0				CPU1
> 	----				----
> 	lock(&sighand->siglock);
> 					lock(&tty->read_wait);
> 					lock(&sighand->siglock);
> 	lock(console_owner);
>
> This patch specifies __GFP_NOWARN for __sigqueue_alloc(), so that printk
> will not be called, and this deadlock problem can be avoided.

While the patch below will in theory fix the reported deadlock, I don't
think it is a good choice of fix.  As a rule we want to allow printk to
be callable in as many places as possible, so that it can be used for
debugging.  There are enough places that take siglock that outlawing
printk under siglock will make the kernel unstable.
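
For readers following the quoted lock diagram, here is a minimal user-space
sketch of why the reported ordering can deadlock and why removing the printk
call under siglock breaks the cycle.  It is not kernel code.  The names
lock_graph, add_dependency, has_cycle and reported_graph are hypothetical,
and the console_owner -> tty->read_wait edge is an assumption standing in
for the console driver path that the quoted diagram does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the quoted report; the numbering is arbitrary. */
enum lock_class { SIGLOCK, TTY_READ_WAIT, CONSOLE_OWNER, NUM_LOCKS };

/* edge[a][b] is true when some code path takes b while holding a. */
struct lock_graph {
	bool edge[NUM_LOCKS][NUM_LOCKS];
};

static void add_dependency(struct lock_graph *g, enum lock_class held,
			   enum lock_class wanted)
{
	assert(held < NUM_LOCKS && wanted < NUM_LOCKS);
	g->edge[held][wanted] = true;
}

/* Depth-first search: is target reachable from cur along recorded edges? */
static bool reaches(const struct lock_graph *g, int cur, int target,
		    bool *seen)
{
	if (cur == target)
		return true;
	if (seen[cur])
		return false;
	seen[cur] = true;
	for (int next = 0; next < NUM_LOCKS; next++)
		if (g->edge[cur][next] && reaches(g, next, target, seen))
			return true;
	return false;
}

/* A potential deadlock exists if any edge a -> b can lead back to a. */
static bool has_cycle(const struct lock_graph *g)
{
	for (int a = 0; a < NUM_LOCKS; a++)
		for (int b = 0; b < NUM_LOCKS; b++) {
			bool seen[NUM_LOCKS] = { false };

			if (g->edge[a][b] && reaches(g, b, a, seen))
				return true;
		}
	return false;
}

/* Build the dependency graph from the report, with or without printk. */
static struct lock_graph reported_graph(bool alloc_may_printk)
{
	struct lock_graph g;

	memset(&g, 0, sizeof(g));
	/* CPU1 in the diagram: siglock taken under tty->read_wait. */
	add_dependency(&g, TTY_READ_WAIT, SIGLOCK);
	/* ASSUMPTION: the console path reaches tty->read_wait. */
	add_dependency(&g, CONSOLE_OWNER, TTY_READ_WAIT);
	/* CPU0 in the diagram: printk (console_owner) under siglock. */
	if (alloc_may_printk)
		add_dependency(&g, SIGLOCK, CONSOLE_OWNER);
	return g;
}
```

Under these assumptions, has_cycle() reports a cycle for
reported_graph(true) and none for reported_graph(false).  Dropping any one
of the three edges breaks the cycle, which is why changing one of the other
code paths is also a candidate fix.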

I tried to read the current kernel and verify this deadlock, to see if I
could suggest a better place to change the code to fix it, say by
modifying task_work_add to not take siglock.  However, the current
task_work_add already does not take siglock.  I encountered a few other
significant function differences as well.  One significant difference is
that io_poll_double_wake no longer exists.

I think the amba-pl011.c driver might differ even more.

Can you reproduce this on current kernels?

Reading the code I think this is already fixed.

Perhaps you want to read the code of the affected subsystems and pick
some appropriate changes to backport to 5.10?

Eric
