linux-kernel - Re: possible deadlock in send

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201105062351.GA2840779@boqun-archlinux>
Date:   Thu, 5 Nov 2020 14:23:51 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     syzbot <syzbot+c5e32344981ad9f33750@...kaller.appspotmail.com>
Cc:     bfields@...ldses.org, jlayton@...nel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        mingo@...hat.com, peterz@...radead.org,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk,
        will@...nel.org
Subject: Re: possible deadlock in send_sigurg (2)

Hi,

On Wed, Nov 04, 2020 at 04:18:08AM -0800, syzbot wrote:
> syzbot has bisected this issue to:
> 
> commit e918188611f073063415f40fae568fa4d86d9044
> Author: Boqun Feng <boqun.feng@...il.com>
> Date:   Fri Aug 7 07:42:20 2020 +0000
> 
>     locking: More accurate annotations for read_lock()
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14142732500000
> start commit:   4ef8451b Merge tag 'perf-tools-for-v5.10-2020-11-03' of gi..
> git tree:       upstream
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=16142732500000
> console output: https://syzkaller.appspot.com/x/log.txt?x=12142732500000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61033507391c77ff
> dashboard link: https://syzkaller.appspot.com/bug?extid=c5e32344981ad9f33750
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15197862500000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13c59f6c500000
> 
> Reported-by: syzbot+c5e32344981ad9f33750@...kaller.appspotmail.com
> Fixes: e918188611f0 ("locking: More accurate annotations for read_lock()")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Thanks for reporting this, and this is actually a deadlock potential
detected by the newly added recursive read deadlock detection as my
analysis:

	https://lore.kernel.org/lkml/20200910071523.GF7922@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net

Besides, other reports[1][2] are caused by the same problem. I made a
fix for this, please have a try and see if it's get fixed.

Regards,
Boqun

[1]: https://lore.kernel.org/lkml/000000000000d7136005aee14bf9@google.com
[2]: https://lore.kernel.org/lkml/0000000000006e29ed05b3009b04@google.com

----------------------------------------------------->8
>From 7fbe730fcff2d7909be034cf6dc8bf0604d0bf14 Mon Sep 17 00:00:00 2001
From: Boqun Feng <boqun.feng@...il.com>
Date: Thu, 5 Nov 2020 14:02:57 +0800
Subject: [PATCH] fs/fcntl: Fix potential deadlock in send_sig{io, urg}()

Syzbot reports a potential deadlock found by the newly added recursive
read deadlock detection in lockdep:

[...] ========================================================
[...] WARNING: possible irq lock inversion dependency detected
[...] 5.9.0-rc2-syzkaller #0 Not tainted
[...] --------------------------------------------------------
[...] syz-executor.1/10214 just changed the state of lock:
[...] ffff88811f506338 (&f->f_owner.lock){.+..}-{2:2}, at: send_sigurg+0x1d/0x200
[...] but this lock was taken by another, HARDIRQ-safe lock in the past:
[...]  (&dev->event_lock){-...}-{2:2}
[...]
[...]
[...] and interrupts could create inverse lock ordering between them.
[...]
[...]
[...] other info that might help us debug this:
[...] Chain exists of:
[...]   &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
[...]
[...]  Possible interrupt unsafe locking scenario:
[...]
[...]        CPU0                    CPU1
[...]        ----                    ----
[...]   lock(&f->f_owner.lock);
[...]                                local_irq_disable();
[...]                                lock(&dev->event_lock);
[...]                                lock(&new->fa_lock);
[...]   <Interrupt>
[...]     lock(&dev->event_lock);
[...]
[...]  *** DEADLOCK ***

The corresponding deadlock case is as followed:

	CPU 0		CPU 1		CPU 2
	read_lock(&fown->lock);
			spin_lock_irqsave(&dev->event_lock, ...)
					write_lock_irq(&filp->f_owner.lock); // wait for the lock
			read_lock(&fown-lock); // have to wait until the writer release
					       // due to the fairness
	<interrupted>
	spin_lock_irqsave(&dev->event_lock); // wait for the lock

The lock dependency on CPU 1 happens if there exists a call sequence:

	input_inject_event():
	  spin_lock_irqsave(&dev->event_lock,...);
	  input_handle_event():
	    input_pass_values():
	      input_to_handler():
	        handler->event(): // evdev_event()
	          evdev_pass_values():
	            spin_lock(&client->buffer_lock);
	            __pass_event():
	              kill_fasync():
	                kill_fasync_rcu():
	                  read_lock(&fa->fa_lock);
	                  send_sigio():
	                    read_lock(&fown->lock);

To fix this, make the reader in send_sigurg() and send_sigio() use
read_lock_irqsave() and read_lock_irqrestore().

Reported-by: syzbot+22e87cdf94021b984aa6@...kaller.appspotmail.com
Reported-by: syzbot+c5e32344981ad9f33750@...kaller.appspotmail.com
Signed-off-by: Boqun Feng <boqun.feng@...il.com>
---
 fs/fcntl.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 19ac5baad50f..05b36b28f2e8 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -781,9 +781,10 @@ void send_sigio(struct fown_struct *fown, int fd, int band)
 {
 	struct task_struct *p;
 	enum pid_type type;
+	unsigned long flags;
 	struct pid *pid;
 	
-	read_lock(&fown->lock);
+	read_lock_irqsave(&fown->lock, flags);
 
 	type = fown->pid_type;
 	pid = fown->pid;
@@ -804,7 +805,7 @@ void send_sigio(struct fown_struct *fown, int fd, int band)
 		read_unlock(&tasklist_lock);
 	}
  out_unlock_fown:
-	read_unlock(&fown->lock);
+	read_unlock_irqrestore(&fown->lock, flags);
 }
 
 static void send_sigurg_to_task(struct task_struct *p,
@@ -819,9 +820,10 @@ int send_sigurg(struct fown_struct *fown)
 	struct task_struct *p;
 	enum pid_type type;
 	struct pid *pid;
+	unsigned long flags;
 	int ret = 0;
 	
-	read_lock(&fown->lock);
+	read_lock_irqsave(&fown->lock, flags);
 
 	type = fown->pid_type;
 	pid = fown->pid;
@@ -844,7 +846,7 @@ int send_sigurg(struct fown_struct *fown)
 		read_unlock(&tasklist_lock);
 	}
  out_unlock_fown:
-	read_unlock(&fown->lock);
+	read_unlock_irqrestore(&fown->lock, flags);
 	return ret;
 }
 
-- 
2.28.0