Message-Id: <20240428232302.4035-1-hdanton@sina.com>
Date: Mon, 29 Apr 2024 07:23:02 +0800
From: Hillf Danton <hdanton@...a.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: syzbot <syzbot+83e7f982ca045ab4405c@...kaller.appspotmail.com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
andrii@...nel.org,
bpf@...r.kernel.org,
linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [bpf?] [trace?] possible deadlock in force_sig_info_to_task
On Sun, 28 Apr 2024 13:01:19 -0700 Linus Torvalds wrote:
> On Sat, 27 Apr 2024 at 16:13, Hillf Danton <hdanton@...a.com> wrote:
> >
> > > -> #0 (&sighand->siglock){....}-{2:2}:
> > > check_prev_add kernel/locking/lockdep.c:3134 [inline]
> > > check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> > > validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> > > __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> > > lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > > __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> > > _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
> > > force_sig_info_to_task+0x68/0x580 kernel/signal.c:1334
> > > force_sig_fault_to_task kernel/signal.c:1733 [inline]
> > > force_sig_fault+0x12c/0x1d0 kernel/signal.c:1738
> > > __bad_area_nosemaphore+0x127/0x780 arch/x86/mm/fault.c:814
> > > handle_page_fault arch/x86/mm/fault.c:1505 [inline]
> >
> > Given page fault with runqueue locked, bpf makes trouble instead of
> > helping anything in this case.
>
> That's not the odd thing here.
>
> Look, the callchain is:
>
> > > exc_page_fault+0x612/0x8e0 arch/x86/mm/fault.c:1563
> > > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > > rep_movs_alternative+0x22/0x70 arch/x86/lib/copy_user_64.S:48
> > > copy_user_generic arch/x86/include/asm/uaccess_64.h:110 [inline]
> > > raw_copy_from_user arch/x86/include/asm/uaccess_64.h:125 [inline]
> > > __copy_from_user_inatomic include/linux/uaccess.h:87 [inline]
> > > copy_from_user_nofault+0xbc/0x150 mm/maccess.c:125
>
> IOW, this is all doing a copy from user with page faults disabled, and
> it shouldn't have caused a signal to be sent, so the whole
> __bad_area_nosemaphore -> force_sig_fault path is bad.
>
So is a game like copying from/putting to user with the runqueue locked
in the first place.
Plus, as per another syzbot report [1], bpf could also make trouble with
the workqueue pool locked.
[1] https://lore.kernel.org/lkml/00000000000051348606171f61a1@google.com/