Message-ID: <CAHk-=wjBvNvVggy14p9rkHA8W1ZVfoKXvW0oeX5NZWxWUv8gfQ@mail.gmail.com>
Date: Sun, 28 Apr 2024 13:01:19 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Hillf Danton <hdanton@...a.com>
Cc: syzbot <syzbot+83e7f982ca045ab4405c@...kaller.appspotmail.com>,
andrii@...nel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [bpf?] [trace?] possible deadlock in force_sig_info_to_task
On Sat, 27 Apr 2024 at 16:13, Hillf Danton <hdanton@...a.com> wrote:
>
> > -> #0 (&sighand->siglock){....}-{2:2}:
> > check_prev_add kernel/locking/lockdep.c:3134 [inline]
> > check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> > validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> > __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> > lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> > _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
> > force_sig_info_to_task+0x68/0x580 kernel/signal.c:1334
> > force_sig_fault_to_task kernel/signal.c:1733 [inline]
> > force_sig_fault+0x12c/0x1d0 kernel/signal.c:1738
> > __bad_area_nosemaphore+0x127/0x780 arch/x86/mm/fault.c:814
> > handle_page_fault arch/x86/mm/fault.c:1505 [inline]
>
> Given page fault with runqueue locked, bpf makes trouble instead of
> helping anything in this case.
That's not the odd thing here.
Look, the callchain is:
> > exc_page_fault+0x612/0x8e0 arch/x86/mm/fault.c:1563
> > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > rep_movs_alternative+0x22/0x70 arch/x86/lib/copy_user_64.S:48
> > copy_user_generic arch/x86/include/asm/uaccess_64.h:110 [inline]
> > raw_copy_from_user arch/x86/include/asm/uaccess_64.h:125 [inline]
> > __copy_from_user_inatomic include/linux/uaccess.h:87 [inline]
> > copy_from_user_nofault+0xbc/0x150 mm/maccess.c:125
IOW, this is all doing a copy from user with page faults disabled, and
it shouldn't have caused a signal to be sent, so the whole
__bad_area_nosemaphore -> force_sig_fault path is bad.
The *problem* here is that the page fault doesn't actually happen on a
user access, it happens on the *ret* instruction in
rep_movs_alternative itself (which doesn't have an exception fixup,
obviously, because no exception is supposed to happen there!):
RIP: 0010:rep_movs_alternative+0x22/0x70 arch/x86/lib/copy_user_64.S:50
Code: 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 83 f9 40 73 40 83 f9 08
73 21 85 c9 74 0f 8a 06 88 07 48 ff c7 48 ff c6 48 ff c9 75 f1 <c3> cc
cc cc cc 66 0f 1f 84 00 00 0$
RSP: 0000:ffffc90004137468 EFLAGS: 00050002
RAX: ffffffff8205ce4e RBX: dffffc0000000000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000900 RDI: ffffc900041374e8
RBP: ffff88802d039784 R08: 0000000000000005 R09: ffffffff8205ce37
R10: 0000000000000003 R11: ffff88802d038000 R12: 1ffff11005a072f0
R13: 0000000000000900 R14: 0000000000000002 R15: ffffc900041374e8
where decoding that "Code:" line gives this:
   0:   f3 0f 1e fa       endbr64
   4:   48 83 f9 40       cmp    $0x40,%rcx
   8:   73 40             jae    0x4a
   a:   83 f9 08          cmp    $0x8,%ecx
   d:   73 21             jae    0x30
   f:   85 c9             test   %ecx,%ecx
  11:   74 0f             je     0x22
  13:   8a 06             mov    (%rsi),%al
  15:   88 07             mov    %al,(%rdi)
  17:   48 ff c7          inc    %rdi
  1a:   48 ff c6          inc    %rsi
  1d:   48 ff c9          dec    %rcx
  20:   75 f1             jne    0x13
  22:*  c3                ret             <-- trapping instruction
but I have no idea why the 'ret' instruction would take a page fault.
It really shouldn't.
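As a cross-check on the decode above, here is a minimal Python sketch
(an illustration only, not the kernel's own tooling) that finds the
byte marked with <..> in the "Code:" line and computes its offset from
the start of the function. It assumes the function begins at the
endbr64 bytes (f3 0f 1e fa) and that the 0x90 bytes before it are
padding; the names CODE, ENDBR64 and locate_trap are made up for this
example, and the truncated tail of the dump is left out.

```python
import re

# Bytes from the "Code:" line of the report, joined into one string.
# The final truncated fragment of the dump is intentionally omitted.
CODE = (
    "90 90 90 90 90 90 90 90 f3 0f 1e fa 48 83 f9 40 73 40 83 f9 08 "
    "73 21 85 c9 74 0f 8a 06 88 07 48 ff c7 48 ff c6 48 ff c9 75 f1 <c3> cc "
    "cc cc cc 66 0f 1f 84 00 00"
)

# Encoding of endbr64; assumed here to mark the function entry point.
ENDBR64 = bytes.fromhex("f30f1efa")


def locate_trap(code_line):
    """Return (raw_bytes, index of the <..>-marked byte within them)."""
    raw = bytearray()
    trap_index = None
    for token in code_line.split():
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if marked:
            # The oops output wraps the byte at the faulting RIP in <>.
            trap_index = len(raw)
            token = marked.group(1)
        raw.append(int(token, 16))
    if trap_index is None:
        raise ValueError("no <..> marker found in Code: line")
    return bytes(raw), trap_index


raw, trap_index = locate_trap(CODE)
func_start = raw.find(ENDBR64)      # index of endbr64 in the dump
offset = trap_index - func_start    # offset of the trap from function start

print(f"trapping byte: {raw[trap_index]:#04x}")
print(f"offset from endbr64: {offset:#x}")
```

Under those assumptions the marked byte is 0xc3 at offset 0x22 from the
endbr64, which agrees with both the "rep_movs_alternative+0x22" in the
RIP line and the "22:*" line of the decode.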
Now, it's not like 'ret' instructions can't take page faults, but it
sure shouldn't happen in the *kernel*. The reasons for page faults on
'ret' instructions are:
- the instruction itself takes a page fault
- the stack pointer is bogus
- possibly because the stack *contents* are bogus (at least some x86
instructions that jump will check the destination in the jump
instruction itself, although I didn't think 'ret' was one of them)
but for the kernel, none of these actually seem to be the case
normally. And even abnormally I don't see this being an issue, since
the exception backtrace is happily shown (ie the stack looks all
good).
So this dump is just *WEIRD*.
End result: the problem is not about any kind of deadlock on circular
locking. That's just the symptom of that odd page fault, which shouldn't
have happened, and I don't quite see how it did.
Linus