Message-ID: <ZB6be+/X41JNG6dX@dragonet>
Date: Sat, 25 Mar 2023 15:58:03 +0900
From: "Dae R. Jeong" <threeearcat@...il.com>
To: Nadav Amit <namit@...are.com>
Cc: Vishnu Dasa <vdasa@...are.com>, Pv-drivers <Pv-drivers@...are.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"arnd@...db.de" <arnd@...db.de>,
Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [Pv-drivers] general protection fault in vmci_host_poll
On Wed, Aug 10, 2022 at 06:36:02PM +0000, Nadav Amit wrote:
> >> - Crash report:
> >> general protection fault, probably for non-canonical address 0xdffffc000000000b: 0000 [#1] PREEMPT SMP KASAN
> >> KASAN: null-ptr-deref in range [0x0000000000000058-0x000000000000005f]
> >> Call Trace:
> >> <TASK>
> >> lock_acquire+0x1a4/0x4a0 kernel/locking/lockdep.c:5672
> >> __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> >> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> >> spin_lock include/linux/spinlock.h:349 [inline]
> >> vmci_host_poll+0x16b/0x2b0 drivers/misc/vmw_vmci/vmci_host.c:177
> >> vfs_poll include/linux/poll.h:88 [inline]
> >> do_pollfd fs/select.c:873 [inline]
> >> do_poll fs/select.c:921 [inline]
> >> do_sys_poll+0xc7c/0x1aa0 fs/select.c:1015
> >> __do_sys_ppoll fs/select.c:1121 [inline]
> >> __se_sys_ppoll+0x2cc/0x330 fs/select.c:1101
> >> do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> >> do_syscall_64+0x4e/0xa0 arch/x86/entry/common.c:82
>
> Not my module, so just sharing my 2 cents:
>
> It seems that this is a bug that is related to interaction between different
> debugging features, and it might not be related to VMCI. IIUC, KASAN is
> yelling at lock-dependency checker.
>
> The code that the failure points to is the entry to the lock_release(),
> which raises the question whether additional debug features were enabled
> during the failure, specifically ftrace function tracer or kprobes.
>
Hello,
This crash keeps occurring in our fuzzing environment, and we looked
into it. It seems to me that it is caused by a race condition as
follows:
CPU1                                       CPU2
vmci_host_poll                             vmci_host_do_init_context
-----                                      -----
// Read uninitialized context
context = vmci_host_dev->context;
                                           // Initialize context
                                           vmci_host_dev->context = vmci_ctx_create();
                                           vmci_host_dev->ct_type = VMCIOBJ_CONTEXT;
if (vmci_host_dev->ct_type == VMCIOBJ_CONTEXT) {
    // Dereferencing the wrong pointer
    poll_wait(..., &context->host_context);
}
I think reading `context` only after checking `ct_type` in
vmci_host_poll() should be enough to prevent this; a rough sketch is
below. Could you check this?
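
To make that concrete, something along these lines is what I have in
mind for vmci_host_poll(). This is untested, and the wait-queue/lock
field names and the elided pending-work check are from my reading of
drivers/misc/vmw_vmci/vmci_host.c, so please double-check them:

static __poll_t vmci_host_poll(struct file *filp, poll_table *wait)
{
	struct vmci_host_dev *vmci_host_dev = filp->private_data;
	struct vmci_ctx *context;
	__poll_t mask = 0;

	if (vmci_host_dev->ct_type == VMCIOBJ_CONTEXT) {
		/*
		 * Read the context only after ct_type has been checked.
		 * vmci_host_do_init_context() stores the context before
		 * setting ct_type, so once ct_type == VMCIOBJ_CONTEXT is
		 * observed the context pointer should no longer be the
		 * stale, uninitialized value seen in the crash.
		 */
		context = vmci_host_dev->context;

		/* Check for VMCI calls to this VM context. */
		if (wait)
			poll_wait(filp, &context->host_context.wait_queue,
				  wait);

		spin_lock(&context->lock);
		/* ... check pending datagrams/doorbells and set mask ... */
		spin_unlock(&context->lock);
	}
	return mask;
}

Strictly speaking, a compiler/CPU reordering concern may remain without
READ_ONCE()/barriers on the two fields, but reordering the reads should
at least close the window our fuzzer is hitting.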
Best regards,
Dae R. Jeong