Message-ID: <ZrEgAOL9XrhlSPwr@slm.duckdns.org>
Date: Mon, 5 Aug 2024 08:54:56 -1000
From: Tejun Heo <tj@...nel.org>
To: Jens Axboe <axboe@...nel.dk>
Cc: syzbot <syzbot+b3e4f2f51ed645fd5df2@...kaller.appspotmail.com>,
asml.silence@...il.com, io-uring@...r.kernel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [syzbot] [io-uring?] KCSAN: data-race in __flush_work /
__flush_work (2)
Hello,
On Mon, Aug 05, 2024 at 08:23:28AM -0600, Jens Axboe wrote:
> > read to 0xffff8881223aa3e8 of 8 bytes by task 50 on cpu 1:
> > __flush_work+0x42a/0x570 kernel/workqueue.c:4188
> > flush_work kernel/workqueue.c:4229 [inline]
> > flush_delayed_work+0x66/0x70 kernel/workqueue.c:4251
> > io_uring_try_cancel_requests+0x35b/0x370 io_uring/io_uring.c:3000
> > io_ring_exit_work+0x148/0x500 io_uring/io_uring.c:2779
> > process_one_work kernel/workqueue.c:3231 [inline]
> > process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3312
> > worker_thread+0x526/0x700 kernel/workqueue.c:3390
> > kthread+0x1d1/0x210 kernel/kthread.c:389
> > ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
The offending line is:
	/*
	 * start_flush_work() returned %true. If @from_cancel is set, we know
	 * that @work must have been executing during start_flush_work() and
	 * can't currently be queued. Its data must contain OFFQ bits. If @work
	 * was queued on a BH workqueue, we also know that it was running in the
	 * BH context and thus can be busy-waited.
	 */
->	data = *work_data_bits(work);
	if (from_cancel &&
	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {
The race is benign, but the code is still wrong. When @from_cancel is set,
we own the work item through its pending bit, so its data bits cannot
change. The value read into data is also only used when @from_cancel is
set. So the code is not actually broken, but the compiler can easily emit
the read before the @from_cancel test. That is what the source says anyway,
and according to the disassembly of the vmlinux in the report, it looks
like that is how the compiler generated the code.
So, it's benign in that the value read won't be used if !@from_cancel, and
the data race only exists when !@from_cancel. The code is wrong in that it
can easily generate the spurious racy read. I'll fix it.
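
To illustrate the point, here is a minimal userspace sketch of the two
shapes. It is not the kernel code and not necessarily the fix that will be
applied: struct sketch_work, the SKETCH_* bit values and both function names
are hypothetical stand-ins, and WARN_ON_ONCE() is reduced to a plain test.
The only idea it shows is moving the load of the data word under the
@from_cancel check, so the word is read only when the caller owns the work
item.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; values are illustrative. */
#define SKETCH_STRUCT_PWQ (1UL << 2)
#define SKETCH_OFFQ_BH    (1UL << 4)

/* Hypothetical stand-in for struct work_struct: just the data word. */
struct sketch_work {
	unsigned long data;
};

/*
 * Shape of the reported code: the data word is loaded unconditionally,
 * so the load also happens when !from_cancel, where the caller does not
 * own the work item and the word may be changing concurrently.
 */
static bool can_busy_wait_racy(const struct sketch_work *work, bool from_cancel)
{
	unsigned long data = work->data;	/* spurious read if !from_cancel */

	return from_cancel &&
	       !(data & SKETCH_STRUCT_PWQ) && (data & SKETCH_OFFQ_BH);
}

/*
 * Reshaped: the load sits inside the from_cancel branch, so the source
 * no longer asks for a read in the case where nothing protects the word.
 */
static bool can_busy_wait(const struct sketch_work *work, bool from_cancel)
{
	if (from_cancel) {
		unsigned long data = work->data;

		return !(data & SKETCH_STRUCT_PWQ) && (data & SKETCH_OFFQ_BH);
	}
	return false;
}
```

Both functions return the same result for every input; they differ only in
whether the data word is touched when from_cancel is false.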
Thanks.
--
tejun