Message-ID: <CAG48ez2_u9F5YX9TNyLACyzyGq=sK8YTH+nOdPe4f4DZumLYNg@mail.gmail.com>
Date: Tue, 13 May 2025 18:42:41 +0200
From: Jann Horn <jannh@...gle.com>
To: Marco Elver <elver@...gle.com>
Cc: syzkaller <syzkaller@...glegroups.com>,
syzbot <syzbot+189d4742d07e937d68ea@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, baolin.wang@...ux.alibaba.com, hughd@...gle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [mm?] KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended
On Mon, May 12, 2025 at 10:52 PM Marco Elver <elver@...gle.com> wrote:
> On Mon, 12 May 2025 at 20:33, 'Jann Horn' via syzkaller-bugs
> <syzkaller-bugs@...glegroups.com> wrote:
> >
> > On Mon, May 12, 2025 at 7:44 PM Jann Horn <jannh@...gle.com> wrote:
> > > On Tue, May 6, 2025 at 9:52 AM syzbot
> > > <syzbot+189d4742d07e937d68ea@...kaller.appspotmail.com> wrote:
> > > > HEAD commit: 01f95500a162 Merge tag 'uml-for-linux-6.15-rc6' of git://g..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17abbb68580000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=189d4742d07e937d68ea
> > > > compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
> > > [...]
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+189d4742d07e937d68ea@...kaller.appspotmail.com
> > > >
> > > > ==================================================================
> > > > BUG: KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended
> > >
> > > I think this is a problem with the KCSAN implementation.
> > >
> > > This is a race between writing to a userspace-owned page and reading
> > > from a userspace-owned page.
> > >
> > > This kind of pattern should be fairly trivial to trigger: If userspace
> > > tells the kernel to read from a GUP'd page or pagecache on one thread,
> > > and simultaneously tells the kernel to write to the same page on
> > > another thread, we'll get a data race. This is not really a kernel
> > > data race; it is more like a userspace race whose memory accesses
> > > happen to go through the kernel.
> > >
> > > So I think the fix would be for KCSAN to ignore anything in such
> > > pages. The hard part is, I'm not sure how to tell what kind of page
> > > we're dealing with from the kernel, some MM people might know...
> >
> > Or alternatively, if we really do want data_race() operations around
> > any memset() or memcpy() on userspace-controlled pages, I guess we'd
> > have to pepper a lot of those around the kernel.
> >
> > Also, I didn't really think about some of what I wrote here - we
> > certainly wouldn't want to ignore unannotated accesses to some struct
> > located in pagecache that userspace can concurrently write to.
> >
> > Maybe it would actually make sense to do the opposite of what I said
> > to some extent, special-case userspace-mapped pages such that KCSAN
> > _always_ alerts on plain access to them...
> >
> > > distinguishing normal pagecache/anon pages from other pages might be
> > > doable, but I guess it probably gets hard when thinking about
> > > driver-allocated pages that were mapped into userspace vs
> > > driver-allocated pages that are used internally in the driver...
>
> There have been cases where user space was doing something unsafe, and
> KCSAN caught it. While technically it's user space's bug to keep,
> KCSAN is still telling us something's wrong here.
>
> In the past we'd just ignore these bugs (never release them from
> syzbot), but I think we recently changed the rules for some of these
> to be sent to the mailing list. They can safely be ignored if deemed
> "user space is doing something stupid".
>
> I do think we want to surface such issues in one-off testing
> scenarios. However, in the fuzzing/CI context it's not so helpful, so
> we might need a way to suppress them. If there's a way to tell by
> looking at the stacktrace, we could teach syzbot to ignore such data
> races entirely.

Hmm. I think that probably requires a kernel config flag then; I don't
think you can easily filter by stacktrace. In fuzzing builds you could
maybe do some basic checks on the folio to see if it's pagecache, an
anon folio, or a folio mapped into userspace... that would filter out
_most_ but not all cases.
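
To make the folio checks suggested above a bit more concrete, here is a rough userspace model of the filtering decision. It is only a sketch: struct mock_folio, its fields, and report_is_probably_user_race() are hypothetical names invented for illustration, not kernel or KCSAN APIs, and a real implementation would have to query the actual folio state instead.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the folio properties a fuzzing-only filter
 * might look at. None of these names exist in the kernel.
 */
struct mock_folio {
	bool is_pagecache;	/* folio belongs to a file's page cache */
	bool is_anon;		/* folio is anonymous memory */
	bool mapped_to_user;	/* folio is currently mapped into userspace */
};

/*
 * Returns true if a data-race report touching this folio is likely just
 * a userspace race routed through the kernel, so a fuzzing build could
 * choose to suppress it. This is a heuristic, not a proof.
 */
static bool report_is_probably_user_race(const struct mock_folio *folio)
{
	return folio->is_pagecache || folio->is_anon || folio->mapped_to_user;
}
```

A driver-allocated page that has not been mapped into userspace yet (or was mapped only earlier) sets none of these bits in this model and would still be reported, which is one way to see why such a check would catch most but not all cases.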