Message-ID: <202208121143.C014AEF0AA@keescook>
Date:   Fri, 12 Aug 2022 11:44:20 -0700
From:   Kees Cook <keescook@...omium.org>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Ira Weiny <ira.weiny@...el.com>,
        "Fabio M. De Francesco" <fmdefrancesco@...il.com>,
        ebiederm@...ssion.com, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-next@...r.kernel.org, sfr@...b.auug.org.au,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk,
        syzbot <syzbot+3250d9c8925ef29e975f@...kaller.appspotmail.com>
Subject: Re: [syzbot] linux-next boot error: BUG: unable to handle kernel
 paging request in kernel_execve
On Fri, Aug 12, 2022 at 11:29:44AM +0200, Dmitry Vyukov wrote:
> On Fri, 12 Aug 2022 at 02:11, Ira Weiny <ira.weiny@...el.com> wrote:
> >
> > On Thu, Aug 11, 2022 at 02:00:59PM -0700, Kees Cook wrote:
> > > On Thu, Aug 11, 2022 at 11:51:34AM -0700, Ira Weiny wrote:
> > > > On Thu, Aug 11, 2022 at 10:39:29AM -0700, Ira wrote:
> > > > > On Thu, Aug 11, 2022 at 08:33:16AM -0700, Kees Cook wrote:
> > > > > > Hi Fabio,
> > > > > >
> > > > > > It seems likely that the kmap change[1] might be causing this crash. Is
> > > > > > there a boot-time setup race between kmap being available and early umh
> > > > > > usage?
> > > > >
> > > > > I don't see how this is a setup problem with the config reported here.
> > > > >
> > > > > CONFIG_64BIT=y
> > > > >
> > > > > ...and HIGHMEM is not set.
> > > > > ...and PREEMPT_RT is not set.
> > > > >
> > > > > So the kmap_local_page() call in that stack should be a page_address() only.
> > > > >
> > > > > I think the issue must be some sort of race which was being prevented because
> > > > > of the preemption and/or pagefault disable built into kmap_atomic().
> > > > >
> > > > > > Is this reproducible?
> > > > >
> > > > > The hunk below will surely fix it but I think the pagefault_disable() is
> > > > > the only thing that is required.  It would be nice to test it.
> > > >
> > > > Fabio and I discussed this.  And he also mentioned that pagefault_disable() is
> > > > all that is required.
> > >
> > > Okay, sounds good.
> > >
> > > > Do we have a way to test this?
> > >
> > > It doesn't look like syzbot has a reproducer yet, so its patch testing
> > > system[1] will not work. But if you can send me a patch, I could land it
> > > in -next and we could see if the reproduction frequency drops to zero.
> > > (Looking at the dashboard, it's seen 2 crashes, most recently 8 hours
> > > ago.)
> >
> > Patch sent.
> >
> > https://lore.kernel.org/lkml/20220812000919.408614-1-ira.weiny@intel.com/

Thank you!

> >
> > But I'm more confused after looking at this again.
> 
> There was a splat of random crashes in linux-next that happened at the same time:
> 
> https://groups.google.com/g/syzkaller-bugs/search?q=%22linux-next%20boot%20error%3A%22
> 
> There are 10 different crashes in completely random places.
> I would assume they have the same root cause, some silent memory
> corruption or something similar.

Yeah, I noticed the crashes stopped "on their own", so I think I'll
wait a bit more, and if they start back up, we can try Ira's patch, though
I'd agree with the assessment that it looks like it shouldn't be needed.

-Kees
-- 
Kees Cook