Message-ID: <CAK8P3a1cjw419WZ=B5oPs7z4_6b6mxRwMjSJB0Q1eh5TpQoT9g@mail.gmail.com>
Date: Wed, 17 Mar 2021 09:31:47 +0100
From: Arnd Bergmann <arnd@...db.de>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Russell King - ARM Linux admin <linux@...linux.org.uk>,
syzbot <syzbot+0b06ef9b44d00d600183@...kaller.appspotmail.com>,
Linus Walleij <linus.walleij@...aro.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Uwe Kleine-König
<u.kleine-koenig@...gutronix.de>
Subject: Re: [syzbot] kernel panic: corrupted stack end in openat
On Wed, Mar 17, 2021 at 8:52 AM Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Tue, Mar 16, 2021 at 5:28 PM Arnd Bergmann <arnd@...db.de> wrote:
> > On Tue, Mar 16, 2021 at 5:13 PM Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > > On Tue, Mar 16, 2021 at 5:03 PM Arnd Bergmann <arnd@...db.de> wrote:
> > > > On Tue, Mar 16, 2021 at 4:51 PM Russell King - ARM Linux admin
> > > > <linux@...linux.org.uk> wrote:
> > > > > On Tue, Mar 16, 2021 at 04:44:45PM +0100, Arnd Bergmann wrote:
> > > > > > On Tue, Mar 16, 2021 at 11:17 AM Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > > > > > > The compiler is gcc version 10.2.1 20210110 (Debian 10.2.1-6)
> > > > > >
> > > > > > Ok, building with Ubuntu 10.2.1-1ubuntu1 20201207 locally, that's
> > > > > > the closest I have installed, and I think the Debian and Ubuntu versions
> > > > > > are generally quite close in case of gcc since they are maintained by
> > > > > > the same packagers.
> > > > >
> > > > > ... which shouldn't be a problem - that's just over 1/4 of the stack
> > > > > space. Could it be the syzbot's gcc is doing something weird and
> > > > > inflating the stack frames?
> > > >
> > > > It's possible, but I think that's really unlikely given that it's just Debian's
> > > > gcc, which is as close to mainline as the version I was using.
> > > >
> > > > Uwe's DEBUG_STACKOVERFLOW patch from a while ago might
> > > > help if this was the problem though:
> > > > https://lore.kernel.org/linux-arm-kernel/20200108082913.29710-1-u.kleine-koenig@pengutronix.de/
> > > >
> > > > My best guess is something going wrong in the interrupt
> > > > that triggered the preempt_schedule() which ended up calling
> > > > task_stack_end_corrupted() in schedule_debug(), as you suggested
> > > > earlier.
> > >
> > > FWIW I see slightly larger frames with the config:
> > >
> > > 8073ab64 <ima_calc_field_array_hash_tfm>:
> > > 8073ab64: e1a0c00d mov ip, sp
> > > 8073ab68: e92ddff0 push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
> > > 8073ab6c: e24cb004 sub fp, ip, #4
> > > 8073ab70: e24ddfa7 sub sp, sp, #668 ; 0x29c
> >
> > Yes, this is the one that the compiler complained about when warning
> > for stack over 600 bytes. It's not called in this call chain though.
> >
> > > page_alloc can also do reclaim, I had the impression that reclaim can
> > > be quite heavy-weight in all respects.
> >
> > Yes, that is another possibility. What writable file systems or swap
> > do you normally have mounted that it could be writing to, and on
> > what storage device?
>
> The root fs is ext4 on virtio-blk.
>
> There are also several dozens of shrinkers that can be called during reclaim:
> https://elixir.bootlin.com/linux/latest/C/ident/unregister_shrinker

Right, unfortunately I don't see a smoking gun there either, unless you are
also using NFS or devicemapper.

Implementing VMAP_STACK as you suggested earlier is probably the
best way to figure out if there is an actual overrun of the stack.

Alternatively, adding support for GCC_PLUGIN_STACKLEAK might
also help find out if we ever get close to the limit. This is probably
less work, but it might not actually help in this case.

Arnd
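
To illustrate the two kinds of checks discussed above, here is a minimal
user-space sketch, not kernel code. It contrasts an end-of-stack magic
word (the idea behind task_stack_end_corrupted()) with a low-water mark
that records how deep the stack ever got (loosely the kind of information
STACKLEAK-style tracking could provide). STACK_END_MAGIC is the value
from include/uapi/linux/magic.h; everything prefixed sim_ is hypothetical,
and the 8 KiB size is an assumption about the 32-bit ARM configuration.
The real layout (for example thread_info at the bottom of the stack) is
ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_END_MAGIC 0x57AC6E9Du /* from include/uapi/linux/magic.h */
#define SIM_STACK_WORDS 2048        /* assumed 8 KiB stack, in 32-bit words */

/* Hypothetical model of a downward-growing stack; words[0] is the stack end. */
struct sim_stack {
	uint32_t words[SIM_STACK_WORDS];
	size_t lowest_used; /* lowest word index ever written (low-water mark) */
};

static void sim_stack_init(struct sim_stack *s)
{
	for (size_t i = 0; i < SIM_STACK_WORDS; i++)
		s->words[i] = 0;
	s->words[0] = STACK_END_MAGIC;   /* canary at the very end of the stack */
	s->lowest_used = SIM_STACK_WORDS; /* nothing used yet */
}

/*
 * Simulate a function that allocates and fully writes a frame of
 * frame_bytes below sp (a word index). A frame that does not fit is
 * clamped at index 0, so it overwrites the magic word. Returns the new sp.
 */
static size_t sim_push_frame(struct sim_stack *s, size_t sp, size_t frame_bytes)
{
	size_t nwords = (frame_bytes + 3) / 4;
	size_t new_sp;

	assert(sp <= SIM_STACK_WORDS);
	new_sp = nwords > sp ? 0 : sp - nwords;
	for (size_t i = new_sp; i < sp; i++)
		s->words[i] = 0xAAAAAAAAu;
	if (new_sp < s->lowest_used)
		s->lowest_used = new_sp;
	return new_sp;
}

/* Same idea as task_stack_end_corrupted(): is the magic word still intact? */
static bool sim_stack_end_corrupted(const struct sim_stack *s)
{
	return s->words[0] != STACK_END_MAGIC;
}

/* Bytes between the stack end and the deepest write seen so far. */
static size_t sim_stack_headroom_bytes(const struct sim_stack *s)
{
	return s->lowest_used * sizeof(uint32_t);
}
```

Pushing one 712-byte frame (the 668 bytes of locals quoted above plus 11
saved registers) leaves the magic word intact while the low-water mark
already shows how much headroom remains. Only a frame that reaches index 0
makes sim_stack_end_corrupted() return true, which is why the magic word
alone cannot tell us how close to the limit we normally run.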