linux-kernel - Re: Regression bisected to f2f84b05e02b (bug: consolidate warn_slowpath

Message-ID: <CAEdQ38F2GP92xB2gMXTrEo-Adbbc9Cy1DWHU9yveGLzJNd2HrA@mail.gmail.com>
Date:   Thu, 11 Jun 2020 21:23:52 -0700
From:   Matt Turner <mattst88@...il.com>
To:     Kees Cook <keescook@...omium.org>
Cc:     Linux-Arch <linux-arch@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-alpha <linux-alpha@...r.kernel.org>,
        Richard Henderson <rth@...ddle.net>,
        Ivan Kokshaysky <ink@...assic.park.msu.ru>
Subject: Re: Regression bisected to f2f84b05e02b (bug: consolidate
 warn_slowpath_fmt() usage)

On Tue, Jun 2, 2020 at 11:03 AM Kees Cook <keescook@...omium.org> wrote:
>
> On Mon, Jun 01, 2020 at 07:48:04PM -0700, Matt Turner wrote:
> > I bisected a regression on alpha to f2f84b05e02b (bug: consolidate
> > warn_slowpath_fmt() usage) which looks totally innocuous.
> >
> > Reverting it on master confirms that it somehow is the trigger. At or a
> > little after starting userspace, I'll see an oops like this:
> >
> > Unable to handle kernel paging request at virtual address 0000000000000000
> > CPU 0
> > kworker/u2:5(98): Oops -1
> > pc = [<0000000000000000>]  ra = [<0000000000000000>]  ps = 0000    Not tainted
> > pc is at 0x0
>
> ^^^^ so, the instruction pointer is NULL. The only way I can imagine
> that happening would be from this line:
>
>         worker->current_func(work);
>
> > ra is at 0x0
> > v0 = 0000000000000007  t0 = 0000000000000001  t1 = 0000000000000001
> > t2 = 0000000000000000  t3 = fffffc00bfe68780  t4 = 0000000000000001
> > t5 = fffffc00bf8cc780  t6 = 00000000026f8000  t7 = fffffc00bfe70000
> > s0 = fffffc000250d310  s1 = fffffc000250d310  s2 = fffffc000250d310
> > s3 = fffffc000250ca40  s4 = fffffc000250caa0  s5 = 0000000000000000
> > s6 = fffffc000250ca40
> > a0 = fffffc00024f0488  a1 = fffffc00bfe73d98  a2 = fffffc00bfe68800
> > a3 = fffffc00bf881400  a4 = 0001000000000000  a5 = 0000000000000002
> > t8 = 0000000000000000  t9 = 0000000000000000  t10= 0000000001321800
> > t11= 000000000000ba4e  pv = fffffc000189ca00  at = 0000000000000000
> > gp = fffffc000253e430  sp = 0000000043a83c2e
> > Disabling lock debugging due to kernel taint
> > Trace:
> > [<fffffc000105c8ac>] process_one_work+0x25c/0x5a0
>
> Can you verify where this     ^^^^^^^^^^^^^^   is?

It is kernel/workqueue.c:2268, which contains

        worker->current_func(work);

as you predicted.
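
For readers following along, here is a minimal user-space sketch of why a NULL work function produces "pc is at 0x0". It is not the kernel's code: the demo_* names are hypothetical stand-ins, and the NULL guard is added only so the example can report the condition instead of crashing. An unguarded indirect call through the same pointer would jump to address 0, which is what the oops shows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's real definitions. */
struct demo_work;
typedef void (*demo_work_func_t)(struct demo_work *work);

struct demo_work {
    demo_work_func_t func;  /* callback the worker is expected to invoke */
    int runs;               /* how many times the callback has run */
};

/* Example callback: just counts invocations. */
static void demo_handler(struct demo_work *work)
{
    work->runs++;
}

/*
 * Dispatch one work item. Returns 0 if the callback ran, -1 if the
 * function pointer was NULL. Without the check, the indirect call
 * would transfer control to address 0.
 */
static int demo_process_one_work(struct demo_work *work)
{
    demo_work_func_t fn = work->func;  /* snapshot the pointer once */

    if (fn == NULL)
        return -1;
    fn(work);
    return 0;
}
```

The snapshot into a local before the call is an illustration choice: it makes it easy to see that the failure depends on the pointer value at the moment of the call, not on what the work item holds later.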

> > [<fffffc000105cc4c>] worker_thread+0x5c/0x7d0
> > [<fffffc0001066c88>] kthread+0x188/0x1f0
> > [<fffffc0001011b48>] ret_from_kernel_thread+0x18/0x20
> > [<fffffc0001066b00>] kthread+0x0/0x1f0
> > [<fffffc000105cbf0>] worker_thread+0x0/0x7d0
> >
> > Code:
> >  00000000
> >  00000000
> >  00063301
> >  000012e2
> >  00001111
> >  0005ffde
> >
> > It seems to cause a hard lock on an SMP system, but not on a system with
> > a single CPU. Similarly, if I boot the SMP system (2 CPUs) with
> > maxcpus=1 the oops doesn't happen. Until I tested on a non-SMP system
> > today I suspected that it was unaffected, but I saw the oops there too.
> > With the revert applied, I don't see a warning or an oops.
> >
> > Any clues how this patch could have triggered the oops?
>
> I cannot begin to imagine. :P Compared to other things I've seen like
> this in the past maybe it's some kind of effect from the code size
> changing the location/alignment or timing of something else?
>
> Various questions ranging in degrees of sanity:
>
> Does alpha use work queues for WARN?

I do not know. I don't see much in a few greps of arch/alpha that
would indicate that it uses work queues.

> Which work queue is getting a NULL function? (And then things like "if
> WARN was much slower or much faster, is there a race to something
> setting itself to NULL?")
>
> Was there a WARN before the above Oops?

No, which I suspect means that your much scarier suggestion that this
is somehow due to code size or alignment is increasingly plausible.

> Does WARN have side-effects on alpha?

alpha just uses the asm-generic implementation of WARN as far as I can
tell, so I think not.

> Does __WARN_printf() do something bad that warn_slowpath_null() doesn't?
>
> Does making incremental changes narrow anything down? (e.g. instead of
> this revert, remove the __warn() call in warn_slowpath_fmt() that was
> added? (I mean, that'll be quite broken for WARN, but will it not oops?)

Commenting out the added __warn does not work around the problem.

Re-adding warn_slowpath_null and the EXPORT_SYMBOL (but not calling it
from WARN) does not work around the problem.

Calling warn_slowpath_fmt() with fmt=" " instead of fmt=NULL does not
work around the problem.

I also tried GCC-10.1 as a stab in the dark, and that doesn't work
around the problem.

So I'm thinking it's something about code size or alignment. I would
be worried it's to do with memory ordering (since this is on Alpha)
but I'm seeing the problem on a single CPU system, so that should be
ruled out, I think?
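
To make the memory-ordering worry concrete, here is a small C11 sketch of the kind of publication pattern where ordering matters. This is an illustration under stated assumptions, not code from the kernel: the demo_* names are hypothetical, and C11 release/acquire is used here as a stand-in for whatever barriers the real code relies on. The writer fills in the function pointer before publishing the item, and the reader only dereferences an item it obtained through the acquire load. If the publish were not ordered after the field writes, a reader on another CPU could see the item with a stale (possibly NULL) func.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical item with a callback, loosely modelled on a work item. */
struct demo_item {
    void (*func)(struct demo_item *item);
    int payload;
};

/* Shared slot through which one item is handed to a consumer. */
static _Atomic(struct demo_item *) demo_slot;

/* Example callback used by the test below. */
static void demo_bump(struct demo_item *item)
{
    item->payload++;
}

/* Initialise the item fully, then publish it with release ordering. */
static void demo_publish(struct demo_item *item,
                         void (*func)(struct demo_item *), int payload)
{
    item->func = func;
    item->payload = payload;
    atomic_store_explicit(&demo_slot, item, memory_order_release);
}

/* Fetch the published item (or NULL) with acquire ordering. */
static struct demo_item *demo_consume(void)
{
    return atomic_load_explicit(&demo_slot, memory_order_acquire);
}
```

On a single CPU the reader always observes the writer's stores in program order, which is why seeing the oops on a uniprocessor machine argues against this class of bug.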

Using CONFIG_CC_OPTIMIZE_FOR_SIZE=y doesn't work around the problem.
So that hurts the theory of code size being the trigger.

Since I noticed earlier that using maxcpus=1 on a 2-CPU system
prevented the system from hanging, I tried disabling CONFIG_SMP on my
1-CPU system as well. In doing so, I discovered that the RCU torture
module (RCU_TORTURE_TEST) triggers some null pointer dereferences on
Alpha when CONFIG_SMP is set, but works successfully when CONFIG_SMP
is unset.

That seems likely to be a symptom of the same underlying problem that
started this thread, don't you think? If so, I'll focus my attention
on that.

> Does alpha have hardware breakpoints? When I had to track down a
> corruption in the io scheduler, I ended up setting breakpoints on the
> thing that went crazy (in this case, I assume the work queue function
> pointer) to figure out what touched it.

As far as I know we don't have anything implemented in the kernel, but
they could be implemented by faulting on read/write.

> ... I can't think of anything else.

Thanks for your time and suggestions!

Matt