lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 16 Feb 2023 08:34:00 -0800
From:   Yury Norov <yury.norov@...il.com>
To:     Kees Cook <kees@...nel.org>
Cc:     Bruno Goncalves <bgoncalv@...hat.com>,
        Networking <netdev@...r.kernel.org>, alan.maguire@...cle.com,
        Jakub Kicinski <kuba@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        CKI Project <cki-project@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Miguel Ojeda <ojeda@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [6.2.0-rc7] BUG: KASAN: slab-out-of-bounds in hop_cmp+0x26/0x110

On Wed, Feb 15, 2023 at 05:51:39PM -0800, Kees Cook wrote:
> On February 15, 2023 4:31:32 PM PST, Yury Norov <yury.norov@...il.com> wrote:
> >+ Kees Cook <keescook@...omium.org>
> >+ Miguel Ojeda <ojeda@...nel.org>
> >+ Nick Desaulniers <ndesaulniers@...gle.com>
> >
> >On Wed, Feb 15, 2023 at 09:24:52AM +0100, Bruno Goncalves wrote:
> >> On Tue, 14 Feb 2023 at 15:32, Yury Norov <yury.norov@...il.com> wrote:
> >> >
> >> > On Tue, Feb 14, 2023 at 02:23:06PM +0100, Bruno Goncalves wrote:
> >> > > Hello,
> >> > >
> >> > > recently when testing kernel with debug options set from net-next [1]
> >> > > and bpf-next [2] the following call trace happens:
> >> > >
> >> > Hi Bruno,
> >> >
> >> > Thanks for report.
> >> >
> >> > This looks weird, because the hop_cmp() spent for 3 month in -next till
> >> > now. Anyways, can you please share your NUMA configuration so I'll try
> >> > to reproduce the bug locally? What 'numactl -H' outputs?
> >> >
> >> 
> >> Here is the output:
> >> 
> >> numactl -H
> >> available: 4 nodes (0-3)
> >> node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
> >> node 0 size: 32063 MB
> >> node 0 free: 31610 MB
> >> node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
> >> node 1 size: 32248 MB
> >> node 1 free: 31909 MB
> >> node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
> >> node 2 size: 32248 MB
> >> node 2 free: 31551 MB
> >> node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
> >> node 3 size: 32239 MB
> >> node 3 free: 31468 MB
> >> node distances:
> >> node   0   1   2   3
> >>   0:  10  21  31  21
> >>   1:  21  10  21  31
> >>   2:  31  21  10  21
> >>   3:  21  31  21  10
> >> 
> >> Bruno
> >
> >So, I was able to reproduce it, and it seems like a compiler issue.
> >
> >The problem is that hop_cmp() calculates pointer to a previous hop
> >object unconditionally at the beginning of the function:
> >
> >       struct cpumask **prev_hop = *((struct cpumask ***)b - 1);
> > 
> >Obviously, for the first hop, there's no such thing like a previous
> >one, and later in the code 'prev_hop' is used conditionally on that:
> >
> >       k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
> >
> >To me the code above looks like it instructs the compiler to dereference
> >'b - 1' only if b != k->masks, i.e. when b is not the first hop. But GCC
> >does that unconditionally, which looks wrong.
> >
> >If I defer dereferencing manually like in the snippet below, the kasan
> >warning goes away.
> >
> >diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> >index 48838a05c008..5f297f81c574 100644
> >--- a/kernel/sched/topology.c
> >+++ b/kernel/sched/topology.c
> >@@ -2081,14 +2081,14 @@ struct __cmp_key {
> >
> > static int hop_cmp(const void *a, const void *b)
> > {
> >-       struct cpumask **prev_hop = *((struct cpumask ***)b - 1);
> >        struct cpumask **cur_hop = *(struct cpumask ***)b;
> >        struct __cmp_key *k = (struct __cmp_key *)a;
> >
> >        if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
> >                return 1;
> >
> >-       k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
> >+       k->w = (b == k->masks) ? 0 :
> >+               cpumask_weight_and(k->cpus, (*((struct cpumask ***)b - 1))[k->node]);
> >        if (k->w <= k->cpu)
> >                return 0;
> >
> >I don't understand why GCC doesn't optimize out unneeded dereferencing.
> >It does that even if I replace ternary operator with if-else construction.
> >To me it looks like a compiler bug.
> >
> >However, I acknowledge that I'm not a great expert in C standard, so
> >it's quite possible that there may be some rule that prevents from
> >doing such optimizations, even for non-volatile variables.
> >
> >Adding compiler people. Guys, could you please clarify on that?
> >If it's my fault, I'll submit fix shortly.
> 
> My understanding is that without getting inlined, the compiler cannot evaluate "b == k->masks" at compile time (if it can at all). So since the dereference is part of variable initialization, it's not executed later: it's executed at function entry.
> 
> Regardless, this whole function looks kind of hard to read. Why not fully expand it with the if/else logic and put any needed variables into the respective clauses? Then humans can read it and the compiler will optimize it down just as efficiently.

Yes, changing it to if-else would be a simplest solution.

Thanks,
Yury

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ