Message-ID: <CAKv+Gu_8ibO4D01DZv6KjL2GnvKuVBVnt=doxkN0w=4utJ7NvQ@mail.gmail.com>
Date: Mon, 17 Jun 2019 13:33:19 +0200
From: Ard Biesheuvel <ard.biesheuvel@...aro.org>
To: Kees Cook <keescook@...omium.org>
Cc: Will Deacon <will.deacon@....com>,
Jayachandran Chandrasekharan Nair <jnair@...vell.com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
Jan Glauber <jglauber@...vell.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC] Disable lockref on arm64
On Sun, 16 Jun 2019 at 23:31, Kees Cook <keescook@...omium.org> wrote:
>
> On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> > Yes, I am using the same saturation point as x86. In this example, I
> > am not entirely sure I understand why it matters, though: the atomics
> > guarantee that the write by CPU2 fails if CPU1 changed the value in
> > the mean time, regardless of which value it wrote.
> >
> > I think the concern is more related to the likelihood of another CPU
> > doing something nasty between the moment that the refcount overflows
> > and the moment that the handler pins it at INT_MIN/2, e.g.,
> >
> > > CPU 1 CPU 2
> > > inc()
> > > load INT_MAX
> > > about to overflow?
> > > yes
> > >
> > > set to 0
> > > <insert exploit here>
> > > set to INT_MIN/2
>
> Ah, gotcha, but the "set to 0" is really "set to INT_MAX+1" (not zero)
> if you're using the same saturation.
>
Of course. So there is no issue here: whatever manipulations are
racing with the overflow handler can never cause the counter to
unsaturate.
And actually, moving the checks before the stores is not as trivial as
I thought. E.g., for the LSE refcount_add case, we have
" ldadd %w[i], w30, %[cval]\n" \
" adds %w[i], %w[i], w30\n" \
REFCOUNT_PRE_CHECK_ ## pre (w30)) \
REFCOUNT_POST_CHECK_ ## post \
and changing this into load/test/store defeats the purpose of using
the LSE atomics in the first place.
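
For comparison, here is a sketch (my own illustration in C11 atomics,
not the kernel source) of what the load/test/store shape looks like:
each attempt is a load, a check, and a compare-and-swap, which on
arm64 without LSE compiles down to an LL/SC retry loop, exactly the
contended-retry behaviour the single ldadd instruction avoids.

```c
#include <stdatomic.h>

/* Load/test/store variant of refcount_add: check the value before
 * committing the store, retrying via CAS on contention. */
static int refcount_add_cas(atomic_int *r, int i)
{
	int old = atomic_load_explicit(r, memory_order_relaxed);

	do {
		if (old < 0)	/* already saturated: leave it alone */
			return old;
	} while (!atomic_compare_exchange_weak_explicit(
			r, &old, old + i,
			memory_order_relaxed, memory_order_relaxed));

	return old + i;
}
```

Under contention every failed CAS forces another round trip through
the loop, which is precisely the cost profile LSE's far atomics were
designed to eliminate.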
On my single core TX2, the comparative performance is as follows:

Baseline: REFCOUNT_TIMING test using REFCOUNT_FULL (LSE cmpxchg)

   191057942484      cycles         # 2.207 GHz
   148447589402      instructions   # 0.78 insn per cycle

   86.568269904 seconds time elapsed

Upper bound: ATOMIC_TIMING

   116252672661      cycles         # 2.207 GHz
    28089216452      instructions   # 0.24 insn per cycle

   52.689793525 seconds time elapsed

REFCOUNT_TIMING test using LSE atomics

   127060259162      cycles         # 2.207 GHz
              0      instructions   # 0.00 insn per cycle

   57.243690077 seconds time elapsed