linux-kernel - Re: objtool clac/stac handling change..

Message-ID: <87eept6yfq.fsf@mpe.ellerman.id.au>
Date:   Fri, 03 Jul 2020 13:59:37 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Christophe Leroy <christophe.leroy@...roup.eu>
Cc:     Al Viro <viro@...iv.linux.org.uk>,
        Christophe Leroy <christophe.leroy@....fr>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: objtool clac/stac handling change..

Linus Torvalds <torvalds@...ux-foundation.org> writes:
> On Thu, Jul 2, 2020 at 8:13 AM Christophe Leroy
> <christophe.leroy@...roup.eu> wrote:
>>
>> Isn't it something easy to do in bad_page_fault() ?
>
> Can't the user access functions take any other faults on ppc?

Yes they definitely can.

I think I can enumerate all the possibilities on 64-bit, but I don't
know all the possibilities on all the 32-bit variants.

> On x86-64, we have the "address is non-canonical" case which doesn't
> take a page fault at all, but takes a general protection fault
> instead.

Right. On P9 radix we have an address-out-of-page-table-range exception
which I guess is similar, though that does end up at bad_page_fault() in
our case.

> But note that depending on how you nest and save/restore the state,
> things can be very subtle.
>
> For example, what can happen is:
>
>  (a) user_access_begin()..
>
>  (b) we take a normal interrupt
>
>  (c) the interrupt code does something that has an exception handling
> case entirely unrelated to the user access (on x86, it might be the
> "unsafe_msr" logic, for example).
>
>  (d) we take that exception, do "fixup_exception()" for whatever that
> interrupt did.
>
>  (e) we return from that exception to the fixed up state
>
>  (d) we return from the interrupt
>
>  (e) we should still have user accesses enabled.

Yes.

We broke that a few times when developing the KUAP support, which is why
I added bad_kuap_fault() to report the case where we are in a uaccess
region but are being blocked unexpectedly by KUAP.
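
As a rough illustration of that check, here is a user-space model in C. It
is not the kernel code: the struct, the bit values and the function names
are all hypothetical, and it assumes the caller has already established
that the fault happened inside a uaccess region. It only shows the idea of
comparing the faulting access type against the KUAP state saved at
exception entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model; not the kernel's AMR encoding. */
#define MODEL_BLOCK_READ  0x1u
#define MODEL_BLOCK_WRITE 0x2u

/* Stand-in for the KUAP state saved in pt_regs at exception entry. */
struct model_regs {
	uint32_t kuap;
};

/*
 * Returns true when the faulting access type was blocked in the saved
 * state, i.e. the fault hit inside a uaccess region that should have
 * been open. The real helper reports this case; the model only returns
 * a flag.
 */
bool model_bad_kuap_fault(const struct model_regs *regs, bool is_write)
{
	uint32_t blocked = is_write ? MODEL_BLOCK_WRITE : MODEL_BLOCK_READ;

	return (regs->kuap & blocked) != 0;
}

/* Small self-check covering the fully open and fully blocked cases. */
void model_bad_kuap_fault_selfcheck(void)
{
	struct model_regs open = { .kuap = 0 };
	struct model_regs locked = { .kuap = MODEL_BLOCK_READ | MODEL_BLOCK_WRITE };

	assert(!model_bad_kuap_fault(&open, false));
	assert(!model_bad_kuap_fault(&open, true));
	assert(model_bad_kuap_fault(&locked, false));
	assert(model_bad_kuap_fault(&locked, true));
}
```

A true return here corresponds to the "blocked unexpectedly" case: the
access should have been permitted, so something lost the user access state
on the way.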

> NOTE! on x86, we can have "all fixup_exceptions() will clear AC in the
> exception pt_regs", because AC is part of rflags which is saved on
> (and cleared for the duration of) all interrupts and exceptions.
>
> So what happens is that on x86 all of (b)-(d) will run with AC clear
> and no user accesses allowed, and (e) will have user accesses enabled
> again, because the "fixup_exception()" at (d) only affected the state
> of the interrupt handler (which already had AC clear anyway).
>
> But I don't think exceptions and interrupts save/restore the user
> access state on powerpc, do they?

Not implicitly.

We manually save it into pt_regs on the stack in the exception entry. On
64-bit it's done in kuap_save_amr_and_lock. 32-bit does it in
kuap_save_and_lock.

And then on the return path it's kuap_restore_amr() on 64-bit, and
kuap_restore on 32-bit.

> So on powerpc you do need to be more careful. You would only need to
> disable user access on exceptions that happen _on_ user accesses.
>
> The easiest way to do that is to do what x86 does: different
> exceptions have different handlers. It's not what we did originally,
> but it's been useful.
>
> Hmm.
>
> And again, on x86, this all works fine because of how exceptions
> save/restore the user_access state and it all nests fine. But I'm
> starting to wonder how the nesting works AT ALL for powerpc?
>
> Because that nesting happens even without
>
> IOW, even aside from this whole thing, what happens on PPC, when you have

I'll annotate what happens for the 64-bit case as it's the one I know
best:

>  (a) user_access_begin();
         - mtspr(SPRN_AMR, 0)	// 0 means loads & stores permitted

>      - profile NMI or interrupt happens
         - pt_regs->kuap = mfspr(SPRN_AMR)
         - mtspr(SPRN_AMR, AMR_KUAP_BLOCKED)

>      - it wants to do user stack tracing so does
>                 pagefault_disable();
>        (b)         get_user();
                     mtspr(SPRN_AMR, 0)
                     ld rN, <user pointer>
                     mtspr(SPRN_AMR, AMR_KUAP_BLOCKED)
                     
>                 pagefault_enable();
>    - profile NMI/interrupt returns
       - mtspr(SPRN_AMR, pt_regs->kuap)
       - return from interrupt

>  (c) user access should work here!
>
> even if the "get_user()" in (b) would have done a
> "user_access_begin/end" pair, and regardless of whether (b) might have
> triggered a "fixup_exception()", and whether that fixup_exception()
> then did the user_access_end().
>
> On x86, this is all ok exactly because of how we only have the AC bit,
> and it nests very naturally with any exception handling.
>
> Is the ppc code nesting-safe? Particularly since it has that whole
> range-handling?

Yeah I think it is.
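
The annotated 64-bit trace above can be replayed as a small user-space
model in C. This is only a sketch: a plain variable stands in for the AMR
SPR, MODEL_AMR_BLOCKED is a made-up nonzero value rather than the real
AMR_KUAP_BLOCKED, and every model_* name is hypothetical. It shows why the
nested get_user() does not disturb the outer uaccess region: the
interrupted state lives in the saved regs and is written back on return.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_AMR_OPEN    0u	/* 0 means loads & stores permitted */
#define MODEL_AMR_BLOCKED 3u	/* hypothetical stand-in for AMR_KUAP_BLOCKED */

/* Stand-in for the AMR register; starts out blocked. */
static uint64_t model_amr = MODEL_AMR_BLOCKED;

/* Stand-in for the kuap field saved in pt_regs. */
struct model_pt_regs {
	uint64_t kuap;
};

/* Exception entry: save the interrupted state, then lock. */
void model_interrupt_entry(struct model_pt_regs *regs)
{
	regs->kuap = model_amr;
	model_amr = MODEL_AMR_BLOCKED;
}

/* Exception return: restore whatever the interrupted code had. */
void model_interrupt_exit(const struct model_pt_regs *regs)
{
	model_amr = regs->kuap;
}

/* get_user(): open, do the load, close again. */
void model_get_user(void)
{
	model_amr = MODEL_AMR_OPEN;
	/* the user load would happen here */
	model_amr = MODEL_AMR_BLOCKED;
}

/* Replays steps (a) to (c); returns 1 if user access is open at (c). */
int model_nested_trace(void)
{
	struct model_pt_regs regs;

	model_amr = MODEL_AMR_OPEN;		/* (a) user_access_begin() */

	model_interrupt_entry(&regs);		/* profile NMI or interrupt */
	assert(model_amr == MODEL_AMR_BLOCKED);

	model_get_user();			/* (b) nested get_user() */
	assert(model_amr == MODEL_AMR_BLOCKED);

	model_interrupt_exit(&regs);		/* interrupt returns */

	return model_amr == MODEL_AMR_OPEN;	/* (c) */
}
```

In this model the handler can open and close user access as often as it
likes; the state at (c) depends only on what was saved at entry.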

The range handling on 32-bit Book3S follows the same pattern as above,
except that on exception entry we don't save the content of an SPR to
pt_regs, instead we save current->thread.kuap. (Because there isn't a
single SPR that contains the KUAP state).

cheers