linux-kernel - Re: [PATCH 1/8] kasan,x86: Frob kasan

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190228174605.GF32494@hirez.programming.kicks-ass.net>
Date:   Thu, 28 Feb 2019 18:46:05 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Julien Thierry <julien.thierry@....com>,
        Will Deacon <will.deacon@....com>,
        Andy Lutomirski <luto@...capital.net>,
        Ingo Molnar <mingo@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        James Morse <james.morse@....com>, valentin.schneider@....com,
        Brian Gerst <brgerst@...il.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Andy Lutomirski <luto@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>
Subject: Re: [PATCH 1/8] kasan,x86: Frob kasan_report() in an exception

On Thu, Feb 28, 2019 at 05:03:09PM +0100, Dmitry Vyukov wrote:

> I am missing some knowledge about SMAP to answer this.
> In short, these tools insert lots of callbacks into runtime for memory
> accesses, function entry/exit, atomicops and some other. These
> callbacks can do things of different complexity.
> Humm... perhaps we could just disable SMAP for KMSAN/KTSAN. It's
> possible, right? If we have it enabled with KASAN, that should be
> enough.

SMAP detects access to _PAGE_USER pages; that is, such access is only
allowed when EFLAGS.AC=1, otherwise they'll fault.

I again don't know enough about KASAN to say if it does that; but I
suspect it only tracks kernel memory state.

> Also, what's the actual problem with KASAN+SMAP? Is it warnings from
> static analysis tool? Or there are also some runtime effects? What
> effects?

Both; so because of the above semantics, things like copy_to_user() will
have to do STAC (set EFLAGS.AC=1), then do the actual copies to the user
addresses, and then CLAC (clear the AC flag again).

The desire is to have AC=1 sections as small as possible, such that as
much code as possible is ran with AC=0 and will trap on unintended
accesses.

Also; the scheduler doesn't (but I have a patch for that, but I'd prefer
to not have to use it) context switch EFLAGS. This means that if we land
in the scheduler while AC=1, the next task will resume with AC=1.

Consequently, if that task returns to userspace before it gets scheduled
again, we'll continue our previous task (that left with AC=1) with AC=0
and it'll then fault where no fault were expected.

Anyway; the objtool annotation basically tracks the EFLAGS.AC state
(through STAC/CLAC instructions -- no PUSHF/POPF) and disallows any
CALL/RET while AC=1.

This is where the __asan_{load,store}*() stuff went *splat*. GCC inserts
those calls in the middle of STAC/CLAC (AC=1) and we then have to mark
the functions as AC-safe. objtool validates those on the same rules, no
further CALLs that are not also safe.

Things like __fentry__ are inherently unsafe because they use
preempt_disable/preempt_enable, where the latter has a CALL
__preempt_schedule (and is thus very unsafe). Similarly with
kasan_report(), it does all sorts of things that are not safe to do.

> Is it possible to disable the SMAP runtime checks once we enter
> kasan_report() past report_enabled() check? We could restrict it to
> "just finish printing this bug report whatever it takes and then
> whatever" if it makes things simpler.
> It would be nice if we could restrict it to something like:
> 
> @@ -291,6 +303,7 @@ void kasan_report(unsigned long addr, size_t size,
>         if (likely(!report_enabled()))
>                 return;
> +       disable_smap();
> 
> And then enforce panic at the end of report if smap is enabled.

That would be a CLAC, and the current rules disallow CLAC for AC-safe
functions.

Furthermore, kasan_report() isn't fatal, right? So it would have to
restore the state on exit. That makes the validation state much more
complicated.

Let me try and frob some of the report_enabled() stuff before the #UD.