[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b0ee2570-c9ba-2a7c-2a8b-0dfa46685e62@amd.com>
Date: Fri, 26 Apr 2019 15:11:17 +0000
From: Gary R Hook <ghook@....com>
To: Borislav Petkov <bp@...en8.de>
CC: Thomas Gleixner <tglx@...utronix.de>,
"Hook, Gary" <Gary.Hook@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"x86@...nel.org" <x86@...nel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"luto@...nel.org" <luto@...nel.org>,
Alexander Potapenko <glider@...gle.com>
Subject: Re: [PATCH] x86/mm/mem_encrypt: Disable all instrumentation for SME
early boot code
On 4/8/19 2:08 PM, Borislav Petkov wrote:On 5/8/19 2:08 PM, Borislav
Petkov wrote:> On Mon, Apr 08, 2019 at 06:41:30PM +0000, Gary R Hook
wrote:
>> Again, not arguing. I completely understand. However, to be fair,
this
>> isn't about SME having trouble with those facilities, this is about
>> using certain features (e.g. command line option processing) early
in
>> the boot. Any complex feature could have had that requirement, don't
you
>> think?
>
> Sure, but then why do we need that patch at all then? Why do we need to
> disable instrumentation for SME early code and not for other early code?
>
> I mean, if you grep around the tree you can see a bunch of
> KASAN_SANITIZE but in lib/ we only have
>
> lib/Makefile:210:KASAN_SANITIZE_stackdepot.o := n
>
> which is special. But the rest of the generic code in lib/ or
> arch/x86/lib/ isn't.
>
> Now, there's this:
>
> arch/x86/boot/Makefile:12:KASAN_SANITIZE := n
> arch/x86/boot/compressed/Makefile:20:KASAN_SANITIZE
:= n
>
> which disables KASAN for all boot code.
>
> And this is what you mean - all early boot code should not be sanitized.
>
> Which also gives the right solution, IMO:
>
> cmdline.o should not be sanitized only when used in the boot code. But
> that is already the case.
>
> So why do you need to disable KASAN for arch/x86/lib/cmdline.c?
>
> Because for those two:
>
> arch/x86/boot/cmdline.c
> arch/x86/boot/compressed/cmdline.c
>
> that should already be the case due to the Makefile defines above.
Except that we're not talking about that code.
I probably should have defined terms, so please allow me to back up.
When I say "early boot" I meant what happens -after- decompression, when
the kernel proper has been laid out in memory and starts to run. This
is -after- the boot code has been executed, which means that the
cmdline.c to which you refer above is no longer extant in memory.
If my usage of the term "early boot" is a misnomer, I can only
apologize. And ask what term is in common use to describe what is
happening at that point in time.
Since, for this discussion, we're already in start_kernel(), the only
cmdline.c available is the one in arch/x86/lib. That's that one that is
instrumented by KASAN, and the one that is causing problems in this
scenario. The strncmp(), too.
>> Right. My goal was to get a conversation started, because folks are
>> running into this problem when KASAN is enabled.
>
> You say KASAN. Why is there KCOV_INSTRUMENT_cmdline.o too?
I don't care if KCOV_INSTRUMENT is enabled or not, but it's disabled for
arch/x86/mm/mem_encrypt_identity.c, so it seems reasonable that it
should be disable for this file, too, in the context of resolving this
problem.
To be more precise, the change is for "instrumentation", in general.
>> N.B. Here's another facet of this problem: cmdline.c doesn't (today)
>> contain anything that would trigger the stack protector. However, it's
>> possible to enable the stack protector globally when building, right? In
>> which case, a boot would fail, so we have the same issue: early boot
>> code has special requirements / restrictions.
>
> How so?
<sidebar>
Makefile contains
stackp-flags-$(CONFIG_STACKPROTECTOR) := -fstack-protector
stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) :=
-fstack-protector-strong
This means that (as I understand it) stack protection is decided by the
compiler, and is based on certain conditions in the code. This implies
that not every function will necessarily be instrumented.
However, if you decide to force the issue with something like
stackp-flags-$(CONFIG_STACKPROTECTOR) :=
-fstack-protector-full
stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) :=
-fstack-protector-all
Unless otherwise disabled, I believe this causes everything to be
instrumented. Which results in a boot failure. (Actually, in my tests
the system restarts after the decompression.) Note that intrumentation
such as KASAN isn't involved here. And I figure that doing this is
unsupported. It was just an interesting discovery.
</sidebar>
However, not relevant to the KASAN instrumentation problem.
Recap:
mem_encrypt_identity.c uses two common functions. The code in
mem_encrypt_identity.c runs soon after start_kernel() is invoked. The
SME feature command line parameter is searched for, and uses those two
common functions.
If instrumentation is enabled, it is applied to those to common
functions (but not mem_encrypt_identity.c). But if support
infrastructure for instrumentation is not initialized before the code in
mem_encrypt_identity is invoked, the kernel fails to boot.
After discussion over several weeks, we see the following options for a
solution:
1) Create a local static copy of strncmp in mem_encrypt_identity.c
2) Turn off instrumentation for lib/cmdline.c. The risk is that any
changes to its code would not enjoy the benefits of KASAN/etc testing
(if enabled).
3) Make a local static copy of cmdline_find_option() inside of
mem_encrypt_identity.c.
4) Use #defines and #include to have cmdline.c included within
mem_encrypt_identity.c. This maintains a single source file and
continues to allow the function to be instrumented for use everywhere
elsewhere in the kernel.
We believe (1) is a simple and effective choice, similar to inlining.
If a single source file (for cmdline.c) is preferred, option (4)
maintains that paradigm but gets the job done fairly cleanly. I
understand if there is reticience about include source files, but it's
not IMO an abhorrent practice. This also points to a single file as the
origin of the required function, so no confusion ensues.
I believe TGLX raised the issue of "what happens if we find a bug in
cmdline_find_option()?" Despite the fact that the file has been changed
only once in 4 years (by Thomas) it's an important consideration.
Another reason I lean towards (4), above.
I'm -hoping- this is more clear and succinct. How shall we proceed?
Powered by blists - more mailing lists