[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez3OKj5Y8BURmqU9BAYWFJH8E8B5Dj9c0=UHutqf7r3hhg@mail.gmail.com>
Date: Fri, 2 Oct 2020 07:45:20 +0200
From: Jann Horn <jannh@...gle.com>
To: Marco Elver <elver@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Alexander Potapenko <glider@...gle.com>,
"H . Peter Anvin" <hpa@...or.com>,
"Paul E . McKenney" <paulmck@...nel.org>,
Andrey Konovalov <andreyknvl@...gle.com>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
Andy Lutomirski <luto@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Catalin Marinas <catalin.marinas@....com>,
Christoph Lameter <cl@...ux.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Hillf Danton <hdanton@...a.com>,
Ingo Molnar <mingo@...hat.com>, Jonathan.Cameron@...wei.com,
Jonathan Corbet <corbet@....net>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Kees Cook <keescook@...omium.org>,
Mark Rutland <mark.rutland@....com>,
Pekka Enberg <penberg@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, sjpark@...zon.com,
Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>,
Will Deacon <will@...nel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
linux-doc@...r.kernel.org,
kernel list <linux-kernel@...r.kernel.org>,
kasan-dev <kasan-dev@...glegroups.com>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH v4 02/11] x86, kfence: enable KFENCE for x86
On Tue, Sep 29, 2020 at 3:38 PM Marco Elver <elver@...gle.com> wrote:
> Add architecture specific implementation details for KFENCE and enable
> KFENCE for the x86 architecture. In particular, this implements the
> required interface in <asm/kfence.h> for setting up the pool and
> providing helper functions for protecting and unprotecting pages.
>
> For x86, we need to ensure that the pool uses 4K pages, which is done
> using the set_memory_4k() helper function.
[...]
> diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h
[...]
> +/* Protect the given page and flush TLBs. */
> +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> +{
> + unsigned int level;
> + pte_t *pte = lookup_address(addr, &level);
> +
> + if (!pte || level != PG_LEVEL_4K)
Do we actually expect this to happen, or is this just a "robustness"
check? If we don't expect this to happen, there should be a WARN_ON()
around the condition.
> + return false;
> +
> + if (protect)
> + set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> + else
> + set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
Hmm... do we have this helper (instead of using the existing helpers
for modifying memory permissions) to work around the allocation out of
the data section?
> + flush_tlb_one_kernel(addr);
> + return true;
> +}
> +
> +#endif /* _ASM_X86_KFENCE_H */
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
[...]
> @@ -701,6 +702,9 @@ no_context(struct pt_regs *regs, unsigned long error_code,
> }
> #endif
>
> + if (kfence_handle_page_fault(address))
> + return;
> +
> /*
> * 32-bit:
> *
The standard 5 lines of diff context don't really make it obvious
what's going on here. Here's a diff with more context:
/*
* Stack overflow? During boot, we can fault near the initial
* stack in the direct map, but that's not an overflow -- check
* that we're in vmalloc space to avoid this.
*/
if (is_vmalloc_addr((void *)address) &&
(((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) ||
address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) {
unsigned long stack = __this_cpu_ist_top_va(DF) -
sizeof(void *);
/*
* We're likely to be running with very little stack space
* left. It's plausible that we'd hit this condition but
* double-fault even before we get this far, in which case
* we're fine: the double-fault handler will deal with it.
*
* We don't want to make it all the way into the oops code
* and then double-fault, though, because we're likely to
* break the console driver and lose most of the stack dump.
*/
asm volatile ("movq %[stack], %%rsp\n\t"
"call handle_stack_overflow\n\t"
"1: jmp 1b"
: ASM_CALL_CONSTRAINT
: "D" ("kernel stack overflow (page fault)"),
"S" (regs), "d" (address),
[stack] "rm" (stack));
unreachable();
}
#endif
+ if (kfence_handle_page_fault(address))
+ return;
+
/*
* 32-bit:
*
* Valid to do another page fault here, because if this fault
* had been triggered by is_prefetch fixup_exception would have
* handled it.
*
* 64-bit:
*
* Hall of shame of CPU/BIOS bugs.
*/
if (is_prefetch(regs, error_code, address))
return;
if (is_errata93(regs, address))
return;
/*
* Buggy firmware could access regions which might page fault, try to
* recover from such faults.
*/
if (IS_ENABLED(CONFIG_EFI))
efi_recover_from_page_fault(address);
oops:
/*
* Oops. The kernel tried to access some bad page. We'll have to
* terminate things with extreme prejudice:
*/
flags = oops_begin();
Shouldn't kfence_handle_page_fault() happen after prefetch handling,
at least? Maybe directly above the "oops" label?
Powered by blists - more mailing lists