[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202210031006.02C79ED58@keescook>
Date: Mon, 3 Oct 2022 10:18:09 -0700
From: Kees Cook <keescook@...omium.org>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>
Cc: x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...nel.org>,
Balbir Singh <bsingharora@...il.com>,
Borislav Petkov <bp@...en8.de>,
Cyrill Gorcunov <gorcunov@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Eugene Syromiatnikov <esyr@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
"H . J . Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
Jonathan Corbet <corbet@....net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
"Ravi V . Shankar" <ravi.v.shankar@...el.com>,
Weijiang Yang <weijiang.yang@...el.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
joao.moreira@...el.com, John Allen <john.allen@....com>,
kcc@...gle.com, eranian@...gle.com, rppt@...nel.org,
jamorris@...ux.microsoft.com, dethoma@...rosoft.com,
Yu-cheng Yu <yu-cheng.yu@...el.com>
Subject: Re: [PATCH v2 01/39] Documentation/x86: Add CET description
On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> [...]
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.
This likely needs rewording, since it's not strictly true any more:
IBT is supported in kernel-mode now (CONFIG_X86_IBT).
> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.
> +
> +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.
Maybe call them out by name: shstk ibt
> +CET arch_prctl()'s
> +==================
> +
> +Elf features should be enabled by the loader using the below arch_prctl's.
> +
> +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> + Enable a single feature specified in 'feature'. Can only operate on
> + one feature at a time.
Does this mean only 1 bit out of the 32 may be specified?
> +
> +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> + Disable features specified in 'feature'. Can only operate on
> + one feature at a time.
> +
> +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> + Lock in features at their current enabled or disabled status.
How is the "features" argument processed here?
> [...]
> +Proc status
> +===========
> +To check if an application is actually running with shadow stack, the
> +user can read the /proc/$PID/arch_status. It will report "wrss" or
> +"shstk" depending on what is enabled.
TIL about "arch_status". :) Why is this a separate file? "status" is
already has unique field names.
> +Fork
> +----
> +
> +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
> +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
> +shadow access triggers a page fault with the shadow stack access bit set
> +in the page fault error code.
> +
> +When a task forks a child, its shadow stack PTEs are copied and both the
> +parent's and the child's shadow stack PTEs are cleared of the dirty bit.
> +Upon the next shadow stack access, the resulting shadow stack page fault
> +is handled by page copy/re-use.
> +
> +When a pthread child is created, the kernel allocates a new shadow stack
> +for the new thread.
Perhaps speak to the ASLR characteristics of the shstk here?
Also, it seems if there is a "Fork" section, there should be an "Exec"
section? I suspect it would be short: shstk is disabled when execve() is
called and must be re-enabled from userspace, yes?
-Kees
--
Kees Cook
Powered by blists - more mailing lists