Message-ID: <1536957543.12990.9.camel@intel.com>
Date: Fri, 14 Sep 2018 13:39:03 -0700
From: Yu-cheng Yu <yu-cheng.yu@...el.com>
To: Peter Zijlstra <peterz@...radead.org>,
Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Jann Horn <jannh@...gle.com>,
the arch/x86 maintainers <x86@...nel.org>,
"H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
kernel list <linux-kernel@...r.kernel.org>,
linux-doc@...r.kernel.org, Linux-MM <linux-mm@...ck.org>,
linux-arch <linux-arch@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...capital.net>,
Balbir Singh <bsingharora@...il.com>,
Cyrill Gorcunov <gorcunov@...il.com>,
Florian Weimer <fweimer@...hat.com>, hjl.tools@...il.com,
Jonathan Corbet <corbet@....net>, keescook@...omium.org,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
ravi.v.shankar@...el.com, vedvyas.shanbhogue@...el.com
Subject: Re: [RFC PATCH v3 12/24] x86/mm: Modify ptep_set_wrprotect and
pmdp_set_wrprotect for _PAGE_DIRTY_SW
On Fri, 2018-08-31 at 18:29 +0200, Peter Zijlstra wrote:
> On Fri, Aug 31, 2018 at 08:58:39AM -0700, Dave Hansen wrote:
> >
> > On 08/31/2018 08:48 AM, Yu-cheng Yu wrote:
> > >
> > > To trigger a race in ptep_set_wrprotect(), we need to fork from one of
> > > three pthread siblings.
> > >
> > > Or do we measure only how much this affects fork?
> > > If there is no racing, the effect should be minimal.
> > We don't need a race.
> >
> > I think the cmpxchg will be slower, even without a race, than the code
> > that was there before. The cmpxchg is a simple, straightforward
> > solution, but we're putting it in place of a plain memory write, which
> > is suboptimal.
> Not quite, the clear_bit() is LOCK prefixed.
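As a rough userspace analogue (GCC atomic builtins standing in for the
kernel's clear_bit() and try_cmpxchg(); the bit layout is simplified),
the two paths look like this; both are LOCK-prefixed RMW operations on
x86:

/*
 * Sketch only: __atomic_fetch_and() plays the role of the
 * LOCK-prefixed clear_bit(), and the compare-exchange loop
 * plays the role of the try_cmpxchg() path in the patch below.
 */
#include <stdint.h>

#define PTE_RW	(1UL << 1)	/* _PAGE_BIT_RW is bit 1 on x86 */

/* old path: one atomic and-not */
static void wrprotect_clear_bit(uint64_t *pte)
{
	__atomic_fetch_and(pte, ~PTE_RW, __ATOMIC_RELAXED);
}

/* new path: read, modify, compare-and-swap until it sticks */
static void wrprotect_cmpxchg(uint64_t *pte)
{
	uint64_t old = __atomic_load_n(pte, __ATOMIC_RELAXED);
	uint64_t new;

	do {
		new = old & ~PTE_RW;
		/* the real patch also moves the Dirty bit here */
	} while (!__atomic_compare_exchange_n(pte, &old, new, 0,
					      __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));
}
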
With the updated ptep_set_wrprotect() below, I did MADV_WILLNEED on an
8 MB shadow stack, then 10,000 fork()'s, but could not show either
version to be measurably faster or slower than the other. So can we say
this is probably fine in terms of efficiency?
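Roughly, the test was equivalent to this sketch (plain anonymous memory
stands in for the shadow stack mapping, which the kernel creates on its
own; the 8 MB size and fork count are the ones above, and the per-page
writes are there so that fork() actually walks present, dirty PTEs):

/*
 * Minimal sketch of the measurement: populate an 8 MB range,
 * then time 10,000 fork()/exit pairs.
 */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define SS_SIZE		(8UL << 20)	/* 8 MB, as above */
#define NR_FORKS	10000

int main(void)
{
	char *buf = mmap(NULL, SS_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	struct timespec t0, t1;
	unsigned long i;

	if (buf == MAP_FAILED)
		return 1;
	madvise(buf, SS_SIZE, MADV_WILLNEED);
	for (i = 0; i < SS_SIZE; i += 4096)
		buf[i] = 1;			/* dirty every page */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < NR_FORKS; i++) {
		pid_t pid = fork();		/* wrprotects the PTEs */

		if (pid == 0)
			_exit(0);
		waitpid(pid, NULL, 0);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.3f s for %d forks\n",
	       (t1.tv_sec - t0.tv_sec) +
	       (t1.tv_nsec - t0.tv_nsec) / 1e9, NR_FORKS);
	return 0;
}
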
Yu-cheng
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1203,7 +1203,36 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
 {
+#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
+	pte_t new_pte, pte = READ_ONCE(*ptep);
+
+	/*
+	 * Some processors can start a write, but end up
+	 * seeing a read-only PTE by the time they get
+	 * to the Dirty bit. In this case, they will
+	 * set the Dirty bit, leaving a read-only, Dirty
+	 * PTE which looks like a Shadow Stack PTE.
+	 *
+	 * However, this behavior has been improved and
+	 * will not occur on processors supporting
+	 * Shadow Stacks. Without this guarantee, a
+	 * transition to a non-present PTE and a TLB
+	 * flush would be needed.
+	 *
+	 * When changing a writable PTE to read-only and
+	 * if the PTE has _PAGE_DIRTY_HW set, we move
+	 * that bit to _PAGE_DIRTY_SW so that the PTE is
+	 * not a valid Shadow Stack PTE.
+	 */
+	do {
+		new_pte = pte_wrprotect(pte);
+		new_pte.pte |= (new_pte.pte & _PAGE_DIRTY_HW) >>
+			       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
+		new_pte.pte &= ~_PAGE_DIRTY_HW;
+	} while (!try_cmpxchg(ptep, &pte, new_pte));
+#else
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
+#endif
 }
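For illustration only, the Dirty-bit move in the patch is equivalent to
this standalone snippet (bit 6 is the architectural Dirty bit; the
software-dirty position here is a placeholder, since the real
_PAGE_BIT_DIRTY_SW value is defined earlier in the series):

/* Standalone illustration of the Dirty-bit move; not kernel code. */
#include <stdio.h>
#include <stdint.h>

#define _PAGE_BIT_RW		1
#define _PAGE_BIT_DIRTY_HW	6	/* architectural Dirty bit */
#define _PAGE_BIT_DIRTY_SW	58	/* placeholder position */
#define _PAGE_RW		(1ULL << _PAGE_BIT_RW)
#define _PAGE_DIRTY_HW		(1ULL << _PAGE_BIT_DIRTY_HW)

int main(void)
{
	uint64_t pte = _PAGE_RW | _PAGE_DIRTY_HW;	/* writable, dirty */

	pte &= ~_PAGE_RW;				/* wrprotect */
	/* move Dirty from the HW position to the SW position */
	pte |= (pte & _PAGE_DIRTY_HW) >>
	       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
	pte &= ~_PAGE_DIRTY_HW;

	/* read-only and not HW-dirty: no longer looks like a
	 * shadow stack PTE */
	printf("pte = %#llx\n", (unsigned long long)pte);
	return 0;
}
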