Message-ID: <fc2274d4-4f1d-d86b-38ad-d80141c3115c@intel.com>
Date: Thu, 10 Feb 2022 11:27:34 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...nel.org>,
Balbir Singh <bsingharora@...il.com>,
Borislav Petkov <bp@...en8.de>,
Cyrill Gorcunov <gorcunov@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Eugene Syromiatnikov <esyr@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
"H . J . Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
"Ravi V . Shankar" <ravi.v.shankar@...el.com>,
Dave Martin <Dave.Martin@....com>,
Weijiang Yang <weijiang.yang@...el.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
joao.moreira@...el.com, John Allen <john.allen@....com>,
kcc@...gle.com, eranian@...gle.com
Cc: Yu-cheng Yu <yu-cheng.yu@...el.com>
Subject: Re: [PATCH 21/35] mm/mprotect: Exclude shadow stack from
preserve_write
On 1/30/22 13:18, Rick Edgecombe wrote:
> In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is
> preserved to avoid the additional write fault after the NUMA hinting fault.
> However, pte_write() now includes both normal writable and shadow stack
> (RW=0, Dirty=1) PTEs, but the latter does not have _PAGE_RW and has no need
> to preserve it.
This series creates an interesting situation: it causes a logical
disconnection between things that were tightly coupled before. For
instance, before this series, _PAGE_RW=1 and "writable" really were
synonyms. They meant the same thing.
One of the complexities in this series is differentiating the two. For
instance, a shadow stack page can be written to, even though it has
_PAGE_RW=0.
This particular patch seems to be hacking around the problem that
p*_mkwrite() doesn't work on shadow stack PTEs/PMDs. First, that makes
me wonder what *actually* happens if we do a plain pte_mkwrite() on a
shadow stack PTE. I *think* it will take the [Write=0,Dirty=1] PTE and do:
	pte = pte_set_flags(pte, _PAGE_RW);
so we'll end up with [Write=1,Dirty=1], which is bad: that is an ordinary
writable PTE, not a shadow stack PTE any more.
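
To make the bit transition concrete, here is a minimal userspace sketch. It
is a simplified model, not kernel code: every model_* name is hypothetical,
the model tracks only the RW and Dirty bits (x86 bit positions 1 and 6), and
it assumes pte_write() behaves as the patch description says, returning true
for both normal writable and shadow stack PTEs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: only the two bits discussed in this thread. */
#define MODEL_PAGE_RW    (1ULL << 1)	/* x86 R/W bit */
#define MODEL_PAGE_DIRTY (1ULL << 6)	/* x86 Dirty bit */

typedef struct { uint64_t flags; } model_pte_t;

/* Shadow stack encoding per the patch description: RW=0, Dirty=1. */
static bool model_pte_shstk(model_pte_t pte)
{
	return (pte.flags & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) ==
	       MODEL_PAGE_DIRTY;
}

/* "Writable" now covers both RW=1 and shadow stack PTEs. */
static bool model_pte_write(model_pte_t pte)
{
	return (pte.flags & MODEL_PAGE_RW) || model_pte_shstk(pte);
}

/* Plain mkwrite: unconditionally sets RW, as pte_set_flags() would. */
static model_pte_t model_pte_mkwrite(model_pte_t pte)
{
	pte.flags |= MODEL_PAGE_RW;
	return pte;
}
```

Starting from a PTE with only MODEL_PAGE_DIRTY set, model_pte_write() is
already true, and model_pte_mkwrite() yields RW=1,Dirty=1, for which
model_pte_shstk() is false.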
Let's say pte_mkwrite() can't be fixed. We should probably make it
VM_BUG_ON() if it's ever asked to muck with a shadow stack PTE.
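
A rough userspace sketch of that guard, under the same kind of simplified
model (only the RW and Dirty bits; all model_* names are hypothetical).
Here assert() stands in for VM_BUG_ON(); like VM_BUG_ON() without
CONFIG_DEBUG_VM, it compiles away when NDEBUG is defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_RW    (1ULL << 1)	/* x86 R/W bit */
#define MODEL_PAGE_DIRTY (1ULL << 6)	/* x86 Dirty bit */

typedef struct { uint64_t flags; } model_pte_t;

/* Shadow stack encoding per the patch description: RW=0, Dirty=1. */
static bool model_pte_shstk(model_pte_t pte)
{
	return (pte.flags & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) ==
	       MODEL_PAGE_DIRTY;
}

/* mkwrite that refuses to touch a shadow stack PTE. */
static model_pte_t model_pte_mkwrite_checked(model_pte_t pte)
{
	/* Stand-in for VM_BUG_ON(pte_shstk(pte)). */
	assert(!model_pte_shstk(pte));
	pte.flags |= MODEL_PAGE_RW;
	return pte;
}
```

A read-only, non-dirty PTE passes through and gains RW; a RW=0,Dirty=1 PTE
trips the assertion instead of being silently turned into a normal writable
PTE.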
It's also weird because we have this pte_write()==1 PTE in a !VM_WRITE
VMA. Then, we're trying to pte_mkwrite() under this !VM_WRITE VMA.
	pte_write()	<-- returns true for a shadow stack PTE!
	pte_mkwrite()	<-- illegal on a shadow stack PTE
I need to think about this a little more. I don't have a solution.
But, as-is, it seems untenable. The rules are just too
counterintuitive to live.