Message-ID: <8065c333-0911-04a2-f91e-7c2e0cc7ec51@intel.com>
Date: Wed, 9 Feb 2022 13:51:42 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...nel.org>,
Balbir Singh <bsingharora@...il.com>,
Borislav Petkov <bp@...en8.de>,
Cyrill Gorcunov <gorcunov@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Eugene Syromiatnikov <esyr@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
"H . J . Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
"Ravi V . Shankar" <ravi.v.shankar@...el.com>,
Dave Martin <Dave.Martin@....com>,
Weijiang Yang <weijiang.yang@...el.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
joao.moreira@...el.com, John Allen <john.allen@....com>,
kcc@...gle.com, eranian@...gle.com
Cc: Yu-cheng Yu <yu-cheng.yu@...el.com>
Subject: Re: [PATCH 17/35] mm: Fixup places that call pte_mkwrite() directly
On 1/30/22 13:18, Rick Edgecombe wrote:
> - do_anonymous_page() and migrate_vma_insert_page() check VM_WRITE directly
> and call pte_mkwrite(), which is the same as maybe_mkwrite(). Change
> them to maybe_mkwrite().
Those look OK.
> - In do_numa_page(), if the numa entry was writable, then pte_mkwrite()
> is called directly. Fix it by doing maybe_mkwrite(). Make the same
> changes to do_huge_pmd_numa_page().
This is another "what", not "why" changelog. This change puzzles me.
*Why* is this needed? It sounds like pte_mkwrite() doesn't work for
shadow stack PTEs. Let's say that explicitly.
I also think this is ab/misuse of maybe_mkwrite().
The shadow stack VMA *REQUIRES* PTEs with Dirty=1. There's no *maybe*
about it. The rest of this is essentially a hack to get
VM_SHADOW_STACK-required bits into the PTE. We have a place where we
store those VMA-required bits: vma->vm_page_prot. Look at how we store
the pkey bits in there for instance.
Let's say we set _PAGE_DIRTY in vma->vm_page_prot. We'd come into
do_anonymous_page() for instance and do this:
> entry = mk_pte(page, vma->vm_page_prot);   <--- PTE is Write=0,Dirty=1 Yay!
> entry = pte_sw_mkyoung(entry);
> if (vma->vm_flags & VM_WRITE)              <--- False, skip the pte_mkwrite()
>         entry = pte_mkwrite(pte_mkdirty(entry));
In other words, it "just works" because shadow stack VMAs don't have
VM_WRITE set.
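To make that concrete, here is a rough userspace sketch of the flow, not kernel code. Everything prefixed toy_/TOY_ is made up for illustration: the bit values are not the real x86 encodings, struct toy_vma stands in for struct vm_area_struct, and a plain integer stands in for pte_t. It assumes the shadow stack VMA has _PAGE_DIRTY preloaded in its page protection and has VM_WRITE clear.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit values for illustration only, NOT the real x86 PTE/VMA encodings. */
#define TOY_PAGE_RW          (1u << 1)
#define TOY_PAGE_DIRTY       (1u << 6)
#define TOY_VM_WRITE         (1u << 0)
#define TOY_VM_SHADOW_STACK  (1u << 1)

/* Hypothetical stand-in for struct vm_area_struct. */
struct toy_vma {
	uint32_t vm_flags;
	uint32_t vm_page_prot;	/* VMA-required PTE bits live here */
};

static uint32_t toy_pte_mkwrite(uint32_t pte) { return pte | TOY_PAGE_RW; }
static uint32_t toy_pte_mkdirty(uint32_t pte) { return pte | TOY_PAGE_DIRTY; }

/* Mirrors the shape of the do_anonymous_page() excerpt quoted above. */
static uint32_t toy_anon_fault_pte(const struct toy_vma *vma)
{
	/* Stands in for mk_pte(page, vma->vm_page_prot). */
	uint32_t entry = vma->vm_page_prot;

	/* Unmodified VM_WRITE check: skipped for a shadow stack VMA. */
	if (vma->vm_flags & TOY_VM_WRITE)
		entry = toy_pte_mkwrite(toy_pte_mkdirty(entry));

	return entry;
}

/* A shadow stack VMA ends up with Write=0,Dirty=1 without any special casing. */
void toy_check_shadow_stack_fault(void)
{
	struct toy_vma shstk = {
		.vm_flags = TOY_VM_SHADOW_STACK,	/* no TOY_VM_WRITE */
		.vm_page_prot = TOY_PAGE_DIRTY,		/* Dirty=1 preloaded */
	};
	uint32_t pte = toy_anon_fault_pte(&shstk);

	assert(pte & TOY_PAGE_DIRTY);
	assert(!(pte & TOY_PAGE_RW));
}
```

An ordinary writable VMA (TOY_VM_WRITE set, nothing preloaded in vm_page_prot) still goes through the existing branch and comes out Write=1,Dirty=1, so the fault path itself needs no changes in this model.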
I think the other VM_WRITE checks would be fine too, although I'm unsure
about the change_page_attr() one.