[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e55875fa-1264-7e08-3bb8-ed984f6ea5b3@intel.com>
Date: Mon, 25 Oct 2021 07:20:27 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Nadav Amit <nadav.amit@...il.com>, linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org, Nadav Amit <namit@...are.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Xu <peterx@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
Nick Piggin <npiggin@...il.com>, x86@...nel.org
Subject: Re: [PATCH v2 3/5] x86/mm: check exec permissions on fault
On 10/21/21 5:21 AM, Nadav Amit wrote:
> access_error() currently does not check for execution permission
> violation.
Ye
> As a result, spurious page-faults due to execution permission
> violation cause SIGSEGV.
While I could totally believe that something is goofy when VMAs are
being changed underneath a page fault, I'm having trouble figuring out
why the "if (error_code & X86_PF_WRITE)" code is being modified.
> It appears not to be an issue so far, but the next patches avoid TLB
> flushes on permission promotion, which can lead to this scenario. nodejs
> for instance crashes when TLB flush is avoided on permission promotion.
Just to be clear, "promotion" is going from something like:
W=0->W=1
or
NX=1->NX=0
right? I tend to call that "relaxing" permissions.
Currently, X86_PF_WRITE faults are considered an access error unless the
VMA to which the write occurred allows writes. Returning "no access
error" permits continuing and handling the copy-on-write.
It sounds like you want to expand that. You want to add a whole class
of new faults that can be ignored: not just that some COW handling might
be necessary, but that the PTE itself might be out of date. Just like
a "COW fault" may just result in setting the PTE.W=1 and moving on with
our day, an instruction fault might now just end up with setting
PTE.NX=0 and also moving on with our day.
I'm really confused why the "error_code & X86_PF_WRITE" case is getting
modified. I would have expected it to be something like just adding:
/* read, instruction fetch */
if (error_code & X86_PF_INSN) {
/* Avoid enforcing access error if spurious: */
if (unlikely(!(vma->vm_flags & VM_EXEC)))
return 1;
return 0;
}
I'm really confused what X86_PF_WRITE and X86_PF_INSN have in common
other than both being able to (now) be generated spuriously.
Powered by blists - more mailing lists