Message-ID: <CAPcyv4ghxbdWoRF6U=PSLLQaUKGx55MzYSPVrtsBug7ETv5ybg@mail.gmail.com>
Date: Fri, 15 Dec 2017 08:38:02 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Dave Hansen <dave.hansen@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>, X86 ML <x86@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirsky <luto@...nel.org>,
Borislav Petkov <bpetkov@...e.de>,
Greg KH <gregkh@...uxfoundation.org>, keescook@...gle.com,
Hugh Dickins <hughd@...gle.com>,
Brian Gerst <brgerst@...il.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Juergen Gross <jgross@...e.com>,
David Laight <David.Laight@...lab.com>,
Eduardo Valentin <eduval@...zon.com>,
"Liguori, Anthony" <aliguori@...zon.com>,
Will Deacon <will.deacon@....com>,
Linux MM <linux-mm@...ck.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v2 01/17] mm/gup: Fixup p*_access_permitted()

On Fri, Dec 15, 2017 at 3:38 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Fri, Dec 15, 2017 at 11:25:29AM +0100, Peter Zijlstra wrote:
>> The memory one is also clearly wrong, not having access does not a write
>> fault make. If we have pte_write() set we should not do_wp_page() just
>> because we don't have access. This falls under the "doing anything other
>> than hard failure for !access is crazy" header.
>
> So per the very same reasoning I think the below is warranted too; also
> rename that @dirty variable, because its also wrong.
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5eb3d2524bdc..0d43b347eb0a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3987,7 +3987,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> .pgoff = linear_page_index(vma, address),
> .gfp_mask = __get_fault_gfp_mask(vma),
> };
> - unsigned int dirty = flags & FAULT_FLAG_WRITE;
> + unsigned int write = flags & FAULT_FLAG_WRITE;
> struct mm_struct *mm = vma->vm_mm;
> pgd_t *pgd;
> p4d_t *p4d;
> @@ -4013,7 +4013,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>
> /* NUMA case for anonymous PUDs would go here */
>
> - if (dirty && !pud_access_permitted(orig_pud, WRITE)) {
> + if (write && !pud_write(orig_pud)) {
> ret = wp_huge_pud(&vmf, orig_pud);
> if (!(ret & VM_FAULT_FALLBACK))
> return ret;
> @@ -4046,7 +4046,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
> return do_huge_pmd_numa_page(&vmf, orig_pmd);
>
> - if (dirty && !pmd_access_permitted(orig_pmd, WRITE)) {
> + if (write && !pmd_write(orig_pmd)) {
> ret = wp_huge_pmd(&vmf, orig_pmd);
> if (!(ret & VM_FAULT_FALLBACK))
> return ret;
>
>
> I still cannot make sense of what the intention behind these changes
> were, the Changelog that went with them is utter crap, it doesn't
> explain anything.

The motivation was that I noticed get_user_pages_fast() was doing a
full pud_access_permitted() check, while the get_user_pages() slow
path was only doing a pud_write() check. That was inconsistent, so I
set out to resolve it across all the pte types and ended up making a
mess of things. I'm fine if the answer is that we should have gone the
other way and done only write checks. However, when I was
investigating which way to go, what persuaded me to start sprinkling
p??_access_permitted() checks around was that application behavior
differed between mmap access and direct-I/O access to the same buffer.
I assumed that different access behavior between the two would be an
inconsistent surprise to userspace. That said, infinitely looping in
handle_mm_fault() is an even worse surprise; apologies for that.
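
For readers following the thread, the difference between a plain write
check and a full access check can be sketched with a toy model. This is
not kernel code: every toy_* name below is hypothetical, the "rights"
structure only loosely stands in for protection-key state, and the
decision function simply encodes the reasoning quoted above (a missing
write bit is a write-protect fault, denied access is a hard failure).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names are hypothetical, none are kernel APIs. */
struct toy_entry {
	bool present;
	bool writable;   /* the bit a p??_write()-style check looks at */
	unsigned pkey;   /* key attached to the entry, assumed < 32 here */
};

/* Per-key restrictions for the current task (bit n applies to key n). */
struct toy_rights {
	uint32_t access_disabled_mask;
	uint32_t write_disabled_mask;
};

/* Write check: looks only at the entry's own write bit. */
static bool toy_write(const struct toy_entry *e)
{
	return e->writable;
}

/* Access check: write bit plus whatever the key restrictions say. */
static bool toy_access_permitted(const struct toy_entry *e,
				 const struct toy_rights *r, bool write)
{
	uint32_t bit = 1u << e->pkey;

	if (!e->present)
		return false;
	if (write && !e->writable)
		return false;
	if (r->access_disabled_mask & bit)
		return false;
	if (write && (r->write_disabled_mask & bit))
		return false;
	return true;
}

enum toy_action { TOY_OK, TOY_WP_FAULT, TOY_HARD_FAIL };

/*
 * Decision for a write access. Only a missing write bit is treated as
 * a write-protect fault; a key denial is a hard failure, because
 * write-protect handling cannot change the key restrictions and a
 * retry would fault again.
 */
static enum toy_action toy_handle_write(const struct toy_entry *e,
					const struct toy_rights *r)
{
	if (!toy_write(e))
		return TOY_WP_FAULT;
	if (!toy_access_permitted(e, r, true))
		return TOY_HARD_FAIL;
	return TOY_OK;
}
```

In this model an entry that is writable but whose key denies writes
passes toy_write() and fails toy_access_permitted(). Routing that case
into write-protect handling would make no progress, which is the kind
of loop mentioned above.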