Message-Id: <09F31D01-E818-4538-A6E9-3E4779FC4B53@gmail.com>
Date:   Wed, 13 Oct 2021 08:59:04 -0700
From:   Nadav Amit <nadav.amit@...il.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrew Cooper <andrew.cooper3@...rix.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
        Nick Piggin <npiggin@...il.com>, x86@...nel.org
Subject: Re: [PATCH 1/2] mm/mprotect: use mmu_gather



> On Oct 12, 2021, at 4:20 PM, Peter Xu <peterx@...hat.com> wrote:
> 
> On Tue, Oct 12, 2021 at 10:31:45AM -0700, Nadav Amit wrote:
>> 
>> 
>>> On Oct 12, 2021, at 3:16 AM, Peter Xu <peterx@...hat.com> wrote:
>>> 
>>> On Sat, Sep 25, 2021 at 01:54:22PM -0700, Nadav Amit wrote:
>>>> @@ -338,25 +344,25 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
>>>> 	struct mm_struct *mm = vma->vm_mm;
>>>> 	pgd_t *pgd;
>>>> 	unsigned long next;
>>>> -	unsigned long start = addr;
>>>> 	unsigned long pages = 0;
>>>> +	struct mmu_gather tlb;
>>>> 
>>>> 	BUG_ON(addr >= end);
>>>> 	pgd = pgd_offset(mm, addr);
>>>> 	flush_cache_range(vma, addr, end);
>>>> 	inc_tlb_flush_pending(mm);
>>>> +	tlb_gather_mmu(&tlb, mm);
>>>> +	tlb_start_vma(&tlb, vma);
>>> 
>>> Pure question:
>>> 
>>> I actually have no idea why tlb_start_vma() is needed here, as the
>>> protection range can be just a single page, but anyway.. I do see that
>>> tlb_start_vma() contains a whole-vma flush_cache_range() when the arch
>>> needs it. Does that mean that, besides inc_tlb_flush_pending() being
>>> dropped, the other call to flush_cache_range() above should be dropped
>>> as well?
>> 
>> Good point.
>> 
>> tlb_start_vma() and tlb_end_vma() are required since some archs do not
>> batch TLB flushes across VMAs (e.g., ARM).
> 
> Sorry, I didn't follow here - as change_protection() is per-vma anyway, I
> don't see why it needs to consider VMA crossing.
> 
> In any case, it would be great if you could add some explanation to the
> commit message on why we need tlb_{start|end}_vma(), as I think it may not
> be obvious to everyone.

tlb_start_vma() is required when we switch away from flush_tlb_range(),
because certain properties of the VMA (e.g., whether it is executable) are
needed on certain architectures. That is why flush_tlb_range() requires the
VMA being invalidated to be provided.
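
For reference, the resulting flow in change_protection_range() roughly looks
like this (a sketch based on the hunk above; threading the tlb argument
through change_p4d_range() is an assumption about how the rest of this series
is structured):

	struct mmu_gather tlb;

	BUG_ON(addr >= end);
	pgd = pgd_offset(mm, addr);
	tlb_gather_mmu(&tlb, mm);
	tlb_start_vma(&tlb, vma);	/* arch hook: cache flush, VMA properties */
	do {
		next = pgd_addr_end(addr, end);
		if (pgd_none_or_clear_bad(pgd))
			continue;
		pages += change_p4d_range(&tlb, vma, pgd, addr, next,
					  newprot, cp_flags);
	} while (pgd++, addr = next, addr != end);
	tlb_end_vma(&tlb, vma);		/* arch hook: per-VMA TLB flush if needed */
	tlb_finish_mmu(&tlb);		/* flush whatever was batched */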

Regardless, there is an interface and that is the way it is used. I am not
inclined to break it, even if it were possible, for unclear performance
benefits.

As I discussed offline with Andrea and David, switching to the
tlb_gather_mmu() interface has advantages beyond batching and avoiding
unnecessary flushes on PTE permission promotion (the latter done in patch 2).
If a single PTE is updated out of a bigger range, flush_tlb_range() currently
flushes the whole range instead of the single page. In addition, once I fix
this patch-set, updating a THP would (at least on x86) flush a single entry
instead of flushing 512 entries (which would actually be done using a full
TLB flush).
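
To illustrate where the saving comes from: the gather tracks only the range
that was actually modified. A hypothetical sketch of the per-PTE accounting
in change_pte_range() would look like this (the exact call site in the patch
may differ):

	/* after committing the new protection for a single PTE */
	ptent = pte_modify(oldpte, newprot);
	ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
	tlb_flush_pte_range(tlb, addr, PAGE_SIZE);	/* extends tlb->start/end */

tlb_finish_mmu() then flushes only [tlb->start, tlb->end), so a single
updated PTE in a large mprotect() range costs a single-page flush instead of
a flush of the whole range.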

As I mentioned in a different thread, and was not upfront about before, one
of my motivations behind this patch-set is that I need a vectored
UFFDIO_WRITEPROTECTV interface for performance. Nevertheless, I think these
two patches stand on their own and have independent value.
