Date:   Fri, 17 Mar 2023 09:58:17 +0800
From:   "Yin, Fengwei" <fengwei.yin@...el.com>
To:     Matthew Wilcox <willy@...radead.org>,
        Ryan Roberts <ryan.roberts@....com>
CC:     <linux-arch@...r.kernel.org>, <will@...nel.org>,
        <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 35/36] mm: Convert do_set_pte() to set_pte_range()



On 3/17/2023 1:52 AM, Matthew Wilcox wrote:
> On Thu, Mar 16, 2023 at 04:38:58PM +0000, Ryan Roberts wrote:
>> On 16/03/2023 16:23, Yin, Fengwei wrote:
>>>> I think you are changing behavior here - is this intentional? Previously this
>>>> would be evaluated per page, now it's evaluated once for the whole range. The
>>>> intention below is that directly faulted pages are mapped young and prefaulted
>>>> pages are mapped old. But now a whole range will be mapped the same.
>>>
>>> Yes. You are right here.
>>>
>>> Looking at prefault and cpu_has_hw_af for ARM64, it looks like we
>>> can avoid handling vmf->address == addr specially. It's OK to
>>> drop prefault and change the logic here a little bit to:
>>>   if (arch_wants_old_prefaulted_pte())
>>>       entry = pte_mkold(entry);
>>>   else
>>>       entry = pte_sw_mkyoung(entry);
>>>
>>> It's not necessary to use pte_sw_mkyoung for vmf->address == addr
>>> because HW will set the ACCESS bit in the page table entry.
>>>
>>> Add Will Deacon in case I missed something here. Thanks.
>>
>> I'll defer to Will's response, but not all arm HW supports HW access flag
>> management. In that case it's done by SW, so I would imagine that by setting
>> this to old initially, we will get a second fault to set the access bit, which
>> will slow things down. I wonder if you will need to split this into (up to) 3
>> calls to set_ptes()?
> 
> I don't think we should do that.  The limited information I have from
> various microarchitectures is that the PTEs must differ only in their
> PFN bits in order to use larger TLB entries.  That includes the Accessed
> bit (or equivalent).  So we should mkyoung all the PTEs in the same
> folio, at least initially.
> 
> That said, we should still do this conditionally.  We'll prefault some
> other folios too.  So I think this should be:
> 
>         bool prefault = (addr > vmf->address) || ((addr + nr) < vmf->address);
> 
According to commit 46bdb4277f98e70d0c91f4289897ade533fe9e80, if the hardware
access flag is supported on ARM64, there is a benefit to setting prefaulted PTEs
as "old". If we change prefault as above, the PTEs are set as "young", which
loses that benefit on ARM64 with the hardware access flag.

OTOH, if going from "old" to "young" is cheap, why not leave all PTEs of the
folio as "old" and let the hardware update them to "young"?
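
To make the two options concrete, here is a minimal user-space sketch, not
kernel code. Everything in it is hypothetical and named only for illustration:
struct model_pte models just the accessed bit, wants_old stands in for
arch_wants_old_prefaulted_pte(), and the two policy names are invented. It
assumes that when the arch does not want old prefaulted PTEs, everything is
mapped young.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PTE; only the accessed ("young") bit is modelled. */
struct model_pte {
	bool young;
};

/* Policy names are invented for this sketch. */
enum young_policy {
	YOUNG_IF_RANGE_FAULTED,		/* young only when the range holds the faulting page */
	OLD_WHEN_ARCH_WANTS_OLD,	/* always old when the arch prefers old prefaulted PTEs */
};

/* True when page index fault_idx lies in [start, start + nr). */
static bool range_contains_fault(size_t start, size_t nr, size_t fault_idx)
{
	return fault_idx >= start && fault_idx - start < nr;
}

/*
 * Give every PTE in [start, start + nr) the same accessed bit, so entries in
 * one range differ only in their PFN. wants_old stands in for
 * arch_wants_old_prefaulted_pte().
 */
static void model_set_pte_range(struct model_pte *ptes, size_t start, size_t nr,
				size_t fault_idx, bool wants_old,
				enum young_policy policy)
{
	bool young = true;	/* assumed: arch does not want old prefaulted PTEs */
	size_t i;

	if (wants_old) {
		if (policy == OLD_WHEN_ARCH_WANTS_OLD)
			young = false;	/* rely on hardware to set the bit later */
		else
			young = range_contains_fault(start, nr, fault_idx);
	}

	for (i = 0; i < nr; i++)
		ptes[start + i].young = young;
}
```

With wants_old true and a fault on page 2, YOUNG_IF_RANGE_FAULTED maps pages
0-3 young and a prefaulted range 4-7 old, while OLD_WHEN_ARCH_WANTS_OLD maps
both ranges old. In both cases the accessed bit is uniform within a range.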

Regards
Yin, Fengwei
