Message-ID: <36B9DE22-E3BC-4CB2-8E3F-B21B61434CD3@vmware.com>
Date:   Wed, 21 Sep 2022 07:17:56 +0000
From:   Nadav Amit <namit@...are.com>
To:     Anshuman Khandual <anshuman.khandual@....com>
CC:     Yicong Yang <yangyicong@...wei.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "catalin.marinas@....com" <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "corbet@....net" <corbet@....net>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "arnd@...db.de" <arnd@...db.de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "darren@...amperecomputing.com" <darren@...amperecomputing.com>,
        "yangyicong@...ilicon.com" <yangyicong@...ilicon.com>,
        "huzhanyuan@...o.com" <huzhanyuan@...o.com>,
        "lipeifeng@...o.com" <lipeifeng@...o.com>,
        "zhangshiming@...o.com" <zhangshiming@...o.com>,
        "guojian@...o.com" <guojian@...o.com>,
        "realmz6@...il.com" <realmz6@...il.com>,
        "linux-mips@...r.kernel.org" <linux-mips@...r.kernel.org>,
        "openrisc@...ts.librecores.org" <openrisc@...ts.librecores.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        Barry Song <21cnbao@...il.com>,
        "wangkefeng.wang@...wei.com" <wangkefeng.wang@...wei.com>,
        "xhao@...ux.alibaba.com" <xhao@...ux.alibaba.com>,
        "prime.zeng@...ilicon.com" <prime.zeng@...ilicon.com>,
        Barry Song <v-songbaohua@...o.com>,
        Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH v3 4/4] arm64: support batched/deferred tlb shootdown
 during page reclamation

On Sep 20, 2022, at 11:53 PM, Anshuman Khandual <anshuman.khandual@....com> wrote:

> On 8/22/22 13:51, Yicong Yang wrote:
>> +static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
>> +                                     struct mm_struct *mm,
>> +                                     unsigned long uaddr)
>> +{
>> +     __flush_tlb_page_nosync(mm, uaddr);
>> +}
>> +
>> +static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>> +{
>> +     dsb(ish);
>> +}
> 
> Just wondering if arch_tlbbatch_add_mm() could also detect contiguous mapping
> TLB invalidation requests on a given mm and try to generate a range-based TLB
> invalidation such as flush_tlb_range().
> 
> struct arch_tlbflush_unmap_batch, via task->tlb_ubc.arch, could track contiguous
> ranges while requests are being queued up via arch_tlbbatch_add_mm(); any range
> formed could later be flushed in the subsequent arch_tlbbatch_flush()?
> 
> OR
> 
> It might not be worth the effort and complexity, compared to the performance
> improvement a TLB range flush brings?

So here are my 2 cents, based on my experience with Intel-x86. It is likely
different on arm64, but perhaps it can provide you some insight into what
parameters you should measure and consider.

In general, there is a tradeoff between full TLB flushes and entry-specific
ones. Flushing specific entries takes more time than flushing the entire
TLB, but saves TLB refills.

Dave Hansen made some calculations in the past and came up with 33 as a
magic cutoff number, i.e., if you need to flush more than 33 entries, just
flush the entire TLB. I am not sure that this exact number is very
meaningful, since one might argue that it should’ve taken PTI into account
(which might require twice as many TLB invalidations).
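
For reference, the heuristic looks roughly like this in arch/x86/mm/tlb.c
(a simplified sketch, not the literal kernel code; tlb_single_page_flush_ceiling
is the tunable that defaults to 33):

	/*
	 * Simplified sketch of the x86 local flush heuristic (cf.
	 * flush_tlb_func() in arch/x86/mm/tlb.c); not the literal code.
	 */
	if (info->end == TLB_FLUSH_ALL ||
	    (info->end - info->start) >> PAGE_SHIFT > tlb_single_page_flush_ceiling) {
		flush_tlb_local();		/* cheaper past the cutoff */
	} else {
		unsigned long addr;

		for (addr = info->start; addr < info->end; addr += PAGE_SIZE)
			flush_tlb_one_user(addr);	/* saves refills below it */
	}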

Anyhow, back to arch_tlbbatch_add_mm(). It may be possible to track ranges,
but the question is whether you would actually succeed in forming contiguous
ranges that are eventually (on x86) smaller than the full TLB flush cutoff
(=33). Questionable (perhaps better with MGLRU?).
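
To make the idea concrete, here is a minimal sketch of what such tracking
could look like on arm64. It is entirely hypothetical: the mm/start/end
fields, the cutoff, and the fallback policy are all made up for illustration,
and the payoff would come from replacing the per-page loop with a range-based
TLBI where the hardware supports it:

	#define TLBBATCH_MAX_PAGES	33	/* hypothetical cutoff, cf. x86 */

	struct arch_tlbflush_unmap_batch {
		struct mm_struct *mm;		/* hypothetical: mm of the run */
		unsigned long start, end;	/* hypothetical: contiguous run */
	};

	static inline void tlbbatch_flush_run(struct arch_tlbflush_unmap_batch *batch)
	{
		unsigned long addr;

		/*
		 * Flush the queued run page by page; a real implementation
		 * would issue a single range invalidation here instead
		 * (TLBI range ops, ARMv8.4) when the run is long enough.
		 */
		for (addr = batch->start; addr < batch->end; addr += PAGE_SIZE)
			__flush_tlb_page_nosync(batch->mm, addr);
		batch->mm = NULL;
	}

	static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
						struct mm_struct *mm,
						unsigned long uaddr)
	{
		/* Extend the run while requests stay contiguous and small. */
		if (batch->mm == mm && uaddr == batch->end &&
		    batch->end - batch->start < TLBBATCH_MAX_PAGES * PAGE_SIZE) {
			batch->end += PAGE_SIZE;
			return;
		}
		/* Sparse request or different mm: flush the old run, restart. */
		if (batch->mm)
			tlbbatch_flush_run(batch);
		batch->mm = mm;
		batch->start = uaddr;
		batch->end = uaddr + PAGE_SIZE;
	}

	static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
	{
		if (batch->mm)
			tlbbatch_flush_run(batch);
		dsb(ish);
	}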

Then, you should remember that the tracking must be very efficient, since even
a few cache misses might cost more than what you save by selective flushing.
Finally, on x86 you would need to invoke the smp/IPI layer multiple times to
send different cores the relevant range they need to flush.
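
Illustratively, something like the following would be needed instead of
today's single broadcast (the ranges[] bookkeeping is made up; flush_tlb_info
and on_each_cpu_mask() are real, but the actual code builds its info via
get_flush_tlb_info() and sets more fields than shown here):

	/* Hypothetical: one IPI round per distinct (cpumask, range) pair. */
	int i;

	for (i = 0; i < nr_ranges; i++) {
		struct flush_tlb_info info = {
			.mm	= ranges[i].mm,
			.start	= ranges[i].start,
			.end	= ranges[i].end,
		};

		on_each_cpu_mask(ranges[i].cpus, flush_tlb_func, &info, true);
	}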

IOW: it is somewhat complicated to implement efficiently, and on x86, and
probably on other IPI-based TLB shootdown systems, it does not have a clear
performance benefit (IMHO).
