Message-ID: <2AB9EC05-16B4-46F8-B716-53941C1C9A50@vmware.com>
Date: Thu, 15 Sep 2022 14:31:45 +0000
From: Nadav Amit <namit@...are.com>
To: Barry Song <21cnbao@...il.com>
CC: Anshuman Khandual <anshuman.khandual@....com>,
Yicong Yang <yangyicong@...wei.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux MM <linux-mm@...ck.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"x86@...nel.org" <x86@...nel.org>,
"catalin.marinas@....com" <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"corbet@....net" <corbet@....net>,
"peterz@...radead.org" <peterz@...radead.org>,
"arnd@...db.de" <arnd@...db.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"darren@...amperecomputing.com" <darren@...amperecomputing.com>,
"yangyicong@...ilicon.com" <yangyicong@...ilicon.com>,
"huzhanyuan@...o.com" <huzhanyuan@...o.com>,
"lipeifeng@...o.com" <lipeifeng@...o.com>,
"zhangshiming@...o.com" <zhangshiming@...o.com>,
"guojian@...o.com" <guojian@...o.com>,
"realmz6@...il.com" <realmz6@...il.com>,
"linux-mips@...r.kernel.org" <linux-mips@...r.kernel.org>,
"openrisc@...ts.librecores.org" <openrisc@...ts.librecores.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
"wangkefeng.wang@...wei.com" <wangkefeng.wang@...wei.com>,
"xhao@...ux.alibaba.com" <xhao@...ux.alibaba.com>,
"prime.zeng@...ilicon.com" <prime.zeng@...ilicon.com>,
Barry Song <v-songbaohua@...o.com>,
Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH v3 4/4] arm64: support batched/deferred tlb shootdown
during page reclamation
> On Sep 14, 2022, at 11:42 PM, Barry Song <21cnbao@...il.com> wrote:
>
>>
>> The very idea behind TLB deferral is the opportunity it (might) provide
>> to accumulate address ranges and cpu masks so that individual TLB flush
>> can be replaced with a more cost effective range based TLB flush. Hence
>> I guess unless address range or cpumask based cost effective TLB flush
>> is available, deferral does not improve the unmap performance as much.
>
>
> After sending a TLBI, if we wait for its completion, we have to get an Ack
> from all CPUs in the system, so TLBI is not scalable. The point here is that
> we avoid waiting for each individual TLBI. Instead, they are batched. If
> you read the benchmark in the commit log, you can see a large drop
> in the cost of swapping out a page.
Just a minor correction: arch_tlbbatch_flush() does not collect ranges.
On x86 it only accumulates a CPU mask.
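
To illustrate the distinction, here is a minimal user-space sketch of a
batch that accumulates only a CPU mask and no address ranges. It is not
the kernel code: struct tlb_batch_model, batch_add_mm(), batch_flush()
and MODEL_NR_CPUS are hypothetical names made up for this example, and
the model assumes that every CPU in the mask is then asked for a full
flush rather than a ranged one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; these are not the kernel's structures. */
#define MODEL_NR_CPUS 64

struct tlb_batch_model {
	uint64_t cpumask;	/* one bit per CPU that may hold stale entries */
	bool flush_required;	/* set once anything has been queued */
	/* Deliberately no start/end fields: no address ranges are collected. */
};

/* Queue a deferred flush for an mm that has run on the CPUs in mm_cpumask. */
static void batch_add_mm(struct tlb_batch_model *b, uint64_t mm_cpumask)
{
	assert(b != NULL);
	b->cpumask |= mm_cpumask;	/* the union of CPUs is all we remember */
	b->flush_required = true;
}

/*
 * Drain the batch. Returns how many CPUs would be targeted. Since no
 * range was recorded, the model assumes each of them does a full flush.
 */
static int batch_flush(struct tlb_batch_model *b)
{
	int targeted = 0;

	assert(b != NULL);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (b->cpumask & (UINT64_C(1) << cpu))
			targeted++;
	}
	b->cpumask = 0;
	b->flush_required = false;
	return targeted;
}
```

In this model, queueing two mms with masks 0x3 and 0x6 leaves a single
mask of 0x7, so one flush targets three CPUs. The saving comes from
merging the CPU sets and flushing once, not from building a cheaper
range-based flush.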