Message-ID: <aEHKu-0g6_MBiAST@google.com>
Date: Thu, 5 Jun 2025 09:50:03 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Rik van Riel <riel@...riel.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com,
dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
bp@...en8.de, x86@...nel.org, nadav.amit@...il.com, tglx@...utronix.de
Subject: Re: [RFC PATCH v3 0/7] Intel RAR TLB invalidation
On Thu, Jun 05, 2025, Rik van Riel wrote:
> This patch series adds support for IPI-less TLB invalidation
> using Intel RAR technology.
>
> Intel RAR differs from AMD INVLPGB in a few ways:
> - RAR goes through (emulated?) APIC writes, not instructions
> - RAR flushes go through a memory table with 64 entries
> - RAR flushes can be targeted to a cpumask
> - The RAR functionality must be set up at boot time before it can be used
>
> The cpumask targeting has resulted in Intel RAR and AMD INVLPGB having
> slightly different rules:
> - Processes with dynamic ASIDs use IPI based shootdowns
> - INVLPGB: processes with a global ASID
>   - always have the TLB up to date, on every CPU
>   - never need to flush the TLB at context switch time
> - RAR: processes with global ASIDs
>   - have the TLB up to date on CPUs in the mm_cpumask
>   - can skip a TLB flush at context switch time if the CPU is in the mm_cpumask
>   - need to flush the TLB when scheduled on a cpu not in the mm_cpumask,
>     in case it used to run there before and the TLB has stale entries
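
For readers skimming the rules above, the global-ASID context switch decision could be modeled roughly as below. This is only a sketch of the rules as quoted, not code from the series: the names flush_hw, HW_INVLPGB, HW_RAR and global_asid_switch_needs_flush() are made up for illustration, and mm_cpumask is simplified to a 64-bit mask. Processes with dynamic ASIDs are deliberately left out, since the quoted text only says they use IPI based shootdowns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the quoted rules; none of these names exist in the kernel. */
enum flush_hw {
	HW_INVLPGB,	/* AMD broadcast invalidation */
	HW_RAR,		/* Intel Remote Action Request */
};

/*
 * Decide whether a process that already has a global ASID needs a TLB
 * flush when it is switched in on @cpu.
 *
 * @mm_cpumask: simplified stand-in for the real mm_cpumask, bit n set
 *              means CPU n is in the mask (assumes at most 64 CPUs).
 */
static bool global_asid_switch_needs_flush(enum flush_hw hw,
					   uint64_t mm_cpumask,
					   unsigned int cpu)
{
	/* INVLPGB: the TLB is kept up to date on every CPU, never flush. */
	if (hw == HW_INVLPGB)
		return false;

	/* CPUs beyond the simplified mask are treated as "not in the mask". */
	if (cpu >= 64)
		return true;

	/*
	 * RAR: flushes only target CPUs in mm_cpumask, so a CPU outside the
	 * mask may hold stale entries from an earlier run and must flush.
	 */
	return !(mm_cpumask & (UINT64_C(1) << cpu));
}
```

With mm_cpumask = 0x5 (CPUs 0 and 2), this model skips the flush on CPU 0 under RAR, requires one on CPU 1 under RAR, and never requires one under INVLPGB.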
>
> RAR functionality is present on Sapphire Rapids and newer CPUs.
>
> Information about Intel RAR can be found in this whitepaper.
>
> https://www.intel.com/content/dam/develop/external/us/en/documents/341431-remote-action-request-white-paper.pdf
>
> This patch series is based off a 2019 patch series created by Intel
Do you have performance numbers? IIRC, that 2019 series never went anywhere
because RAR wasn't a win for bare metal. Though I believe it _was_ a win for
KVM, due to the cost of an IPI being significantly higher in a guest (requires
a VM-Exit => VM-Enter roundtrip).
> with patches later in the series modified to fit into
> the TLB flush code structure we have after AMD INVLPGB functionality
> was integrated.
Please provide diff stats in the cover letter, e.g. so that sub-maintainers like
myself don't have to scroll through every patch to figure out whether or not there's
something that requires their review/ack.