Message-ID: <xhsmh34gkk3ls.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Tue, 11 Feb 2025 14:33:51 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Jann Horn <jannh@...gle.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
virtualization@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
loongarch@...ts.linux.dev, linux-riscv@...ts.infradead.org,
linux-perf-users@...r.kernel.org, xen-devel@...ts.xenproject.org,
kvm@...r.kernel.org, linux-arch@...r.kernel.org, rcu@...r.kernel.org,
linux-hardening@...r.kernel.org, linux-mm@...ck.org,
linux-kselftest@...r.kernel.org, bpf@...r.kernel.org,
bcm-kernel-feedback-list@...adcom.com, Juergen Gross <jgross@...e.com>,
Ajay Kaher <ajay.kaher@...adcom.com>, Alexey Makhalov
<alexey.amakhalov@...adcom.com>, Russell King <linux@...linux.org.uk>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>, Paul
Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave
Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>, Arnaldo Carvalho de Melo
<acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Mark Rutland
<mark.rutland@....com>, Alexander Shishkin
<alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, Ian
Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
"Liang, Kan" <kan.liang@...ux.intel.com>, Boris Ostrovsky
<boris.ostrovsky@...cle.com>, Josh Poimboeuf <jpoimboe@...nel.org>, Pawan
Gupta <pawan.kumar.gupta@...ux.intel.com>, Sean Christopherson
<seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, Andy Lutomirski
<luto@...nel.org>, Arnd Bergmann <arnd@...db.de>, Frederic Weisbecker
<frederic@...nel.org>, "Paul E. McKenney" <paulmck@...nel.org>, Jason
Baron <jbaron@...mai.com>, Steven Rostedt <rostedt@...dmis.org>, Ard
Biesheuvel <ardb@...nel.org>, Neeraj Upadhyay
<neeraj.upadhyay@...nel.org>, Joel Fernandes <joel@...lfernandes.org>,
Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, Juri Lelli <juri.lelli@...hat.com>,
Clark Williams <williams@...hat.com>, Yair Podemsky <ypodemsk@...hat.com>,
Tomas Glozar <tglozar@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, Kees Cook
<kees@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>, Christoph
Hellwig <hch@...radead.org>, Shuah Khan <shuah@...nel.org>, Sami Tolvanen
<samitolvanen@...gle.com>, Miguel Ojeda <ojeda@...nel.org>, Alice Ryhl
<aliceryhl@...gle.com>, "Mike Rapoport (Microsoft)" <rppt@...nel.org>,
Samuel Holland <samuel.holland@...ive.com>, Rong Xu <xur@...gle.com>,
Nicolas Saenz Julienne <nsaenzju@...hat.com>, Geert Uytterhoeven
<geert@...ux-m68k.org>, Yosry Ahmed <yosryahmed@...gle.com>, "Kirill A.
Shutemov" <kirill.shutemov@...ux.intel.com>, "Masami Hiramatsu (Google)"
<mhiramat@...nel.org>, Jinghao Jia <jinghao7@...inois.edu>, Luis
Chamberlain <mcgrof@...nel.org>, Randy Dunlap <rdunlap@...radead.org>,
Tiezhu Yang <yangtiezhu@...ngson.cn>
Subject: Re: [PATCH v4 29/30] x86/mm, mm/vmalloc: Defer flush_tlb_kernel_range() targeting NOHZ_FULL CPUs

On 10/02/25 23:08, Jann Horn wrote:
> On Mon, Feb 10, 2025 at 7:36 PM Valentin Schneider <vschneid@...hat.com> wrote:
>> What if isolated CPUs unconditionally did a TLBi as late as possible in
>> the stack right before returning to userspace? This would mean that upon
>> re-entering the kernel, an isolated CPU's TLB wouldn't contain any kernel
>> range translation - with the exception of whatever lies between the
>> last-minute flush and the actual userspace entry, which should be feasible
>> to vet? Then AFAICT there wouldn't be any work/flush to defer, the IPI
>> could be entirely silenced if it targets an isolated CPU.
>
> Two issues with that:
>

Firstly, thank you for entertaining the idea :-)

> 1. I think the "Common not Private" feature Will Deacon referred to is
> incompatible with this idea:
> <https://developer.arm.com/documentation/101811/0104/Address-spaces/Common-not-Private>
> says "When the CnP bit is set, the software promises to use the ASIDs
> and VMIDs in the same way on all processors, which allows the TLB
> entries that are created by one processor to be used by another"
>

Sorry for being obtuse - I can see how inconsistent TLB states (old vs new
translations being present in separate TLBs), caused by not sending the
flush IPI, would be an issue with CnP, but not how "flushing early" would
be. Even if TLB entries can be shared/accessed between CPUs, a CPU should
be allowed not to have a shared entry in its TLB - what am I missing?

> 2. It's wrong to assume that TLB entries are only populated for
> addresses you access - thanks to speculative execution, you have to
> assume that the CPU might be populating random TLB entries all over
> the place.

Gotta love speculation. Now it is supposed to be limited to genuinely
accessible data & code, right? Say, theoretically, we do a full TLBi as
literally the last thing before returning to userspace; speculation should
then be limited to maybe executing bits of the return-to-userspace code?

Furthermore, I would hope that once a CPU is executing in userspace, it's
not going to populate the TLB with kernel address translations - AIUI the
whole vulnerability mitigation debacle was about preventing this sort of
thing.
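
For illustration only, the idea quoted at the top of this mail can be sketched as a toy user-space model in C. This is not kernel code and is not taken from the patch series: every name below (cpu_model, return_to_user(), flush_kernel_range_model(), ...) is hypothetical, the TLB is reduced to one flag per page, and the model assumes the IPI is only skipped while the isolated CPU is actually in userspace. It deliberately does not model speculative TLB fills or CnP sharing, which are exactly the two objections raised above.

```c
/* Toy model of "flush on exit to userspace, skip the IPI". NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS  4
#define NR_PAGES 8 /* size of the modelled kernel address range, in pages */

struct cpu_model {
	bool isolated;      /* stands in for a NOHZ_FULL CPU (assumption) */
	bool in_user;       /* true while the CPU runs userspace */
	bool tlb[NR_PAGES]; /* true: a kernel translation for this page is cached */
};

static struct cpu_model cpus[NR_CPUS];
static unsigned int ipis_sent;

static void local_flush_all(struct cpu_model *c)
{
	for (size_t page = 0; page < NR_PAGES; page++)
		c->tlb[page] = false;
}

/* A kernel-mode access caches the translation for @page on @cpu. */
static void kernel_access(int cpu, size_t page)
{
	assert(cpu >= 0 && cpu < NR_CPUS && page < NR_PAGES);
	assert(!cpus[cpu].in_user);
	cpus[cpu].tlb[page] = true;
}

static bool tlb_has(int cpu, size_t page)
{
	assert(cpu >= 0 && cpu < NR_CPUS && page < NR_PAGES);
	return cpus[cpu].tlb[page];
}

/* Isolated CPUs unconditionally flush as the last step before userspace. */
static void return_to_user(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cpus[cpu].isolated)
		local_flush_all(&cpus[cpu]);
	cpus[cpu].in_user = true;
}

static void enter_kernel(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].in_user = false;
}

/*
 * Models a kernel range flush over pages [start, end): every CPU gets an
 * "IPI" and drops the range, except isolated CPUs currently in userspace,
 * whose TLB is already empty thanks to return_to_user().
 */
static void flush_kernel_range_model(size_t start, size_t end)
{
	assert(start <= end && end <= NR_PAGES);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpus[cpu].isolated && cpus[cpu].in_user)
			continue; /* IPI silenced, nothing deferred */
		ipis_sent++;
		for (size_t page = start; page < end; page++)
			cpus[cpu].tlb[page] = false;
	}
}
```

In this model, marking CPU 1 as isolated, letting it touch a page, and then calling return_to_user(1) leaves its TLB empty, so a later flush_kernel_range_model() only counts IPIs for the other CPUs, and CPU 1 holds no stale entry when it re-enters the kernel. Whether real hardware upholds the "TLB stays empty while in userspace" assumption is precisely what is being asked in this thread.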