Message-ID: <xhsmh34gkk3ls.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Tue, 11 Feb 2025 14:33:51 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Jann Horn <jannh@...gle.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
 virtualization@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
 loongarch@...ts.linux.dev, linux-riscv@...ts.infradead.org,
 linux-perf-users@...r.kernel.org, xen-devel@...ts.xenproject.org,
 kvm@...r.kernel.org, linux-arch@...r.kernel.org, rcu@...r.kernel.org,
 linux-hardening@...r.kernel.org, linux-mm@...ck.org,
 linux-kselftest@...r.kernel.org, bpf@...r.kernel.org,
 bcm-kernel-feedback-list@...adcom.com, Juergen Gross <jgross@...e.com>,
 Ajay Kaher <ajay.kaher@...adcom.com>, Alexey Makhalov
 <alexey.amakhalov@...adcom.com>, Russell King <linux@...linux.org.uk>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>, Paul
 Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt <palmer@...belt.com>,
 Albert Ou <aou@...s.berkeley.edu>, Thomas Gleixner <tglx@...utronix.de>,
 Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave
 Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>,
 Peter Zijlstra <peterz@...radead.org>, Arnaldo Carvalho de Melo
 <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Mark Rutland
 <mark.rutland@....com>, Alexander Shishkin
 <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, Ian
 Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
 "Liang, Kan" <kan.liang@...ux.intel.com>, Boris Ostrovsky
 <boris.ostrovsky@...cle.com>, Josh Poimboeuf <jpoimboe@...nel.org>, Pawan
 Gupta <pawan.kumar.gupta@...ux.intel.com>, Sean Christopherson
 <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, Andy Lutomirski
 <luto@...nel.org>, Arnd Bergmann <arnd@...db.de>, Frederic Weisbecker
 <frederic@...nel.org>, "Paul E. McKenney" <paulmck@...nel.org>, Jason
 Baron <jbaron@...mai.com>, Steven Rostedt <rostedt@...dmis.org>, Ard
 Biesheuvel <ardb@...nel.org>, Neeraj Upadhyay
 <neeraj.upadhyay@...nel.org>, Joel Fernandes <joel@...lfernandes.org>,
 Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>,
 Uladzislau Rezki <urezki@...il.com>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>,
 Zqiang <qiang.zhang1211@...il.com>, Juri Lelli <juri.lelli@...hat.com>,
 Clark Williams <williams@...hat.com>, Yair Podemsky <ypodemsk@...hat.com>,
 Tomas Glozar <tglozar@...hat.com>, Vincent Guittot
 <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
 Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, Kees Cook
 <kees@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>, Christoph
 Hellwig <hch@...radead.org>, Shuah Khan <shuah@...nel.org>, Sami Tolvanen
 <samitolvanen@...gle.com>, Miguel Ojeda <ojeda@...nel.org>, Alice Ryhl
 <aliceryhl@...gle.com>, "Mike Rapoport (Microsoft)" <rppt@...nel.org>,
 Samuel Holland <samuel.holland@...ive.com>, Rong Xu <xur@...gle.com>,
 Nicolas Saenz Julienne <nsaenzju@...hat.com>, Geert Uytterhoeven
 <geert@...ux-m68k.org>, Yosry Ahmed <yosryahmed@...gle.com>, "Kirill A.
 Shutemov" <kirill.shutemov@...ux.intel.com>, "Masami Hiramatsu (Google)"
 <mhiramat@...nel.org>, Jinghao Jia <jinghao7@...inois.edu>, Luis
 Chamberlain <mcgrof@...nel.org>, Randy Dunlap <rdunlap@...radead.org>,
 Tiezhu Yang <yangtiezhu@...ngson.cn>
Subject: Re: [PATCH v4 29/30] x86/mm, mm/vmalloc: Defer
 flush_tlb_kernel_range() targeting NOHZ_FULL CPUs

On 10/02/25 23:08, Jann Horn wrote:
> On Mon, Feb 10, 2025 at 7:36 PM Valentin Schneider <vschneid@...hat.com> wrote:
>> What if isolated CPUs unconditionally did a TLBi as late as possible in
>> the stack right before returning to userspace? This would mean that upon
>> re-entering the kernel, an isolated CPU's TLB wouldn't contain any kernel
>> range translation - with the exception of whatever lies between the
>> last-minute flush and the actual userspace entry, which should be feasible
>> to vet? Then AFAICT there wouldn't be any work/flush to defer, the IPI
>> could be entirely silenced if it targets an isolated CPU.
>
> Two issues with that:
>

Firstly, thank you for entertaining the idea :-)

> 1. I think the "Common not Private" feature Will Deacon referred to is
> incompatible with this idea:
> <https://developer.arm.com/documentation/101811/0104/Address-spaces/Common-not-Private>
> says "When the CnP bit is set, the software promises to use the ASIDs
> and VMIDs in the same way on all processors, which allows the TLB
> entries that are created by one processor to be used by another"
>


Sorry for being obtuse - I can understand inconsistent TLB states (old vs
new translations being present in separate TLBs) due to not sending the
flush IPI causing an issue with that, but not "flushing early". Even if TLB
entries can be shared/accessed between CPUs, a CPU should be allowed not to
have a shared entry in its TLB - what am I missing?

> 2. It's wrong to assume that TLB entries are only populated for
> addresses you access - thanks to speculative execution, you have to
> assume that the CPU might be populating random TLB entries all over
> the place.

Gotta love speculation. Now it is supposed to be limited to genuinely
accessible data & code, right? Say theoretically we have a full TLBi as
literally the last thing before doing the return-to-userspace, speculation
should be limited to executing maybe bits of the return-to-userspace
code?

Furthermore, I would hope that once a CPU is executing in userspace, it's
not going to populate the TLB with kernel address translations - AIUI the
whole vulnerability mitigation debacle was about preventing this sort of
thing.
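
To make the idea and objection 2 concrete, here is a toy userspace model,
not kernel code. Every name in it (cpu_model, model_exit_to_user(),
model_flush_kernel_range(), ...) is made up for illustration, a "flush" is
simplified to dropping all kernel-range translations, and the "speculative
fill" step is an assumption injected by hand to show what would go wrong if
an isolated CPU could gain kernel translations while in userspace.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state; none of this mirrors real kernel structures. */
struct cpu_model {
	bool isolated;     /* stands in for a NOHZ_FULL CPU */
	bool in_user;      /* currently executing userspace */
	bool kernel_tlb;   /* TLB may hold kernel-range translations */
	bool stale;        /* a kernel-range flush was skipped while kernel_tlb was set */
	unsigned int ipis; /* flush IPIs received */
};

static struct cpu_model cpus[NR_CPUS];

/* Running kernel code is assumed to populate kernel-range translations. */
static void model_enter_kernel(int cpu)
{
	cpus[cpu].in_user = false;
	cpus[cpu].kernel_tlb = true;
}

/* Isolated CPUs do an unconditional last-minute flush before userspace. */
static void model_exit_to_user(int cpu)
{
	if (cpus[cpu].isolated)
		cpus[cpu].kernel_tlb = false;
	cpus[cpu].in_user = true;
}

/* Kernel-range flush: the IPI is silenced for isolated CPUs in userspace. */
static void model_flush_kernel_range(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct cpu_model *c = &cpus[cpu];

		if (c->isolated && c->in_user) {
			/* Only safe if the TLB really holds no kernel entries. */
			if (c->kernel_tlb)
				c->stale = true;
			continue;
		}
		c->ipis++;
		c->kernel_tlb = false;
		c->stale = false;
	}
}

void model_demo(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpus[cpu] = (struct cpu_model){ .isolated = (cpu == 3) };
		model_enter_kernel(cpu);
		model_exit_to_user(cpu);
	}

	/* Best case: the isolated CPU is skipped and nothing is stale. */
	model_flush_kernel_range();
	assert(cpus[0].ipis == 1 && cpus[3].ipis == 0 && !cpus[3].stale);

	/* Assumed speculative fill while in userspace (objection 2). */
	cpus[3].kernel_tlb = true;
	model_flush_kernel_range();
	assert(cpus[3].ipis == 0 && cpus[3].stale);
}
```

Calling model_demo() from a main() exercises both cases. Whether the second
case can actually happen on real hardware once the CPU is in userspace is
exactly the open question above; the model only shows that silencing the
IPI is sound if and only if it cannot.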

