linux-kernel - Re: [PATCH bpf-next v3 08/11] bpf: Fall back to nospec for Spectre v1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP01T76kOixPct5cOPHGKubFWSbSS7ztEnZc02v2wWGPOUYRCQ@mail.gmail.com>
Date: Fri, 2 May 2025 01:55:54 +0200
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Luis Gerhorst <luis.gerhorst@....de>
Cc: Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, 
	Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
	Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>, 
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, 
	Jiri Olsa <jolsa@...nel.org>, Puranjay Mohan <puranjay@...nel.org>, 
	Xu Kuohai <xukuohai@...weicloud.com>, Catalin Marinas <catalin.marinas@....com>, 
	Will Deacon <will@...nel.org>, Hari Bathini <hbathini@...ux.ibm.com>, 
	Christophe Leroy <christophe.leroy@...roup.eu>, Naveen N Rao <naveen@...nel.org>, 
	Madhavan Srinivasan <maddy@...ux.ibm.com>, Michael Ellerman <mpe@...erman.id.au>, 
	Nicholas Piggin <npiggin@...il.com>, Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>, 
	Henriette Herzog <henriette.herzog@....de>, Saket Kumar Bhaskar <skb99@...ux.ibm.com>, 
	Cupertino Miranda <cupertino.miranda@...cle.com>, Jiayuan Chen <mrpre@....com>, 
	Matan Shachnai <m.shachnai@...il.com>, Dimitar Kanaliev <dimitar.kanaliev@...eground.com>, 
	Shung-Hsi Yu <shung-hsi.yu@...e.com>, Daniel Xu <dxu@...uu.xyz>, bpf@...r.kernel.org, 
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, 
	linuxppc-dev@...ts.ozlabs.org, linux-kselftest@...r.kernel.org, 
	Maximilian Ott <ott@...fau.de>, Milan Stephan <milan.stephan@....de>
Subject: Re: [PATCH bpf-next v3 08/11] bpf: Fall back to nospec for Spectre v1

On Thu, 1 May 2025 at 10:00, Luis Gerhorst <luis.gerhorst@....de> wrote:
>
> This implements the core of the series and causes the verifier to fall
> back to mitigating Spectre v1 using speculation barriers. The approach
> was presented at LPC'24 [1] and RAID'24 [2].
>
> If we find any forbidden behavior on a speculative path, we insert a
> nospec (e.g., lfence speculation barrier on x86) before the instruction
> and stop verifying the path. While verifying a speculative path, we can
> furthermore stop verification of that path whenever we encounter a
> nospec instruction.
>
> A minimal example program would look as follows:
>
>         A = true
>         B = true
>         if A goto e
>         f()
>         if B goto e
>         unsafe()
> e:      exit
>
> There are the following speculative and non-speculative paths
> (`cur->speculative` and `speculative` referring to the value of the
> push_stack() parameters):
>
> - A = true
> - B = true
> - if A goto e
>   - A && !cur->speculative && !speculative
>     - exit
>   - !A && !cur->speculative && speculative
>     - f()
>     - if B goto e
>       - B && cur->speculative && !speculative
>         - exit
>       - !B && cur->speculative && speculative
>         - unsafe()
>
> If f() contains any unsafe behavior under Spectre v1 and the unsafe
> behavior matches `state->speculative &&
> error_recoverable_with_nospec(err)`, do_check() will now add a nospec
> before f() instead of rejecting the program:
>
>         A = true
>         B = true
>         if A goto e
>         nospec
>         f()
>         if B goto e
>         unsafe()
> e:      exit
>
> Alternatively, the algorithm also takes advantage of nospec instructions
> inserted for other reasons (e.g., Spectre v4). Taking the program above
> as an example, speculative path exploration can stop before f() if a
> nospec was inserted there because of Spectre v4 sanitization.
>
> In this example, all instructions after the nospec are dead code (and
> with the nospec they are also dead code speculatively).
>
> On x86_64, this depends on the following property of lfence [3]:
>
>         An LFENCE instruction or a serializing instruction will ensure that no
>         later instructions execute, even speculatively, until all prior
>         instructions complete locally. [...] Inserting an LFENCE instruction
>         after a bounds check prevents later operations from executing before
>         the bound check completes.
>
> Regarding the example, this implies that `if B goto e` will not execute
> before `if A goto e` completes. Once `if A goto e` completes, the CPU
> should find that the speculation was wrong and continue with `exit`.
>
> If there is any other path that leads to `if B goto e` (and therefore
> `unsafe()`) without going through `if A goto e`, then a nospec will
> still be needed there. However, this patch assumes this other path will
> be explored separately and therefore be discovered by the verifier even
> if the exploration discussed here stops at the nospec.
>
> This patch furthermore has the unfortunate consequence that Spectre v1
> mitigations now only support architectures which implement BPF_NOSPEC.
> Before this commit, Spectre v1 mitigations prevented exploits by
> rejecting the programs on all architectures. Because some JITs do not
> implement BPF_NOSPEC, this patch therefore may regress unpriv BPF's
> security to a limited extent:
>
> * The regression is limited to systems vulnerable to Spectre v1, have
>   unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The
>   latter is not the case for x86 64- and 32-bit, arm64, and powerpc
>   64-bit and they are therefore not affected by the regression.
>   According to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip
>   speculation barrier opcode"), LoongArch is not vulnerable to Spectre
>   v1 and therefore also not affected by the regression.
>
> * To the best of my knowledge this regression may therefore only affect
>   MIPS. This is deemed acceptable because unpriv BPF is still disabled
>   there by default. As stated in a previous commit, BPF_NOSPEC could be
>   implemented for MIPS based on GCC's speculation_barrier
>   implementation.
>
> * It is unclear which other architectures (besides x86 64- and 32-bit,
>   ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel
>   are vulnerable to Spectre v1. Also, it is not clear if barriers are
>   available on these architectures. Implementing BPF_NOSPEC on these
>   architectures therefore is non-trivial. Searching GCC and the kernel
>   for speculation barrier implementations for these architectures
>   yielded no result.
>
> * If any of those regressed systems is also vulnerable to Spectre v4,
>   the system was already vulnerable to Spectre v4 attacks based on
>   unpriv BPF before this patch and the impact is therefore further
>   limited.
>
> As an alternative to regressing security, one could still reject
> programs if the architecture does not emit BPF_NOSPEC (e.g., by removing
> the empty BPF_NOSPEC-case from all JITs except for LoongArch where it
> appears justified). However, this will cause rejections on these archs
> that are likely unfounded in the vast majority of cases.
>
> In the tests, some are now successful where we previously had a
> false-positive (i.e., rejection). Change them to reflect where the
> nospec should be inserted (using __xlated_unpriv) and modify the error
> message if the nospec is able to mitigate a problem that previously
> shadowed another problem (in that case __xlated_unpriv does not work,
> therefore just add a comment).
>
> Define SPEC_V1 to avoid duplicating this ifdef whenever we check for
> nospec insns using __xlated_unpriv, define it here once. This also
> improves readability. PowerPC can probably also be added here. However,
> omit it for now because the BPF CI currently does not include a test.
>
> Briefly went through all the occurrences of EPERM, EINVAL, and EACCESS
> in the verifier in order to validate that catching them like this makes
> sense.
>
> [1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
>     Spectre-PHT using Speculation Barriers in Linux eBPF")
> [2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
>     Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
> [3] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/runtime-speculative-side-channel-mitigations.html
>     ("Managed Runtime Speculative Execution Side Channel Mitigations")
>
> Signed-off-by: Luis Gerhorst <luis.gerhorst@....de>
> Acked-by: Henriette Herzog <henriette.herzog@....de>
> Cc: Maximilian Ott <ott@...fau.de>
> Cc: Milan Stephan <milan.stephan@....de>
> ---

The patches from here on look good in general, I will ack after taking
another look later.

I had a more high-level question though.
Back when all of this surfaced, compiler folks came up with another
solution, to rely on Intel's guarantee that conditional moves are not
predicted.

if (condition) {
   mask = !condition ? 0UL : ~0UL; // CMOVcc
   ptr &= mask;
   x = *ptr;
}

In case the condition being true in the speculative domain leads to
problems, the speculative domain will just read from NULL and not leak
sensitive data.
The assumption is that cost of instrumentation in speculative domain <
completely stalling it until prior instructions are done using lfence.
So speculation is still helpful when the branch is not mispredicted.
Now I imagine it's not fun to do such analysis in the verifier (I've
tried), but theoretically we could break it down into emitting
bytecode from the compiler side, and lifting the compiler to do it for
us, and ensuring the end result produced is sane (by still following
speculative paths) from the verifier.

You talk about this in the paper and in the presentation as future work.
My question is mainly whether you considered implementing this, if
yes, what made you choose a nospec barrier over something like above?
Was it the complexity of realizing this during verification?
Are there any implications of reading from NULL that would cause problems?
Also, did you characterize how much difference it could make?
The drop in SCTP throughput seems to suggest so, since CPU-bound
computation was moved into the program.
Otherwise most programs mostly defer to helpers for heavy lifting.
Not that it was as fast as a helper would be, even without nospec, but still.

Also a bit sad we don't split the program into BBs already, which
could help reduce your mitigation's cost + plus also reduce cost of
instruction patching (unrelated).

Anyway, all that said, this is valuable stuff, so I was just curious.