linux-kernel - Re: [PATCH bpf-next v3 11/11] bpf: Fall back to nospec for sanitization-failures

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87ecwr14mt.fsf@fau.de>
Date: Wed, 14 May 2025 19:30:50 +0200
From: Luis Gerhorst <luis.gerhorst@....de>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>,  Daniel Borkmann
 <daniel@...earbox.net>,  Andrii Nakryiko <andrii@...nel.org>,  Martin
 KaFai Lau <martin.lau@...ux.dev>,  Eduard Zingerman <eddyz87@...il.com>,
  Song Liu <song@...nel.org>,  Yonghong Song <yonghong.song@...ux.dev>,
  John Fastabend <john.fastabend@...il.com>,  KP Singh
 <kpsingh@...nel.org>,  Stanislav Fomichev <sdf@...ichev.me>,  Hao Luo
 <haoluo@...gle.com>,  Jiri Olsa <jolsa@...nel.org>,  Puranjay Mohan
 <puranjay@...nel.org>,  Xu Kuohai <xukuohai@...weicloud.com>,  Catalin
 Marinas <catalin.marinas@....com>,  Will Deacon <will@...nel.org>,  Hari
 Bathini <hbathini@...ux.ibm.com>,  Christophe Leroy
 <christophe.leroy@...roup.eu>,  Naveen N Rao <naveen@...nel.org>,
  Madhavan Srinivasan <maddy@...ux.ibm.com>,  Michael Ellerman
 <mpe@...erman.id.au>,  Nicholas Piggin <npiggin@...il.com>,  Mykola
 Lysenko <mykolal@...com>,  Shuah Khan <shuah@...nel.org>,  Henriette
 Herzog <henriette.herzog@....de>,  Saket Kumar Bhaskar
 <skb99@...ux.ibm.com>,  Cupertino Miranda <cupertino.miranda@...cle.com>,
  Jiayuan Chen <mrpre@....com>,  Matan Shachnai <m.shachnai@...il.com>,
  Dimitar Kanaliev <dimitar.kanaliev@...eground.com>,  Shung-Hsi Yu
 <shung-hsi.yu@...e.com>,  Daniel Xu <dxu@...uu.xyz>,  bpf@...r.kernel.org,
  linux-arm-kernel@...ts.infradead.org,  linux-kernel@...r.kernel.org,
  linuxppc-dev@...ts.ozlabs.org,  linux-kselftest@...r.kernel.org,
  Maximilian Ott <ott@...fau.de>,  Milan Stephan <milan.stephan@....de>
Subject: Re: [PATCH bpf-next v3 11/11] bpf: Fall back to nospec for
 sanitization-failures

Kumar Kartikeya Dwivedi <memxor@...il.com> writes:

(including relevant part from other message)

> On Thu, 1 May 2025 at 04:00, Luis Gerhorst <luis.gerhorst@....de> wrote:
> 
>> +static bool error_recoverable_with_nospec(int err)
>> +{
>> +       /* Should only return true for non-fatal errors that are allowed to
>> +        * occur during speculative verification. For these we can insert a
>> +        * nospec and the program might still be accepted. Do not include
>> +        * something like ENOMEM because it is likely to re-occur for the next
>> +        * architectural path once it has been recovered-from in all speculative
>> +        * paths.
>> +        */
>> +       return err == -EPERM || err == -EACCES || err == -EINVAL;
>> +}
> 
> Why can't we unconditionally do this? So the path with speculation
> that encounters an error (even if EFAULT) is not explored for the
> remaining pushed speculative states. If the error remains regardless
> of speculation normal symbolic execution will encounter it. The
> instructions only explored as part of speculative execution are not
> marked as seen (see: sanitize_mark_insn_seen), so they'll be dead code
> eliminated and the code doesn't reach the JIT, so no "unsafe gadget"
> remains in the program where execution can be steered.
> 
> So the simplest thing (without having to reason about these three
> error codes, I'm sure things will get out of sync or we'll miss
> potential candidates) is probably to just unconditionally mark
> cur_aux(env)->nospec.

[...]

> Hm, now looking at this and thinking more about this, I think
> recoverable error logic is probably ok as is.
> Scratch my earlier suggestion about unconditional handling. I guess
> what would be better would be
> handling everything except fatal ones. In case of fatal ones we should
> really quit verification and return.
> We may make partial changes to verifier state / env and try to bail
> out using -ENOMEM and -EFAULT.
> So unconditional continuation would be problematic as we'd act in a
> partial state never meant to be seen.
>
> The logic otherwise looks ok, so:
>
> Acked-by: Kumar Kartikeya Dwivedi <memxor@...il.com>

Thank you very much for having a look, so then I will leave it as is if
I understand you correctly.

Please let me know if "what would be better would be handling everything
except fatal ones" was meant literally. Such a deny-list approach would
likely be:

  return err != -ENOMEM && err != -EFAULT;

I initially decided to limit it to -EPERM, -EACCES, and -EINVAL as I was
relatively confident all their cases were safe to "catch" and because it
already had the desired effect for most real-world programs. However, if
you find the deny-list approach easier to reason about, I can also do
that. In that case, I will review the remaining errors (besides -EPERM,
-EACCES, and -EINVAL) and make sure they can be caught.

Also, thanks for the pointer regarding sanitize_check_bounds() (sorry
for the delay; the message is still on my to-do list). I will address it
in v4 if it is safe or send a separate fix if it is indeed a bug.