linux-kernel - Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity

Message-ID: <4ae6fd0d54ff2650d0f6724fb44b33723e26ea49.camel@gmail.com>
Date: Mon, 07 Jul 2025 17:57:32 -0700
From: Eduard Zingerman <eddyz87@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Paul Chaignon <paul.chaignon@...il.com>, syzbot	
 <syzbot+c711ce17dd78e5d4fdcf@...kaller.appspotmail.com>, Andrii Nakryiko	
 <andrii@...nel.org>, Alexei Starovoitov <ast@...nel.org>, bpf	
 <bpf@...r.kernel.org>, Daniel Borkmann <daniel@...earbox.net>, Hao Luo	
 <haoluo@...gle.com>, John Fastabend <john.fastabend@...il.com>, Jiri Olsa	
 <jolsa@...nel.org>, KP Singh <kpsingh@...nel.org>, LKML	
 <linux-kernel@...r.kernel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, 
 Network Development <netdev@...r.kernel.org>, Stanislav Fomichev
 <sdf@...ichev.me>, Song Liu <song@...nel.org>,  syzkaller-bugs
 <syzkaller-bugs@...glegroups.com>, Yonghong Song <yonghong.song@...ux.dev>
Subject: Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check

On Mon, 2025-07-07 at 17:51 -0700, Alexei Starovoitov wrote:
> On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@...il.com> wrote:
> >
> > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote:
> > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote:
> > >
> > > [...]
> > >
> > > > This is really nice! I think we can extend it to detect some
> > > > always-true branches as well, and thus handle the initial case reported
> > > > by syzbot.
> > > >
> > > > - if a_min == 0: we don't deduce anything
> > > > - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max)
> > > > - bits that are always set in 'b' are: always_b = b_value & ~b_mask
> > > > - if possible_a & always_b == possible_a: only true branch is possible
> > > > - otherwise, we can't deduce anything
> > > >
> > > > For BPF_X case, we probably want to also check the reverse with
> > > > possible_b & always_a.
> > >
> > > So, this would extend existing predictions:
> > > - [old] always_a & always_b -> infer always true
> > > - [old] !(possible_a & possible_b) -> infer always false
> > > - [new] if possible_a & always_b == possible_a -> infer true
> > >         (but make sure 0 is not in possible_a)
> > >
> > > And it so happens that it covers the example at hand.
> > > Note that or_range(1, (u64)-1) == (u64)-1, so maybe tnum would be
> > > sufficient, w/o the need for or_range().
> > >
> > > The part of the verifier that narrows the range after prediction:
> > >
> > >   regs_refine_cond_op:
> > >
> > >          case BPF_JSET | BPF_X: /* reverse of BPF_JSET, see rev_opcode() */
> > >                  if (!is_reg_const(reg2, is_jmp32))
> > >                          swap(reg1, reg2);
> > >                  if (!is_reg_const(reg2, is_jmp32))
> > >                          break;
> > >                  val = reg_const_value(reg2, is_jmp32);
> > >                ...
> > >                          reg1->var_off = tnum_and(reg1->var_off, tnum_const(~val));
> > >                ...
> > >                  break;
> > >
> > > And after the suggested change, this part would be executed only if
> > > the tnum bounds can be changed by jset. So this eliminates at least a
> > > subclass of the problem.
> >
> > But I think the program below would still be problematic:
> >
> > SEC("socket")
> > __success
> > __retval(0)
> > __naked void jset_bug1(void)
> > {
> >         asm volatile ("                                 \
> >         call %[bpf_get_prandom_u32];                    \
> >         if r0 < 2 goto 1f;                              \
> >         r0 |= 1;                                        \
> >         if r0 & -2 goto 1f;                             \
> > 1:      r0 = 0;                                         \
> >         exit;                                           \
> > "       :
> >         : __imm(bpf_get_prandom_u32)
> >         : __clobber_all);
> > }
> >
> > The possible_r0 would be changed by `if r0 & -2`, so the new rule would
> > not apply and the problem remains unsolved. I think we need to reset
> > min/max bounds in regs_refine_cond_op for JSET:
> > - in some cases range is more precise than tnum
> > - in these cases range cannot be compressed to a tnum
> > - predictions in jset are done for a tnum
> > - to avoid issues when narrowing tnum after prediction, forget the
> >   range.
>
> You're digging too deep. llvm doesn't generate JSET insn,
> so this is syzbot only issue. Let's address it with minimal changes.
> Do not introduce fancy branch taken analysis.
> I would be fine with reverting this particular verifier_bug() hunk.

My point is that the fix should look as below (but extract it as a
utility function):

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 53007182b46b..b2fe665901b7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16207,6 +16207,14 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state
                        swap(reg1, reg2);
                if (!is_reg_const(reg2, is_jmp32))
                        break;
+               reg1->u32_max_value = U32_MAX;
+               reg1->u32_min_value = 0;
+               reg1->s32_max_value = S32_MAX;
+               reg1->s32_min_value = S32_MIN;
+               reg1->umax_value = U64_MAX;
+               reg1->umin_value = 0;
+               reg1->smax_value = S64_MAX;
+               reg1->smin_value = S64_MIN;
                val = reg_const_value(reg2, is_jmp32);
                if (is_jmp32) {
                        t = tnum_and(tnum_subreg(reg1->var_off), tnum_const(~val));
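
A minimal standalone sketch of the utility function mentioned above, for
illustration only. It is not kernel code: `struct reg_bounds` is a
simplified stand-in for the range fields of `struct bpf_reg_state`, and
`reset_range_bounds()` is a hypothetical name. The verifier's existing
`__mark_reg_unbounded()` appears to perform the same kind of reset and
may be usable directly.

```c
#include <stdint.h>

/* Simplified stand-in for the range fields of struct bpf_reg_state
 * (assumption: only the eight min/max fields matter here). */
struct reg_bounds {
        int64_t  smin_value, smax_value;
        uint64_t umin_value, umax_value;
        int32_t  s32_min_value, s32_max_value;
        uint32_t u32_min_value, u32_max_value;
};

/* Hypothetical helper: forget everything the ranges know, so that only
 * the tnum constrains the register after the JSET refinement. */
static void reset_range_bounds(struct reg_bounds *reg)
{
        reg->smin_value = INT64_MIN;
        reg->smax_value = INT64_MAX;
        reg->umin_value = 0;
        reg->umax_value = UINT64_MAX;
        reg->s32_min_value = INT32_MIN;
        reg->s32_max_value = INT32_MAX;
        reg->u32_min_value = 0;
        reg->u32_max_value = UINT32_MAX;
}
```

Note that the 64-bit signed minimum is reset to the 64-bit limit, not the
32-bit one.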

----

The reset is needed because of irreconcilable differences between what
can be represented as a tnum and what can be represented as a range.
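
To make the mismatch concrete, here is a small userspace sketch of the
tnum arithmetic for the jset_bug1 program quoted above. It is a
simplified model, not verifier code: the `tnum_*` helpers mirror the
value/mask formulas used by the kernel's tnum implementation as I
understand them, `tnum_umax()` and `jset_bug1_false_branch_tnum()` are
names made up for this example, and the assumption is that r0 starts as a
fully unknown 64-bit scalar.

```c
#include <stdint.h>

/* value: bits known to be 1; mask: bits whose state is unknown. */
struct tnum {
        uint64_t value;
        uint64_t mask;
};

static struct tnum tnum_const(uint64_t v)
{
        return (struct tnum){ v, 0 };
}

static struct tnum tnum_or(struct tnum a, struct tnum b)
{
        uint64_t v = a.value | b.value;
        uint64_t mu = a.mask | b.mask;

        return (struct tnum){ v, mu & ~v };
}

static struct tnum tnum_and(struct tnum a, struct tnum b)
{
        uint64_t alpha = a.value | a.mask;
        uint64_t beta = b.value | b.mask;
        uint64_t v = a.value & b.value;

        return (struct tnum){ v, alpha & beta & ~v };
}

/* Largest value the tnum admits: every unknown bit set (example helper). */
static uint64_t tnum_umax(struct tnum t)
{
        return t.value | t.mask;
}

/* tnum of r0 on the fall-through path of `if r0 & -2 goto 1f`. */
static struct tnum jset_bug1_false_branch_tnum(void)
{
        struct tnum r0 = { 0, UINT64_MAX };     /* unknown scalar */
        uint64_t val = (uint64_t)-2;

        r0 = tnum_or(r0, tnum_const(1));        /* r0 |= 1: bit 0 known set */
        return tnum_and(r0, tnum_const(~val));  /* JSET not taken: clear val bits */
}
```

On the fall-through path the tnum collapses to the constant 1, while the
range still carries a lower bound of at least 2 from `if r0 < 2 goto 1f`.
The two descriptions contradict each other, and neither representation
alone could have predicted that the branch is always taken: that requires
combining "r0 >= 2" (range) with "bit 0 is set" (tnum).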