netdev - Re: Curious bpf regression in 5.18 already fixed in stable 5.18.3

Message-ID: <CAKH8qBv=+QVBqHd=9rAWe3d5d47dSkppYc1JbS+WgQs8XgB+Yg@mail.gmail.com>
Date:   Thu, 16 Jun 2022 08:57:14 -0700
From:   Stanislav Fomichev <sdf@...gle.com>
To:     Maciej Żenczykowski <maze@...gle.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Linux NetDev <netdev@...r.kernel.org>,
        BPF Mailing List <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Sasha Levin <sashal@...nel.org>,
        Carlos Llamas <cmllamas@...gle.com>,
        YiFei Zhu <zhuyifei@...gle.com>
Subject: Re: Curious bpf regression in 5.18 already fixed in stable 5.18.3

On Wed, Jun 15, 2022 at 6:36 PM Maciej Żenczykowski <maze@...gle.com> wrote:
>
> > > > I've bisected the original issue to:
> > > >
> > > > b44123b4a3dc ("bpf: Add cgroup helpers bpf_{get,set}_retval to get/set
> > > > syscall return value")
> > > >
> > > > And I believe it's these two lines from the original patch:
> > > >
> > > >  #define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func)            \
> > > >     ({                                              \
> > > > @@ -1398,10 +1398,12 @@ out:
> > > >             u32 _ret;                               \
> > > >             _ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(array, ctx, func, 0, &_flags); \
> > > >             _cn = _flags & BPF_RET_SET_CN;          \
> > > > +           if (_ret && !IS_ERR_VALUE((long)_ret))  \
> > > > +                   _ret = -EFAULT;
> > > >
> > > > _ret is u32 and ret gets -1 (ffffffff). IS_ERR_VALUE((long)ffffffff)
> > > > returns false in this case because it doesn't sign-expand the
> > > > argument and internally does ffff_ffff >= ffff_ffff_ffff_f001
> > > > comparison.
> > > >
> > > > I'll try to see what I've changed in my unrelated patch to fix it.
> > > > But I think we should audit all these IS_ERR_VALUE((long)_ret)
> > > > regardless; they don't seem to work the way we want them to...
> >
> > > Ok, and my patch fixes it because I'm replacing 'u32 _ret' with 'int ret'.
> >
> > > So, basically, with u32 _ret we have to do IS_ERR_VALUE((long)(int)_ret).
> >
> > > Sigh..
> >
> > And to follow up on that, the other two places we have are fine:
> >
> > IS_ERR_VALUE((long)run_ctx.retval))
> >
> > run_ctx.retval is an int.
>
> I'm guessing this means the regression only affects 64-bit archs,
> where long = void* is 8 bytes > u32 of 4 bytes, but not 32-bit ones,
> where long = u32 = 4 bytes
>
> Unfortunately my dev machine's 32-bit build capability has somehow
> regressed again and I can't check this.

Seems so, yes. But I'm actually not sure whether we should treat it as
a regression at all. The open question is whether that EPERM is UAPI or
not. That's most likely why the selftests haven't caught it; most of the
time we only check that the syscall has returned -1 and don't pay
attention to the particular errno.
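
For reference, the widening problem discussed in the quoted analysis can
be reproduced in userspace. This is only a sketch under assumptions:
is_err_value() is a stand-in for the kernel's IS_ERR_VALUE(), int64_t
models a 64-bit long, and the check_*/fixup_* helper names are
hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Stand-in for IS_ERR_VALUE() on a 64-bit arch (assumption: long is 64 bits). */
static int is_err_value(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/* Models IS_ERR_VALUE((long)_ret) with 'u32 _ret': the cast zero-extends. */
static int check_u32_ret(uint32_t ret)
{
	return is_err_value((uint64_t)(int64_t)ret);
}

/* Models IS_ERR_VALUE((long)(int)_ret): going through int sign-extends. */
static int check_int_ret(uint32_t ret)
{
	return is_err_value((uint64_t)(int64_t)(int32_t)ret);
}

/* Models the two added lines from the quoted hunk, with a u32 _ret. */
static uint32_t fixup_u32(uint32_t ret)
{
	if (ret && !check_u32_ret(ret))
		ret = (uint32_t)-EFAULT;
	return ret;
}

/* Same fixup, but with the value treated as an int before widening. */
static uint32_t fixup_int(uint32_t ret)
{
	if (ret && !check_int_ret(ret))
		ret = (uint32_t)-EFAULT;
	return ret;
}

/* Small self-check of the two widening paths for -EPERM. */
static void self_check(void)
{
	uint32_t ret = (uint32_t)-EPERM;	/* 0xffffffff */

	assert(!check_u32_ret(ret));	/* zero-extended: not seen as an error */
	assert(check_int_ret(ret));	/* sign-extended: seen as an error */
}
```

Under these assumptions, fixup_u32() turns -EPERM into -EFAULT, while
fixup_int() leaves it alone. If the 64-bit type were replaced with a
32-bit one, both paths would agree, which is consistent with the guess
above that only 64-bit archs are affected.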