Message-ID: <CAKH8qBv=+QVBqHd=9rAWe3d5d47dSkppYc1JbS+WgQs8XgB+Yg@mail.gmail.com>
Date: Thu, 16 Jun 2022 08:57:14 -0700
From: Stanislav Fomichev <sdf@...gle.com>
To: Maciej Żenczykowski <maze@...gle.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Linux NetDev <netdev@...r.kernel.org>,
BPF Mailing List <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Sasha Levin <sashal@...nel.org>,
Carlos Llamas <cmllamas@...gle.com>,
YiFei Zhu <zhuyifei@...gle.com>
Subject: Re: Curious bpf regression in 5.18 already fixed in stable 5.18.3

On Wed, Jun 15, 2022 at 6:36 PM Maciej Żenczykowski <maze@...gle.com> wrote:
>
> > > > I've bisected the original issue to:
> > > >
> > > > b44123b4a3dc ("bpf: Add cgroup helpers bpf_{get,set}_retval to get/set
> > > > syscall return value")
> > > >
> > > > And I believe it's these two lines from the original patch:
> > > >
> > > > #define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func) \
> > > > ({ \
> > > > @@ -1398,10 +1398,12 @@ out:
> > > > u32 _ret; \
> > > > _ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(array, ctx, func, 0, &_flags); \
> > > > _cn = _flags & BPF_RET_SET_CN; \
> > > > + if (_ret && !IS_ERR_VALUE((long)_ret)) \
> > > > + _ret = -EFAULT;
> > > >
> > > > _ret is u32 and ret gets -1 (ffffffff). IS_ERR_VALUE((long)ffffffff)
> > > > returns false in this case because it doesn't sign-expand the argument
> > > > and internally does ffff_ffff >= ffff_ffff_ffff_f001 comparison.
> > > >
> > > > I'll try to see what I've changed in my unrelated patch to fix it. But
> > > > I think we should audit all these IS_ERR_VALUE((long)_ret) regardless;
> > > > they don't seem to work the way we want them to...
> >
> > > Ok, and my patch fixes it because I'm replacing 'u32 _ret' with 'int ret'.
> >
> > > So, basically, with u32 _ret we have to do IS_ERR_VALUE((long)(int)_ret).
> >
> > > Sigh..
> >
> > And to follow up on that, the other two places we have are fine:
> >
> > IS_ERR_VALUE((long)run_ctx.retval))
> >
> > run_ctx.retval is an int.
>
> I'm guessing this means the regression only affects 64-bit archs,
> where long = void* is 8 bytes > u32 of 4 bytes, but not 32-bit ones,
> where long = u32 = 4 bytes
>
> Unfortunately my dev machine's 32-bit build capability has somehow
> regressed again and I can't check this.

Seems so, yes. But I'm actually not sure whether we should treat it as a
regression at all. There is a question of whether that EPERM is UAPI or
not. That's most likely why we haven't caught it in the selftests; most
of the time we only check that the syscall has returned -1 and don't pay
attention to the particular errno.
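
To make the quoted analysis concrete, here is a minimal userspace sketch of
the comparison. It assumes the kernel macro has the shape
(unsigned long)(x) >= (unsigned long)-MAX_ERRNO with MAX_ERRNO = 4095; the
helpers is_err_value64(), is_err_value32() and check_model() are
hypothetical names that model a 64-bit and a 32-bit 'long' with fixed-width
types. It is not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Model of IS_ERR_VALUE() where 'long' is 64 bits wide (assumption). */
static int is_err_value64(int64_t x)
{
	return (uint64_t)x >= (uint64_t)-MAX_ERRNO; /* >= 0xfffffffffffff001 */
}

/* Model of IS_ERR_VALUE() where 'long' is 32 bits wide (assumption). */
static int is_err_value32(int32_t x)
{
	return (uint32_t)x >= (uint32_t)-MAX_ERRNO; /* >= 0xfffff001 */
}

/* Hypothetical self-check covering the three cases from the thread. */
static void check_model(void)
{
	uint32_t ret = (uint32_t)-1; /* -EPERM stored in a u32: 0xffffffff */

	/* u32 -> 64-bit long zero-extends, so the value is not seen as an error. */
	assert(!is_err_value64((int64_t)ret));

	/* Going through int first sign-extends, so the error is detected. */
	assert(is_err_value64((int64_t)(int32_t)ret));

	/* With a 32-bit long there is no widening and the check works. */
	assert(is_err_value32((int32_t)ret));
}
```

Calling check_model() passes all three assertions with gcc or clang, which
matches both the (long)(int)_ret workaround and the guess that only 64-bit
architectures are affected.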