[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ+HfNhXJbhp3z+fL-3iXJD-5yFtEL0P1OmdsTkmZHUMviuAcQ@mail.gmail.com>
Date: Tue, 11 Sep 2018 19:15:56 +0200
From: Björn Töpel <bjorn.topel@...il.com>
To: yhs@...com
Cc: Netdev <netdev@...r.kernel.org>, ast@...nel.org,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: tools/bpf regression causing samples/bpf/ to hang
Den tis 11 sep. 2018 kl 18:47 skrev Yonghong Song <yhs@...com>:
>
>
>
> On 9/11/18 4:11 AM, Björn Töpel wrote:
> > Hi Yonghong, I tried to run the XDP samples from the bpf-next tip
> > today, and was hit by a regression.
> >
> > Commit f7010770fbac ("tools/bpf: move bpf/lib netlink related
> > functions into a new file") adds a while(1) around the recv call in
> > bpf_set_link_xdp_fd making that call getting stuck in an infinite
> > loop.
> >
> > I simply removed the loop, and that solved my problem (patch below).
> >
> > However, I don't know if removing the loop would break bpftool for
> > you. If not, I can submit the patch as a proper one for bpf-next.
>
> Hi, Björn, thanks for reporting the problem.
> The while loop is needed since the "recv" syscall buffer size
> may not be big enough to hold all the returned information, in
> which cases, multiple "recv" calls are needed.
>
> Could you try the following patch to see whether it fixed your
> issue? Thanks!
>
Nope, it doesn't -- but if you move that hunk after the for-loop it works.
Cheers,
Björn
> commit 3eb1c0249dfc3ea4ad61aa223dce32262af7e049 (HEAD -> fix)
> Author: Yonghong Song <yhs@...com>
> Date: Tue Sep 11 08:58:20 2018 -0700
>
> tools/bpf: fix a netlink recv issue
>
> Commit f7010770fbac ("tools/bpf: move bpf/lib netlink related
> functions into a new file") introduced a while loop for the
> netlink recv path. This while loop is needed since the
> buffer in recv syscall may not be big enough to hold all the
> information and in such cases multiple recv calls are needed.
>
> When netlink recv returns message length of 0, there will be
> no more messages for returning data so the while loop
> can end.
>
> Fixes: f7010770fbac ("tools/bpf: move bpf/lib netlink related
> functions into a new file")
> Reported-by: Björn Töpel <bjorn.topel@...el.com>
> Signed-off-by: Yonghong Song <yhs@...com>
>
> diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
> index 469e068dd0c5..37827319a50a 100644
> --- a/tools/lib/bpf/netlink.c
> +++ b/tools/lib/bpf/netlink.c
> @@ -77,6 +77,9 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid,
> int seq,
> goto done;
> }
>
> + if (len == 0)
> + break;
> +
> for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
> nh = NLMSG_NEXT(nh, len)) {
> if (nh->nlmsg_pid != nl_pid) {
>
>
> >
> > Thanks!
> > Björn
> >
> > From: =?UTF-8?q?Bj=C3=B6rn=20T=C3=B6pel?= <bjorn.topel@...el.com>
> > Date: Tue, 11 Sep 2018 12:35:44 +0200
> > Subject: [PATCH] tools/bpf: remove loop around netlink recv
> >
> > Commit f7010770fbac ("tools/bpf: move bpf/lib netlink related
> > functions into a new file") moved the bpf_set_link_xdp_fd and split it
> > up into multiple functions. The added receive function
> > bpf_netlink_recv added a loop around the recv syscall leading to
> > multiple recv calls. This resulted in all XDP samples in the
> > samples/bpf/ to stop working, since they were stuck in a blocking
> > recv.
> >
> > This commits removes the while (1)-statement.
> >
> > Fixes: f7010770fbac ("tools/bpf: move bpf/lib netlink related
> > functions into a new file")
> > Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
> > ---
> > tools/lib/bpf/netlink.c | 64 ++++++++++++++++++++---------------------
> > 1 file changed, 31 insertions(+), 33 deletions(-)
> >
> > diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
> > index 469e068dd0c5..0eae1fbf46c6 100644
> > --- a/tools/lib/bpf/netlink.c
> > +++ b/tools/lib/bpf/netlink.c
> > @@ -70,41 +70,39 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
> > char buf[4096];
> > int len, ret;
> >
> > - while (1) {
> > - len = recv(sock, buf, sizeof(buf), 0);
> > - if (len < 0) {
> > - ret = -errno;
> > + len = recv(sock, buf, sizeof(buf), 0);
> > + if (len < 0) {
> > + ret = -errno;
> > + goto done;
> > + }
> > +
> > + for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
> > + nh = NLMSG_NEXT(nh, len)) {
> > + if (nh->nlmsg_pid != nl_pid) {
> > + ret = -LIBBPF_ERRNO__WRNGPID;
> > goto done;
> > }
> > -
> > - for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
> > - nh = NLMSG_NEXT(nh, len)) {
> > - if (nh->nlmsg_pid != nl_pid) {
> > - ret = -LIBBPF_ERRNO__WRNGPID;
> > - goto done;
> > - }
> > - if (nh->nlmsg_seq != seq) {
> > - ret = -LIBBPF_ERRNO__INVSEQ;
> > - goto done;
> > - }
> > - switch (nh->nlmsg_type) {
> > - case NLMSG_ERROR:
> > - err = (struct nlmsgerr *)NLMSG_DATA(nh);
> > - if (!err->error)
> > - continue;
> > - ret = err->error;
> > - nla_dump_errormsg(nh);
> > - goto done;
> > - case NLMSG_DONE:
> > - return 0;
> > - default:
> > - break;
> > - }
> > - if (_fn) {
> > - ret = _fn(nh, fn, cookie);
> > - if (ret)
> > - return ret;
> > - }
> > + if (nh->nlmsg_seq != seq) {
> > + ret = -LIBBPF_ERRNO__INVSEQ;
> > + goto done;
> > + }
> > + switch (nh->nlmsg_type) {
> > + case NLMSG_ERROR:
> > + err = (struct nlmsgerr *)NLMSG_DATA(nh);
> > + if (!err->error)
> > + continue;
> > + ret = err->error;
> > + nla_dump_errormsg(nh);
> > + goto done;
> > + case NLMSG_DONE:
> > + return 0;
> > + default:
> > + break;
> > + }
> > + if (_fn) {
> > + ret = _fn(nh, fn, cookie);
> > + if (ret)
> > + return ret;
> > }
> > }
> > ret = 0;
> >
Powered by blists - more mailing lists