[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1481065966.11028.3.camel@redhat.com>
Date: Tue, 06 Dec 2016 17:12:46 -0600
From: Dan Williams <dcbw@...hat.com>
To: Guillaume Nault <g.nault@...halink.fr>,
Brad Campbell <lists2009@...rfbargle.com>
Cc: netdev <netdev@...r.kernel.org>, Thomas Graf <tgraf@...g.ch>,
David Miller <davem@...emloft.net>, thaller@...hat.com
Subject: Re: commit : ppp: add rtnetlink device creation support - breaks
netcf on my machine.
On Wed, 2016-12-07 at 00:08 +0100, Guillaume Nault wrote:
> (Cc Thomas and David)
CC Thomas Haller who is the current libnl maintainer...
Dan
> On Tue, Dec 06, 2016 at 03:47:20PM +0800, Brad Campbell wrote:
> >
> > On 06/12/16 01:53, Guillaume Nault wrote:
> > >
> > > >
> > > >
> > > Probably not a mistake on your side. I've started looking at
> > > netcf'
> > > source code, but haven't found anything that could explain your
> > > issue.
> > > It'd really help if you could provide steps to reproduce the bug.
> >
> > Further to my message this morning, I started with a clean
> > linux.git
> > 4.9.0-rc7-00198-g0cb65c8 and did two runs. One untouched and one
> > with the
> > identified patch reverted. I logged both of these with NLCB=debug,
> > then
> > split out the ppp section and diffed them.
> >
> > It appears the only difference of note is the new ATTR 18. I did a
> > diff of
> > the entire dump for both and nothing else popped out.
> >
> Thanks for the detailed report. Things are getting clear now.
>
> >
> >
> > brad@...t:~$ diff -u ppp-ok ppp-fail
> > --- ppp-ok 2016-12-06 13:32:04.358393578 +0800
> > +++ ppp-fail 2016-12-06 13:32:18.577864406 +0800
> > @@ -1,10 +1,10 @@
> > -------------------------- BEGIN NETLINK MESSAGE
> > ---------------------------
> > [HEADER] 16 octets
> > - .nlmsg_len = 628
> > + .nlmsg_len = 644
> > .nlmsg_type = 16 <route/link::new>
> > .nlmsg_flags = 2 <MULTI>
> > - .nlmsg_seq = 1481001940
> > - .nlmsg_pid = 7462
> > + .nlmsg_seq = 1481002252
> > + .nlmsg_pid = 7376
> > [PAYLOAD] 16 octets
> > 00 00 00 02 0a 00 00 00 d1 10 01 00 00 00 00
> > 00 ................
> > [ATTR 03] 5 octets
> > @@ -71,6 +71,8 @@
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ..................
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ..................
> > 00 00 00 00 00 00 ......
> > + [ATTR 18] 12 octets
> > + 08 00 01 00 70 70 70 00 04 00 02
> > 00 ....ppp.....
> > [ATTR 26] 132 octets
> > 84 00 02 00 80 00 01 00 01 00 00 00 00 00 00 00 00 00
> > ..................
> > 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
> > ..................
> > @@ -81,3 +83,4 @@
> > 00 00 00 00 10 27 00 00 e8 03 00 00 00 00 00 00 00 00
> > .....'............
> > 00 00 00 00 00 00 ......
> > --------------------------- END NETLINK MESSAGE
> > ---------------------------
> >
> 'ATTR 18' is the IFLA_LINKINFO attribute. It contains two sub-
> attributes:
> * IFLA_INFO_KIND ('08 00 01 00 70 70 70 00'), containing the "ppp"
> string,
> * and IFLA_INFO_DATA ('04 00 02 00') which has no payload because,
> currently, ppp has no device specific data to return.
>
> >
> > Running with NLDBG=4 seems to generate this :
> > DBG<2>: While picking up for 0x26d2e00 <route/link>, recvmsgs()
> > returned
> > -34: (errno = Numerical result out of range)DBG<1>: Clearing cache
> > 0x26d2e00 <route/link>...
> >
> libnl1 rejects the IFLA_INFO_DATA attribute because it expects it to
> contain a sub-attribute. Since the payload size is zero it doesn't
> match the policy and parsing fails.
>
> There's no problem with libnl3 because its policy accepts empty
> payloads for NLA_NESTED attributes (see libnl3 commit 4be02ace4826
> "Be
> liberal when receiving an empty nested attribute").
>
> I think empty nested attributes make perfect sense. At least we
> accept
> them from user space since commit ea5693ccc553 ("netlink: allow empty
> nested attributes"), so it should be fine to generate some from the
> kernel.
> OTOH, since some user space programs broke because of this, it might
> be
> better to always add attributes in the .fill_info() callbacks, to
> work
> around libnl1's policy. David, Thomas, do you have any opinion on
> this?
Powered by blists - more mailing lists