Date: Thu, 13 Jun 2024 16:21:15 +0200
From: Maciej Żenczykowski <maze@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Linux NetDev <netdev@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>
Subject: Re: Some sort of netlink RTM_GET(ROUTE|RULE|NEIGH) regression(?) in
 6.10-rc3 vs 6.9

On Thu, Jun 13, 2024 at 3:29 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Thu, 13 Jun 2024 14:18:41 +0200 Maciej Żenczykowski wrote:
> > The Android net tests
> > (available at https://cs.android.com/android/platform/superproject/main/+/main:kernel/tests/net/test/
> > more specifically multinetwork_test.py & neighbour_test.py)
> > run via:
> >   /...aosp-tests.../net/test/run_net_test.sh --builder
> > from within a 6.10-rc3 kernel tree are falling over due to a *plethora* of:
> >   TypeError: NLMsgHdr requires a bytes object of length 16, got 4
> >
> > The problems might be limited to RTM_GETROUTE and RTM_GETRULE and RTM_GETNEIGH,
> > as various other netlink using xfrm tests appear to be okay...
> >
> > (note: 6.10-rc3 also fails to build for UML due to a buggy bpf change,
> > but I sent out a 1-line fix for that already:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20240613112520.1526350-1-maze@google.com/
> > )
> >
> > It is of course entirely possible the test code is buggy in how it
> > parses netlink, but it has worked for years and years...
> >
> > Before I go trying to bisect this... anyone have any idea what might
> > be the cause?
> > Perhaps some sort of change to how these dumps work? Some sort of new
> > netlink extended errors?
>
> Take a look at commit 5b4b62a169e1 ("rtnetlink: make the "split"
> NLM_DONE handling generic"), there may be more such workarounds missing.

Ok, I sent out 2 patches adding the flag in 3 more spots that are
enough to get both tests working.
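
For readers following along: as far as the referenced commit title suggests, the flag decides whether a dump ends with NLMSG_DONE in its own recv() buffer or appended to the last batch of data messages. Below is a minimal Python sketch of a parser that tolerates both layouts. It is not taken from the Android test suite, and whether the TypeError above stems from this layout difference is an assumption, not something verified here. The helper names (make_msg, parse_datagram, collect_dump) are hypothetical.

```python
import struct

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSG_HDR = struct.Struct("=IHHII")
NLMSG_DONE = 3
NLM_F_MULTI = 0x2
RTM_NEWROUTE = 24


def nlmsg_align(n):
    """Round n up to the 4-byte netlink alignment."""
    return (n + 3) & ~3


def make_msg(mtype, payload, flags=NLM_F_MULTI, seq=1, pid=0):
    """Hypothetical helper: build one padded netlink message for the demo."""
    length = NLMSG_HDR.size + len(payload)
    pad = b"\x00" * (nlmsg_align(length) - length)
    return NLMSG_HDR.pack(length, mtype, flags, seq, pid) + payload + pad


def parse_datagram(buf):
    """Split one recv() buffer into (type, payload) tuples.

    Returns (messages, done), where done is True if NLMSG_DONE was seen,
    regardless of whether it arrived alone or after data messages.
    """
    msgs = []
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, mtype, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size or offset + length > len(buf):
            raise ValueError("bad nlmsg_len %d at offset %d" % (length, offset))
        if mtype == NLMSG_DONE:
            return msgs, True
        msgs.append((mtype, buf[offset + NLMSG_HDR.size:offset + length]))
        offset += nlmsg_align(length)
    if offset < len(buf):
        raise ValueError("trailing bytes shorter than a netlink header")
    return msgs, False


def collect_dump(datagrams):
    """Consume recv() buffers until NLMSG_DONE, wherever it shows up."""
    out = []
    for buf in datagrams:
        msgs, done = parse_datagram(buf)
        out.extend(msgs)
        if done:
            return out
    raise ValueError("dump ended without NLMSG_DONE")


# Same dump delivered two ways: DONE coalesced vs. DONE in its own buffer.
route = make_msg(RTM_NEWROUTE, b"\x00" * 12)
done = make_msg(NLMSG_DONE, struct.pack("=i", 0))
coalesced = collect_dump([route + route + done])
split = collect_dump([route + route, done])
```

Both calls yield the same two RTM_NEWROUTE entries, so a parser written this way would not care which layout the kernel picks.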

The first in RTM_GETNEIGH seems obvious enough.

$ git grep rtnl_register.*RTM_GETNEIGH,
net/core/neighbour.c:3894:      rtnl_register(PF_UNSPEC, RTM_GETNEIGH, neigh_get, neigh_dump_info,
net/core/rtnetlink.c:6752:      rtnl_register(PF_BRIDGE, RTM_GETNEIGH, rtnl_fdb_get, rtnl_fdb_dump, 0);
net/mctp/neigh.c:331:   rtnl_register_module(THIS_MODULE, PF_MCTP, RTM_GETNEIGH,

but there is also PF_BRIDGE and PF_MCTP... (though obviously the test
doesn't care)
(and also RTM_GETNEIGHTBL...)

The RTM_GETRULE portion of the second one seems fine too:

$ git grep rtnl_register.*RTM_GETRULE
net/core/fib_rules.c:1296:      rtnl_register(PF_UNSPEC, RTM_GETRULE, NULL, fib_nl_dumprule,

but I'm less certain about the RTM_GETROUTE portion thereof, as
there are a lot of hits:

$ git grep rtnl_register.*RTM_GETROUTE
net/can/gw.c:1293:      ret = rtnl_register_module(THIS_MODULE, PF_CAN, RTM_GETROUTE,
net/core/rtnetlink.c:6743:      rtnl_register(PF_UNSPEC, RTM_GETROUTE, NULL, rtnl_dump_all, 0);
net/ipv4/fib_frontend.c:1662:   rtnl_register(PF_INET, RTM_GETROUTE, NULL, inet_dump_fib,
net/ipv4/ipmr.c:3162:   rtnl_register(RTNL_FAMILY_IPMR, RTM_GETROUTE,
net/ipv4/route.c:3696:  rtnl_register(PF_INET, RTM_GETROUTE, inet_rtm_getroute, NULL,
net/ipv6/ip6_fib.c:2516:        ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETROUTE, NULL,
net/ipv6/ip6mr.c:1394:  err = rtnl_register_module(THIS_MODULE, RTNL_FAMILY_IP6MR, RTM_GETROUTE,
net/ipv6/route.c:6737:  ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETROUTE,
net/mctp/route.c:1481:  rtnl_register_module(THIS_MODULE, PF_MCTP, RTM_GETROUTE,
net/mpls/af_mpls.c:2755:        rtnl_register_module(THIS_MODULE, PF_MPLS, RTM_GETROUTE,
net/phonet/pn_netlink.c:304:    rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_GETROUTE,

It seems like maybe v4 and both mr's (ipmr and ip6mr) should be changed too?
