Message-ID: <ZfohdcQvfdqvkoWT@zatzit>
Date: Wed, 20 Mar 2024 10:36:21 +1100
From: David Gibson <david@...son.dropbear.id.au>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Stefano Brivio <sbrivio@...hat.com>, davem@...emloft.net,
	netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
	jiri@...nulli.us, idosch@...sch.org, johannes@...solutions.net,
	fw@...len.de, pablo@...filter.org, Martin Pitt <mpitt@...hat.com>,
	Paul Holzinger <pholzing@...hat.com>
Subject: Re: [PATCH net-next v2 3/3] genetlink: fit NLMSG_DONE into same
 read() as families

On Tue, Mar 19, 2024 at 08:55:45AM -0700, Jakub Kicinski wrote:
> On Fri, 15 Mar 2024 12:48:08 +0100 Stefano Brivio wrote:
> > > Make sure ctrl_fill_info() returns sensible error codes and
> > > propagate them out to netlink core. Let netlink core decide
> > > when to return skb->len and when to treat the exit as an
> > > error. Netlink core does better job at it, if we always
> > > return skb->len the core doesn't know when we're done
> > > dumping and NLMSG_DONE ends up in a separate read().  
> > 
> > While this change is obviously correct, it breaks... well, broken
> > applications that _wrongly_ rely on the fact that NLMSG_DONE is
> > delivered in a separate datagram.
> > 
> > This was the (embarrassing) case for passt(1), which I just fixed:
> >   https://archives.passt.top/passt-dev/20240315112432.382212-1-sbrivio@redhat.com/
> > 
> > but the "separate" NLMSG_DONE is such an established behaviour,
> > I think, that this might raise a more general concern.
> > 
> > From my perspective, I'm just happy that this change revealed the
> > issue, but I wanted to report this anyway in case somebody has
> > similar possible breakages in mind.
> 
> Hi Stefano! I was worried this may happen :( I think we should revert
> offending commits, but I'd like to take it on case by case basis. 
> I'd imagine majority of netlink is only exercised by iproute2 and
> libmnl-based tools. Does passt hang specifically on genetlink family
> dump? Your commit also mentions RTM_GETROUTE. This is not the only
> commit which removed DONE:

I don't think there's anything specific to RTM_GETROUTE here from the
kernel side.  We've looked at the problem in passt more closely now,
and it turns out we handled a merged NLMSG_DONE correctly in most
cases.  For various reasons internal to passt, our handling of
RTM_GETROUTE on one path is more complex, and we had a subtle error
there which broke the handling of a merged NLMSG_DONE.
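
To illustrate what "handling a merged NLMSG_DONE" means in general, here
is a minimal sketch, not passt's actual code: a hypothetical helper,
nl_buf_done(), walks every message in one recv() buffer and reports
whether NLMSG_DONE was found anywhere in it, instead of assuming that
NLMSG_DONE arrives in a datagram of its own.  The function name and its
interface are assumptions made for this example; only the NLMSG_*
macros and struct nlmsghdr come from <linux/netlink.h>.

```c
#include <linux/netlink.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper: scan one buffer returned by recv() on a netlink
 * socket.  Counts the messages seen before NLMSG_DONE in *count and
 * returns true if NLMSG_DONE is present in this same buffer.  A real
 * caller would also check for NLMSG_ERROR and match nlmsg_seq.
 */
static bool nl_buf_done(void *buf, ssize_t len, unsigned int *count)
{
	struct nlmsghdr *nh;

	for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		/* DONE may follow data messages in the same datagram */
		if (nh->nlmsg_type == NLMSG_DONE)
			return true;

		(*count)++;	/* a data message the caller would process */
	}

	/* No DONE in this buffer: the caller needs another recv() */
	return false;
}
```

A caller would loop on recv() and stop as soon as nl_buf_done() returns
true, which works whether NLMSG_DONE is merged with the last batch of
data or delivered separately.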

> 
> $ git log --since='1 month ago' --grep=NLMSG_DONE --no-merges  --oneline 
> 
> 9cc4cc329d30 ipv6: use xa_array iterator to implement inet6_dump_addr()
> 87d381973e49 genetlink: fit NLMSG_DONE into same read() as families
> 4ce5dc9316de inet: switch inet_dump_fib() to RCU protection
> 6647b338fc5c netlink: fix netlink_diag_dump() return value
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
