netdev - Re: [PATCH net] inet: bring NLM_DONE out to a separate recv() in inet_dump

Message-ID: <CANn89iLJe2HOQkuujJrCr=3R-DN_S2ALLcBBWuK3je2Nup4obw@mail.gmail.com>
Date: Mon, 3 Jun 2024 17:34:12 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, 
	Stephen Hemminger <stephen@...workplumber.org>, davem@...emloft.net, netdev@...r.kernel.org, 
	pabeni@...hat.com, Jaroslav Pulchart <jaroslav.pulchart@...ddata.com>
Subject: Re: [PATCH net] inet: bring NLM_DONE out to a separate recv() in inet_dump_ifaddr()

On Mon, Jun 3, 2024 at 4:05 PM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>
> On Sat, Jun 1, 2024 at 10:23 PM David Ahern <dsahern@...nel.org> wrote:
> >
> > On 6/1/24 5:48 PM, Jakub Kicinski wrote:
> > > On Sat, 1 Jun 2024 16:10:13 -0700 Stephen Hemminger wrote:
> > >> Sorry, I disagree.
> > >>
> > >> You can't just fix the problem areas. The split was an ABI change, and there could
> > >> be a problem in any dump. This is the ABI version of the old argument:
> > >>   If a tree falls in a forest and no one is around to hear it, does it make a sound?
> > >>
> > >> All dumps must behave the same. You are stuck with the legacy behavior.
> >
> > I don't agree with such a hard line stance. Mistakes made 20 years ago
> > cannot hold Linux back from moving forward. We have to continue
> > searching for ways to allow better or more performant behavior.
> >
> > >
> > > The dump partitioning is up to the family. Multiple families
> > > coalesce NLM_DONE from day 1. "All dumps must behave the same"
> > > is saying we should convert all families to be poorly behaved.
> > >
> > > Admittedly changing the most heavily used parts of rtnetlink is very
> > > risky. And there's a couple more corner cases which I'm afraid someone
> > > will hit. I'm adding this helper to clearly annotate "legacy"
> > > callbacks, so we don't regress again. At the same time nobody should
> > > use this in new code or "just to be safe" (read: because they don't
> > > understand netlink).
> >
> > What about a socket option that says "I am a modern app and can handle
> > the new way" - similar to the strict mode option that was added? Then
> > the decision of requiring a separate message for NLM_DONE can be based
> > on the app. Could even throw a `pr_warn_once("modernize app %s/%d\n")`
> > to help old apps understand they need to move forward.
> >
>
> Sorry, being a little lazy so asking instead:
> NLMSG_DONE is traditionally the "EOT" (end of transaction) signal, if
> you get rid of it - how does the user know there are more msgs coming
> or the dump transaction is over? In addition to the user->kernel "I am
> modern", perhaps set the nlmsg_flag in the reverse path to either say
> "there's more coming" which you don't set on the last message or "we
> are doing this the new way". Backward compat is very important - there
> are dinosaur apps out there that will break otherwise.

The NLMSG_DONE was not removed.

Some applications expected it to be carried in a standalone message
for some of the rtnetlink operations, because old kernel
implementations accidentally had this (undocumented) behavior.

When the kernel started to be smart and piggy-back the NLMSG_DONE in
the 'last given answer', these applications started to complain.

Basically, these applications do not correctly parse the full answer
the kernel gives them: they stop before reaching the NLMSG_DONE that
now sits at the end of the same recv() buffer as the last data message.
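
To illustrate what parsing the full answer means, here is a minimal
sketch of a receive-side loop that walks every message in one recv()
buffer, so it finds NLMSG_DONE whether it arrives alone or piggy-backed
after the last data message. The helper name walk_dump_buffer() and its
return convention are hypothetical, chosen only for this example; the
NLMSG_* macros are the standard ones from <linux/netlink.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper (not from the patch under discussion).
 *
 * Walks all netlink messages contained in a single recv() buffer.
 * Returns  1 if NLMSG_DONE was found (dump finished),
 *          0 if the buffer ended without NLMSG_DONE (call recv() again),
 *         -1 if an NLMSG_ERROR message was found.
 * *payload_msgs is set to the number of data messages seen before that.
 */
static int walk_dump_buffer(void *buf, size_t len, int *payload_msgs)
{
	struct nlmsghdr *nlh = buf;
	int remaining = (int)len;	/* NLMSG_NEXT() decrements this */

	assert(buf != NULL && payload_msgs != NULL);
	*payload_msgs = 0;

	/* Do not stop after the first message: keep iterating until the
	 * buffer is exhausted, because NLMSG_DONE may be the last entry
	 * of the same buffer that carried the final data message. */
	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
		if (nlh->nlmsg_type == NLMSG_DONE)
			return 1;
		if (nlh->nlmsg_type == NLMSG_ERROR)
			return -1;
		(*payload_msgs)++;	/* a real app would decode it here */
	}
	return 0;
}
```

A caller would keep calling recv() and passing each buffer to this
helper until it returns non-zero. An application that instead assumes
NLMSG_DONE always arrives in its own recv() will block forever once the
kernel coalesces it with the last data message.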