netdev - Re: Netlink NLM_F_DUMP

Message-ID: <20220616171016.56d4ec9c@pirotess>
Date:   Thu, 16 Jun 2022 17:10:16 +0200
From:   Ismael Luceno <iluceno@...e.de>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Paolo Abeni <pabeni@...hat.com>,
        David Ahern <dsahern@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Netlink NLM_F_DUMP_INTR flag lost

On Wed, 15 Jun 2022 09:00:44 -0700
Jakub Kicinski <kuba@...nel.org> wrote:
> On Wed, 15 Jun 2022 17:11:13 +0200 Ismael Luceno wrote:
> > It seems a RTM_GETADDR request with AF_UNSPEC has a corner case
> > where the NLM_F_DUMP_INTR flag is lost.
> > 
> > After a change in an address table, if a packet has been fully
> > filled just previous, and if the end of the table is found at the
> > same time, then the next packet should be flagged, which works fine
> > when it's NLMSG_DONE, but gets clobbered when another table is to
> > be dumped next.
> 
> Could you describe how it gets clobbered? You mean that prev_seq gets
> updated somewhere without setting the flag or something overwrites
> nlmsg_flags? Or we set _INTR on an empty skb which never ends up
> getting sent? Or..

It seems to me that most functions, and specifically
net/ipv4/devinet.c:in_dev_dump_addr and inet_netconf_dump_devconf, call
nl_dump_check_consistent only after each address/attribute is dumped.
That means a hash table generation change happening just after an entry
is added wouldn't be flagged if that entry also turns out to be the
last one in the table.

Adding an extra call at the end of all these functions should fix that,
and spill the flag into the next packet, but would that be correct?
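
To make the corner case concrete, here is a minimal userspace sketch. It is a simulation, not kernel code: every sim_* name is hypothetical, and the table, packet and generation counter are reduced to plain integers. Only the body of sim_check_consistent is meant to mirror what nl_dump_check_consistent does (compare cb->seq with cb->prev_seq, set the flag on a mismatch, then remember cb->seq).

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_F_DUMP_INTR 0x10	/* same bit value as NLM_F_DUMP_INTR */

/* Hypothetical stand-ins for struct netlink_callback and struct nlmsghdr. */
struct sim_cb { unsigned int seq, prev_seq; };
struct sim_hdr { uint16_t flags; };
struct sim_table { unsigned int gen; int n_entries; };

/* Mirrors the logic of nl_dump_check_consistent. */
static void sim_check_consistent(struct sim_cb *cb, struct sim_hdr *nlh)
{
	if (cb->prev_seq && cb->seq != cb->prev_seq)
		nlh->flags |= SIM_F_DUMP_INTR;
	cb->prev_seq = cb->seq;
}

/*
 * One dump invocation: emit up to `room` entries into `pkt`, checking
 * consistency after each entry, and optionally once more at the end.
 */
static int sim_dump_table(const struct sim_table *t, struct sim_cb *cb,
			  int *pos, int room, struct sim_hdr *pkt,
			  bool check_at_end)
{
	int emitted = 0;

	cb->seq = t->gen;	/* sample the generation for this call */
	while (*pos < t->n_entries && emitted < room) {
		(*pos)++;
		emitted++;
		sim_check_consistent(cb, pkt);
	}
	if (check_at_end)
		sim_check_consistent(cb, pkt);
	return emitted;
}

/* The table changes right after a packet that is full AND ends the table. */
static bool sim_change_detected(bool check_at_end)
{
	struct sim_table t = { .gen = 1, .n_entries = 4 };
	struct sim_cb cb = { 0, 0 };
	struct sim_hdr pkt1 = { 0 }, pkt2 = { 0 };
	int pos = 0;

	sim_dump_table(&t, &cb, &pos, 4, &pkt1, check_at_end);
	t.gen++;		/* change between the two invocations */
	sim_dump_table(&t, &cb, &pos, 4, &pkt2, check_at_end);
	return ((pkt1.flags | pkt2.flags) & SIM_F_DUMP_INTR) != 0;
}
```

In this model, sim_change_detected(false) reports no interruption because the second invocation has no entry left to trigger a check, while sim_change_detected(true) flags the second packet.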

It seems this condition is flagged correctly when NLMSG_DONE is
produced. I couldn't see why, but I'm guessing there is another call to
nl_dump_check_consistent...

Also, I noticed that in net/core/rtnetlink.c:rtnl_dump_all: 

	if (idx > s_idx) {
		memset(&cb->args[0], 0, sizeof(cb->args));
		cb->prev_seq = 0;
		cb->seq = 0;
	}
	ret = dumpit(skb, cb);

This also prevents the condition from being detected when dumping the
next table, but that seems desirable...

Am I grasping it correctly?
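
The effect of the reset in rtnl_dump_all quoted above can be sketched the same way. Again this is a hypothetical userspace model (all sim_* names and the generation values 1 and 7 are made up); it only shows why a zeroed prev_seq both hides a pending interruption and avoids comparing the unrelated sequence counters of two different tables.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_F_DUMP_INTR 0x10	/* same bit value as NLM_F_DUMP_INTR */

/* Hypothetical stand-ins for struct netlink_callback and struct nlmsghdr. */
struct sim_cb { unsigned int seq, prev_seq; };
struct sim_hdr { uint16_t flags; };

/* Mirrors the logic of nl_dump_check_consistent. */
static void sim_check_consistent(struct sim_cb *cb, struct sim_hdr *nlh)
{
	if (cb->prev_seq && cb->seq != cb->prev_seq)
		nlh->flags |= SIM_F_DUMP_INTR;
	cb->prev_seq = cb->seq;
}

/*
 * Dump one entry from table A (generation 1), then one entry from
 * table B (generation 7), optionally resetting the callback state in
 * between as rtnl_dump_all does. Returns whether B's packet is flagged.
 */
static bool sim_next_table_flagged(bool reset_between)
{
	struct sim_cb cb = { 0, 0 };
	struct sim_hdr pkt_a = { 0 }, pkt_b = { 0 };

	cb.seq = 1;			/* table A's generation */
	sim_check_consistent(&cb, &pkt_a);

	if (reset_between) {
		cb.prev_seq = 0;	/* prev_seq == 0 disables the check */
		cb.seq = 0;
	}

	cb.seq = 7;			/* table B's unrelated generation */
	sim_check_consistent(&cb, &pkt_b);
	return (pkt_b.flags & SIM_F_DUMP_INTR) != 0;
}
```

Without the reset, the first packet of table B would be flagged merely because the two counters differ; with it, nothing carried over from table A can be reported, including a real change.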

Some functions like net/core/rtnetlink.c:rtnl_dump_ifinfo do call
nl_dump_check_consistent when finishing, but I'm seeing most others
don't do that, instead doing it only when adding an entry to the packet
(another example is: rtnl_stats_dump).

Again, while adding the check at the end of each function would solve
these inconsistencies, it isn't so clear to me that spilling this flag
into the next packet when it's going to be from another table is a good
idea.

It might make more sense to emit a new packet type just for the flag.
That way, in the sequence of packets, the client can reliably tell
which table's dump was interrupted and make some decision based on
that, instead of having to deem all tables affected...
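
For reference, this is roughly what a client can do today with the flag alone. It is a sketch, assuming the receive buffer holds complete netlink messages; dump_was_interrupted is a hypothetical helper name, while NLMSG_OK, NLMSG_NEXT and NLM_F_DUMP_INTR come from <linux/netlink.h>. It can only report that something in the dump was interrupted, not which table.

```c
#include <linux/netlink.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Walk every netlink message in a receive buffer and report whether
 * any of them carries NLM_F_DUMP_INTR.
 */
static bool dump_was_interrupted(void *buf, size_t len)
{
	struct nlmsghdr *nlh = buf;
	int remaining = (int)len;
	bool interrupted = false;

	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
		if (nlh->nlmsg_flags & NLM_F_DUMP_INTR)
			interrupted = true;
	}
	return interrupted;
}
```

A caller would typically restart the whole dump when this returns true, which for an AF_UNSPEC request means re-reading every table.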

-- 
Ismael Luceno
SUSE L3 Support