netdev - Re: Netlink NLM_F_DUMP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220616171612.66638e54@kernel.org>
Date:   Thu, 16 Jun 2022 17:16:12 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Ismael Luceno <iluceno@...e.de>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Paolo Abeni <pabeni@...hat.com>,
        David Ahern <dsahern@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Netlink NLM_F_DUMP_INTR flag lost

On Thu, 16 Jun 2022 17:10:16 +0200 Ismael Luceno wrote:
> On Wed, 15 Jun 2022 09:00:44 -0700
> Jakub Kicinski <kuba@...nel.org> wrote:
> > On Wed, 15 Jun 2022 17:11:13 +0200 Ismael Luceno wrote:  
> > > It seems a RTM_GETADDR request with AF_UNSPEC has a corner case
> > > where the NLM_F_DUMP_INTR flag is lost.
> > > 
> > > After a change in an address table, if a packet has been fully
> > > filled just previous, and if the end of the table is found at the
> > > same time, then the next packet should be flagged, which works fine
> > > when it's NLMSG_DONE, but gets clobbered when another table is to
> > > be dumped next.  
> > 
> > Could you describe how it gets clobbered? You mean that prev_seq gets
> > updated somewhere without setting the flag or something overwrites
> > nlmsg_flags? Or we set _INTR on an empty skb which never ends up
> > getting sent? Or..  
> 
> It seems to me that in most functions, but specifically in the case of
> net/ipv4/devinet.c:in_dev_dump_addr or inet_netconf_dump_devconf,
> nl_dump_check_consistent is called after each address/attribute is
> dumped, meaning a hash table generation change happening just after it
> adds an entry, if it also causes it to find the end of the table,
> wouldn't be flagged.
> 
> Adding an extra call at the end of all these functions should fix that,
> and spill the flag into the next packet, but would that be correct?

The whole _DUMP_INTR scheme does indeed seem to lack robustness.
The update side changes the atomic after the update, and the read
side validates it before. So there's plenty time for a race to happen.
But I'm not sure if that's what you mean or you see more issues.

> It seems this condition is flagged correctly when NLM_DONE is produced,
> I couldn't see why, but I'm guessing another call to
> nl_dump_check_consistent...
> 
> Also, I noticed that in net/core/rtnetlink.c:rtnl_dump_all: 
> 
> 	if (idx > s_idx) {
> 		memset(&cb->args[0], 0, sizeof(cb->args));
> 		cb->prev_seq = 0;
> 		cb->seq = 0;
> 	}
> 	ret = dumpit(skb, cb);
> 
> This also prevents it to be detect the condition when dumping the next
> table, but that seems desirable...

That's iterating over protocols, AFAICT, we don't guarantee consistency
across protocols.

> Am I grasping it correctly?
> 
> Some functions like net/core/rtnetlink.c:rtnl_dump_ifinfo do call
> nl_dump_check_consistent when finishing, but I'm seeing most others
> don't do that, instead doing it only when adding an entry to the packet
> (another example is: rtnl_stats_dump).
> 
> Again, while adding the check at the end of each function would solve
> these inconsistencies, it isn't so clear to me that spilling this flag
> into the next packet when it's going to be from another table is a good
> idea.
> 
> It might make more sense to emit a new packet type just for the flag,
> that way, in the sequence of packets, the client can reliably tell the
> dump of which tables was interrupted, and make some decision based on
> that, vs having to deem all tables affected...
>