lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 17 Jun 2022 08:55:53 -0600
From:   David Ahern <dsahern@...il.com>
To:     Ismael Luceno <iluceno@...e.de>, Jakub Kicinski <kuba@...nel.org>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Paolo Abeni <pabeni@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Netlink NLM_F_DUMP_INTR flag lost

On 6/17/22 7:01 AM, Ismael Luceno wrote:
> On Thu, 16 Jun 2022 17:16:12 -0700
> Jakub Kicinski <kuba@...nel.org> wrote:
>> On Thu, 16 Jun 2022 17:10:16 +0200 Ismael Luceno wrote:
>>> On Wed, 15 Jun 2022 09:00:44 -0700
>>> Jakub Kicinski <kuba@...nel.org> wrote:  
>>>> On Wed, 15 Jun 2022 17:11:13 +0200 Ismael Luceno wrote:    
>>>>> It seems a RTM_GETADDR request with AF_UNSPEC has a corner case
>>>>> where the NLM_F_DUMP_INTR flag is lost.
>>>>>
>>>>> After a change in an address table, if a packet has been fully
>>>>> filled just previous, and if the end of the table is found at
>>>>> the same time, then the next packet should be flagged, which
>>>>> works fine when it's NLMSG_DONE, but gets clobbered when
>>>>> another table is to be dumped next.    
>>>>
>>>> Could you describe how it gets clobbered? You mean that prev_seq
>>>> gets updated somewhere without setting the flag or something
>>>> overwrites nlmsg_flags? Or we set _INTR on an empty skb which
>>>> never ends up getting sent? Or..    
>>>
>>> It seems to me that in most functions, but specifically in the case
>>> of net/ipv4/devinet.c:in_dev_dump_addr or inet_netconf_dump_devconf,
>>> nl_dump_check_consistent is called after each address/attribute is
>>> dumped, meaning a hash table generation change happening just after
>>> it adds an entry, if it also causes it to find the end of the table,
>>> wouldn't be flagged.
>>>
>>> Adding an extra call at the end of all these functions should fix
>>> that, and spill the flag into the next packet, but would that be
>>> correct?  
>>
>> The whole _DUMP_INTR scheme does indeed seem to lack robustness.
>> The update side changes the atomic after the update, and the read
>> side validates it before. So there's plenty time for a race to happen.
>> But I'm not sure if that's what you mean or you see more issues.
> 
> No, I'm concerned that while in the dumping loop, the table might
> change between iterations, and if this results in the loop not finding
> more entries, because in most these functions there's no
> consistency check after the loop, this will go undetected.

Specific example? e.g., fib dump and address dumps have a generation id
that gets recorded before the start of the dump and checked at the end
of the dump.

> 
>>> It seems this condition is flagged correctly when NLM_DONE is
>>> produced, I couldn't see why, but I'm guessing another call to
>>> nl_dump_check_consistent...
>>>
>>> Also, I noticed that in net/core/rtnetlink.c:rtnl_dump_all: 
>>>
>>> 	if (idx > s_idx) {
>>> 		memset(&cb->args[0], 0, sizeof(cb->args));
>>> 		cb->prev_seq = 0;
>>> 		cb->seq = 0;
>>> 	}
>>> 	ret = dumpit(skb, cb);
>>>
>>> This also prevents it to be detect the condition when dumping the
>>> next table, but that seems desirable...  
>>
>> That's iterating over protocols, AFAICT, we don't guarantee
>> consistency across protocols.
> 
> That's reasonable, I was just wondering about it because it does seem
> reasonable that the flags affect only the packets describing the table
> whose dump got interrupted...
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ