Message-ID: <96ec3de5-76a8-4d72-b8d7-feedff4a3af8@orange.com>
Date: Tue, 19 Nov 2024 07:46:32 +0100
From: Alexandre Ferrieux <alexandre.ferrieux@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>,
Alexandre Ferrieux <alexandre.ferrieux@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, edumazet@...gle.com, jhs@...atatu.com,
jiri@...nulli.us, horms@...nel.org, netdev@...r.kernel.org
Subject: Re: RFC: chasing all idr_remove() misses

On 19/11/2024 04:51, Cong Wang wrote:
> On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
>> Hi,
>>
>> In the recent fix of u32's IDR leaks, one side remark is that the problem went
>> unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
>> this call site.
>> [...]
>> So, unless we have reasons to think cls_u32 was the only place where two ID
>> encodings might lend themselves to confusion, I'm wondering if it wouldn't make
>> sense to chase the issue more systematically:
>>
>> - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
>> (a year-long endeavor implying tens of maintainers)
>>
>> - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
>> or even radix_tree_delete_item().
>>
>> Opinions ?
>
> Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
> common pattern.

No, in the general case, idr_destroy() only happens at the end of life of an IDR
set. Some structures in the kernel have a long lifetime, which means a leak can
slip out of fuzzers' scrutiny.

As an illustration, in cls_u32 itself, in the 2048-delete-add loop I use in the
tdc test committed with the fix, idr_destroy(&tp_c->handle_idr) is called only
at the "cleanup" step, when deleting the interface.

You can only imagine, in the hundreds of other uses of IDR, the "miss rate" that
would follow from targeting idr_destroy() instead of idr_remove().
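
To make the timing difference concrete, here is a toy userspace sketch. It is
not kernel code: toy_idr, buggy_encode() and toy_add_delete_loop() are
hypothetical names, and the "wrong encoding" is simply id + 1, chosen only so
that every remove misses. It assumes a fixed-size table in place of the real
radix tree. The point is only that a check at remove time fires on the first
iteration, while a check at destroy time sees nothing until end of life.

```c
#include <stddef.h>

#define TOY_IDR_SLOTS 64

/* Toy stand-in for an IDR: id -> pointer, NULL means "no entry". */
struct toy_idr {
	void *slot[TOY_IDR_SLOTS];
	unsigned long remove_misses; /* bumped where a WARN_ON_ONCE() would sit */
};

/* Store ptr at a caller-chosen id; returns the id, or -1 on failure. */
static int toy_idr_alloc(struct toy_idr *idr, void *ptr, unsigned int id)
{
	if (id >= TOY_IDR_SLOTS || ptr == NULL || idr->slot[id] != NULL)
		return -1;
	idr->slot[id] = ptr;
	return (int)id;
}

/* Like idr_remove(): returns the removed pointer, or NULL if id was absent. */
static void *toy_idr_remove(struct toy_idr *idr, unsigned int id)
{
	void *old = (id < TOY_IDR_SLOTS) ? idr->slot[id] : NULL;

	if (old == NULL) {
		idr->remove_misses++; /* remove-time check fires here */
		return NULL;
	}
	idr->slot[id] = NULL;
	return old;
}

/* Entries still present: what a check placed in the destroy path would see. */
static unsigned int toy_idr_leftovers(const struct toy_idr *idr)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < TOY_IDR_SLOTS; i++)
		if (idr->slot[i] != NULL)
			n++;
	return n;
}

/* Hypothetical bug: the remove path uses a different encoding of the ID. */
static unsigned int buggy_encode(unsigned int id)
{
	return id + 1;
}

struct toy_report {
	int first_miss_iter;     /* first iteration where remove missed, -1 if none */
	unsigned long misses;    /* total misses seen at remove time */
	unsigned int leftovers;  /* entries leaked, visible only at destroy time */
};

/* Add-then-delete loop where every delete uses the wrong encoding. */
static struct toy_report toy_add_delete_loop(unsigned int iters)
{
	static int payload; /* any non-NULL pointer will do */
	struct toy_idr idr = {0};
	struct toy_report r = { .first_miss_iter = -1 };

	if (iters > TOY_IDR_SLOTS)
		iters = TOY_IDR_SLOTS;

	for (unsigned int i = 0; i < iters; i++) {
		toy_idr_alloc(&idr, &payload, i);
		if (toy_idr_remove(&idr, buggy_encode(i)) == NULL &&
		    r.first_miss_iter < 0)
			r.first_miss_iter = (int)i;
	}

	r.misses = idr.remove_misses;
	r.leftovers = toy_idr_leftovers(&idr);
	return r;
}
```

Calling toy_add_delete_loop(8) reports the first miss at iteration 0, while the
eight leaked entries only become countable once the loop is over, which is the
position a check in idr_destroy() would be in.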