[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b3884ff9-d903-948d-797a-1830a39b1e71@intel.com>
Date: Thu, 20 Jul 2023 20:13:07 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Larysa Zaremba <larysa.zaremba@...el.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
Alexander Duyck <alexanderduyck@...com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
"Ilias Apalodimas" <ilias.apalodimas@...aro.org>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC net-next v2 7/7] net: skbuff: always try to recycle PP
pages directly when in softirq
From: Alexander Lobakin <aleksander.lobakin@...el.com>
Date: Thu, 20 Jul 2023 20:01:28 +0200
> From: Jakub Kicinski <kuba@...nel.org>
> Date: Thu, 20 Jul 2023 11:00:27 -0700
>
>> On Thu, 20 Jul 2023 19:48:06 +0200 Alexander Lobakin wrote:
>>>>> My question was "how can two things race on one CPU in one context if it
>>>>> implies they won't ever happen simultaneously", but maybe my zero
>>>>> knowledge of netcons hides something from me.
>>>>
>>>> One of them is in hardirq.
>>>
>>> If I got your message correctly, that means softirq_count() can return
>>> `true` even if we're in hardirq context, but there are some softirqs
>>> pending?
>>
>> Not pending, being executed. Hardirq can come during softirq.
>>
>>> I.e. if I call local_irq_save() inside NAPI poll loop,
>>> in_softirq() will still return `true`? (I'll check it myself in a bit,
>>> but why not ask).
>>
>> Yes.
My code, run from the NAPI poll loop:
pr_info("BH. irq_count(): %lu", irq_count());
pr_info("BH. in_hardirq(): %lu", in_hardirq());
pr_info("BH. in_softirq(): %lu", in_softirq());
pr_info("BH. in_serving_softirq(): %lu", in_serving_softirq());
pr_info("BH. interrupt_context_level(): %u",
interrupt_context_level());
local_irq_save(flags);
pr_info("TH. irq_count(): %lu", irq_count());
pr_info("TH. in_hardirq(): %lu", in_hardirq());
pr_info("TH. in_softirq(): %lu", in_softirq());
pr_info("TH. in_serving_softirq(): %lu", in_serving_softirq());
pr_info("TH. interrupt_context_level(): %u",
interrupt_context_level());
local_irq_restore(flags);
The result:
BH. irq_count(): 256
BH. in_hardirq(): 0
BH. in_softirq(): 256
BH. in_serving_softirq(): 256
BH. interrupt_context_level(): 1
TH. irq_count(): 256
TH. in_hardirq(): 0
TH. in_softirq(): 256
TH. in_serving_softirq(): 256
TH. interrupt_context_level(): 1
IOW, it reports we're in softirq no bloody matter if interrupts are
enabled or not. Either I did something wrong or the entire in_*irq()
family, including interrupt_context_level(), doesn't protect from
anything at all and doesn't work the way that most devs expect it to work?
(or was it just me? :D)
I guess the only way to be sure is to always check irqs_disabled() when
in_softirq() returns true.
>>
>>> Isn't checking for `interrupt_context_level() == 1` more appropriate
>>> then? Page Pool core code also uses in_softirq(), as well as a hellaton
>>> of other networking-related places.
>>
>> Right now page pool only supports BH and process contexts. IOW the
>> "else" branch of if (in_softirq()) in page pool is expecting to be
>> in process context.
>>
>> Supporting hard irq would mean we need to switch to _irqsave() locking.
>> That's likely way too costly.
>>
>> Or stash the freed pages away and free them lazily.
>>
>> Or add a lockdep warning and hope nobody will ever free a page-pool
>> backed skb from hard IRQ context :)
>
> I told you under the previous version that this function is not supposed
> to be called under hardirq context, so we don't need to check for it :D
> But I was assuming nobody would try to do that. Seems like not really
> (netcons) if you want to sanitize this...
>
> Thanks,
> Olek
Thanks,
Olek
Powered by blists - more mailing lists