Message-ID: <8e65c3d3-c628-2176-2fc2-a1bc675ad607@intel.com>
Date:   Thu, 20 Jul 2023 19:48:06 +0200
From:   Alexander Lobakin <aleksander.lobakin@...el.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Paolo Abeni <pabeni@...hat.com>,
        Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        Larysa Zaremba <larysa.zaremba@...el.com>,
        Yunsheng Lin <linyunsheng@...wei.com>,
        Alexander Duyck <alexanderduyck@...com>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        "Ilias Apalodimas" <ilias.apalodimas@...aro.org>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC net-next v2 7/7] net: skbuff: always try to recycle PP
 pages directly when in softirq

From: Jakub Kicinski <kuba@...nel.org>
Date: Thu, 20 Jul 2023 10:12:31 -0700

> On Thu, 20 Jul 2023 18:46:02 +0200 Alexander Lobakin wrote:
>> From: Jakub Kicinski <kuba@...nel.org>
>> Date: Wed, 19 Jul 2023 13:51:50 -0700
>>
>>> On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:  
>>  [...]  
>>>>
>>>> If we're on the same CPU where the NAPI would run and in the same
>>>> context, i.e. softirq, in which the NAPI would run, what is the problem?
>>>> If there really is a good one, I can handle it here.  
>>>
>>> #define SOFTIRQ_BITS		8
>>> #define SOFTIRQ_MASK		(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
>>> # define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
>>> #define in_softirq()		(softirq_count())  
>>
>> I do remember those, don't worry :)
>>
>>> I don't know what else to add beyond that and the earlier explanation.  
>>
>> My question was "how can two things race on one CPU in one context if it
>> implies they won't ever happen simultaneously", but maybe my zero
>> knowledge of netcons hides something from me.
> 
> One of them is in hardirq.

If I got your message correctly, that means softirq_count() can return
`true` even if we're in hardirq context, as long as there are some
softirqs pending? I.e. if I call local_irq_save() inside a NAPI poll
loop, in_softirq() will still return `true`? (I'll check it myself in a
bit, but why not ask.)
Isn't checking for `interrupt_context_level() == 1` more appropriate
then? The Page Pool core code also uses in_softirq(), as do a hellaton
of other networking-related places.
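
For reference, here is a user-space sketch of how these checks relate.
It is a model, not kernel code: the bit widths and the level computation
are assumed to mirror include/linux/preempt.h, and all model_* names are
made up for illustration. It shows that when a hardirq interrupts
softirq processing, the softirq bits stay set, so an in_softirq()-style
check remains true while the context level goes from 1 to 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed preempt_count layout (user-space model, not the real header). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	4

#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	(PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define IRQ_MASK(bits)	((1UL << (bits)) - 1)
#define SOFTIRQ_MASK	(IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(IRQ_MASK(HARDIRQ_BITS) << HARDIRQ_SHIFT)
#define NMI_MASK	(IRQ_MASK(NMI_BITS) << NMI_SHIFT)

#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET	(1UL << HARDIRQ_SHIFT)

/* Hypothetical stand-in for in_softirq(): any softirq bit is set. */
static inline bool model_in_softirq(unsigned long pc)
{
	return (pc & SOFTIRQ_MASK) != 0;
}

/* Hypothetical stand-in for in_hardirq(): any hardirq bit is set. */
static inline bool model_in_hardirq(unsigned long pc)
{
	return (pc & HARDIRQ_MASK) != 0;
}

/* Hypothetical stand-in for interrupt_context_level():
 * 0 = task, 1 = softirq, 2 = hardirq, 3 = NMI.
 */
static inline unsigned int model_context_level(unsigned long pc)
{
	unsigned int level = 0;

	level += !!(pc & NMI_MASK);
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK));
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

	return level;
}

/* Walk through the case discussed: a hardirq arriving during softirq. */
static inline void model_selfcheck(void)
{
	unsigned long pc = SOFTIRQ_OFFSET;	/* serving a softirq */

	assert(model_in_softirq(pc) && !model_in_hardirq(pc));
	assert(model_context_level(pc) == 1);

	pc += HARDIRQ_OFFSET;			/* hardirq interrupts it */

	assert(model_in_softirq(pc));		/* still true */
	assert(model_in_hardirq(pc));
	assert(model_context_level(pc) == 2);
}
```

In this model, a count of SOFTIRQ_OFFSET + HARDIRQ_OFFSET passes the
in_softirq()-style check but is rejected by both a !in_hardirq() check
and a level == 1 check.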


> 
>>> AFAIK pages as allocated by page pool do not benefit from the usual
>>> KASAN / KMSAN checkers, so if we were to double-recycle a page once
>>> a day because of a netcons race - it's going to be a month long debug
>>> for those of us using Linux in production.  
>>
>> if (!test_bit(NAPI_STATE_NPSVC, &napi->state))
> 
> If you have to, the right check is !in_hardirq()
> 
>> ? It would mean we're not netpolling.
>> Otherwise, if this still is not enough, I'd go back to my v1 approach
>> with having a NAPI flag, which would tell for sure we're good to go. I
>> got confused by your "wouldn't just checking for softirq be enough"! T.T
>> Joking :D
> 
> I guess the problem I'm concerned about can already happen.
> I'll send a lockdep annotation shortly.

Interesten.

Thanks,
Olek
