[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c2a64979-73d1-2c22-e048-c275c9f81558@samsung.com>
Date: Thu, 17 Feb 2022 15:08:55 +0100
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Toke Høiland-Jørgensen
<toke@...e.dk>,
Toke Høiland-Jørgensen
<toke@...hat.com>
Subject: Re: [PATCH net-next] net: Correct wrong BH disable in
hard-interrupt.
Hi All,
On 17.02.2022 07:35, Marek Szyprowski wrote:
> Hi Andrzej,
>
> On 16.02.2022 18:50, Sebastian Andrzej Siewior wrote:
>> I missed the obvious case where netif_ix() is invoked from hard-IRQ
>> context.
>>
>> Disabling bottom halves is only needed in process context. This ensures
>> that the code remains on the current CPU and that the soft-interrupts
>> are processed at local_bh_enable() time.
>> In hard- and soft-interrupt context this is already the case and the
>> soft-interrupts will be processed once the context is left (at irq-exit
>> time).
>>
>> Disable bottom halves if neither hard-interrupts nor soft-interrupts are
>> disabled. Update the kernel-doc, mention that interrupts must be enabled
>> if invoked from process context.
>>
>> Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked
>> in any context.")
>> Reported-by: Marek Szyprowski <m.szyprowski@...sung.com>
>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
>> ---
>> Marek, does this work for you?
>
> Yes, this fixed the issue. Thanks!
>
> Tested-by: Marek Szyprowski <m.szyprowski@...sung.com>
I've just noticed that there is one more issue left to fix (the $subject
patch is already applied) - this one comes from threaded irq (if I got
the stack trace right):
------------[ cut here ]------------
WARNING: CPU: 0 PID: 147 at kernel/softirq.c:363
__local_bh_enable_ip+0xa8/0x1c0
Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac
brcmutil cfg80211 s3fwrn5_i2c s3fwrn5 nci crct10dif_ce exynos_gsc nfc
s5p_jpeg hci_uart btqca s
5p_mfc v4l2_mem2mem btbcm bluetooth videobuf2_dma_contig
videobuf2_memops ecdh_generic videobuf2_v4l2 panfrost drm_shmem_helper
ecc videobuf2_common gpu_sched rfkill videodev mc ip_tables x_tab
les ipv6
CPU: 0 PID: 147 Comm: irq/150-dwc3 Not tainted 5.17.0-rc4-next-20220217+
#4557
Hardware name: Samsung TM2E board (DT)
pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __local_bh_enable_ip+0xa8/0x1c0
lr : netif_rx+0xa4/0x2c0
...
Call trace:
__local_bh_enable_ip+0xa8/0x1c0
netif_rx+0xa4/0x2c0
rx_complete+0x214/0x250
usb_gadget_giveback_request+0x58/0x170
dwc3_gadget_giveback+0xe4/0x200
dwc3_gadget_endpoint_trbs_complete+0x100/0x388
dwc3_thread_interrupt+0x46c/0xe20
irq_thread_fn+0x28/0x98
irq_thread+0x184/0x238
kthread+0x100/0x120
ret_from_fork+0x10/0x20
irq event stamp: 645
hardirqs last enabled at (643): [<ffff8000080c93b8>]
finish_task_switch+0x98/0x288
hardirqs last disabled at (644): [<ffff8000090a6e34>]
_raw_spin_lock_irqsave+0xb4/0x148
softirqs last enabled at (252): [<ffff800008010488>] _stext+0x488/0x5cc
softirqs last disabled at (645): [<ffff800008ed71b0>] netif_rx+0x188/0x2c0
---[ end trace 0000000000000000 ]---
>
>> net/core/dev.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 909fb38159108..87729491460fc 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4860,7 +4860,9 @@ EXPORT_SYMBOL(__netif_rx);
>> * congestion control or by the protocol layers.
>> * The network buffer is passed via the backlog NAPI device.
>> Modern NIC
>> * driver should use NAPI and GRO.
>> - * This function can used from any context.
>> + * This function can used from interrupt and from process
>> context. The
>> + * caller from process context must not disable interrupts before
>> invoking
>> + * this function.
>> *
>> * return values:
>> * NET_RX_SUCCESS (no congestion)
>> @@ -4870,12 +4872,15 @@ EXPORT_SYMBOL(__netif_rx);
>> int netif_rx(struct sk_buff *skb)
>> {
>> int ret;
>> + bool need_bh_off = !(hardirq_count() | softirq_count());
>> - local_bh_disable();
>> + if (need_bh_off)
>> + local_bh_disable();
>> trace_netif_rx_entry(skb);
>> ret = netif_rx_internal(skb);
>> trace_netif_rx_exit(ret);
>> - local_bh_enable();
>> + if (need_bh_off)
>> + local_bh_enable();
>> return ret;
>> }
>> EXPORT_SYMBOL(netif_rx);
>
> Best regards
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists