lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Feb 2022 15:08:55 +0100
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Toke Høiland-Jørgensen 
        <toke@...e.dk>,
        Toke Høiland-Jørgensen 
        <toke@...hat.com>
Subject: Re: [PATCH net-next] net: Correct wrong BH disable in
 hard-interrupt.

Hi All,

On 17.02.2022 07:35, Marek Szyprowski wrote:
> Hi Andrzej,
>
> On 16.02.2022 18:50, Sebastian Andrzej Siewior wrote:
>> I missed the obvious case where netif_ix() is invoked from hard-IRQ
>> context.
>>
>> Disabling bottom halves is only needed in process context. This ensures
>> that the code remains on the current CPU and that the soft-interrupts
>> are processed at local_bh_enable() time.
>> In hard- and soft-interrupt context this is already the case and the
>> soft-interrupts will be processed once the context is left (at irq-exit
>> time).
>>
>> Disable bottom halves if neither hard-interrupts nor soft-interrupts are
>> disabled. Update the kernel-doc, mention that interrupts must be enabled
>> if invoked from process context.
>>
>> Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked 
>> in any context.")
>> Reported-by: Marek Szyprowski <m.szyprowski@...sung.com>
>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
>> ---
>> Marek, does this work for you?
>
> Yes, this fixed the issue. Thanks!
>
> Tested-by: Marek Szyprowski <m.szyprowski@...sung.com>

I've just noticed that there is one more issue left to fix (the $subject 
patch is already applied) - this one comes from threaded irq (if I got 
the stack trace right):

------------[ cut here ]------------
WARNING: CPU: 0 PID: 147 at kernel/softirq.c:363 
__local_bh_enable_ip+0xa8/0x1c0
Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac 
brcmutil cfg80211 s3fwrn5_i2c s3fwrn5 nci crct10dif_ce exynos_gsc nfc 
s5p_jpeg hci_uart btqca s
5p_mfc v4l2_mem2mem btbcm bluetooth videobuf2_dma_contig 
videobuf2_memops ecdh_generic videobuf2_v4l2 panfrost drm_shmem_helper 
ecc videobuf2_common gpu_sched rfkill videodev mc ip_tables x_tab
les ipv6
CPU: 0 PID: 147 Comm: irq/150-dwc3 Not tainted 5.17.0-rc4-next-20220217+ 
#4557
Hardware name: Samsung TM2E board (DT)
pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __local_bh_enable_ip+0xa8/0x1c0
lr : netif_rx+0xa4/0x2c0
...

Call trace:
  __local_bh_enable_ip+0xa8/0x1c0
  netif_rx+0xa4/0x2c0
  rx_complete+0x214/0x250
  usb_gadget_giveback_request+0x58/0x170
  dwc3_gadget_giveback+0xe4/0x200
  dwc3_gadget_endpoint_trbs_complete+0x100/0x388
  dwc3_thread_interrupt+0x46c/0xe20
  irq_thread_fn+0x28/0x98
  irq_thread+0x184/0x238
  kthread+0x100/0x120
  ret_from_fork+0x10/0x20
irq event stamp: 645
hardirqs last  enabled at (643): [<ffff8000080c93b8>] 
finish_task_switch+0x98/0x288
hardirqs last disabled at (644): [<ffff8000090a6e34>] 
_raw_spin_lock_irqsave+0xb4/0x148
softirqs last  enabled at (252): [<ffff800008010488>] _stext+0x488/0x5cc
softirqs last disabled at (645): [<ffff800008ed71b0>] netif_rx+0x188/0x2c0
---[ end trace 0000000000000000 ]---


>
>>   net/core/dev.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 909fb38159108..87729491460fc 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4860,7 +4860,9 @@ EXPORT_SYMBOL(__netif_rx);
>>    *    congestion control or by the protocol layers.
>>    *    The network buffer is passed via the backlog NAPI device. 
>> Modern NIC
>>    *    driver should use NAPI and GRO.
>> - *    This function can used from any context.
>> + *    This function can used from interrupt and from process 
>> context. The
>> + *    caller from process context must not disable interrupts before 
>> invoking
>> + *    this function.
>>    *
>>    *    return values:
>>    *    NET_RX_SUCCESS    (no congestion)
>> @@ -4870,12 +4872,15 @@ EXPORT_SYMBOL(__netif_rx);
>>   int netif_rx(struct sk_buff *skb)
>>   {
>>       int ret;
>> +    bool need_bh_off = !(hardirq_count() | softirq_count());
>>   -    local_bh_disable();
>> +    if (need_bh_off)
>> +        local_bh_disable();
>>       trace_netif_rx_entry(skb);
>>       ret = netif_rx_internal(skb);
>>       trace_netif_rx_exit(ret);
>> -    local_bh_enable();
>> +    if (need_bh_off)
>> +        local_bh_enable();
>>       return ret;
>>   }
>>   EXPORT_SYMBOL(netif_rx);
>
> Best regards

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ