Message-ID: <992dcaf7-2b24-4e91-8c69-a5471da209ae@alu.unizg.hr>
Date: Tue, 17 Oct 2023 22:43:36 +0200
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Simon Horman <horms@...nel.org>
Cc: Heiner Kallweit <hkallweit1@...il.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, nic_swsd@...ltek.com,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Marco Elver <elver@...gle.com>
Subject: Re: [PATCH v2 3/3] r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1

On 10/17/23 22:01, Simon Horman wrote:
> On Mon, Oct 16, 2023 at 11:47:56PM +0200, Mirsad Goran Todorovac wrote:
>> KCSAN reported the following data-race:
>>
>> ==================================================================
>> BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
>>
>> race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
>> rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
>> __napi_poll (net/core/dev.c:6527)
>> net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
>> __do_softirq (kernel/softirq.c:553)
>> __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
>> irq_exit_rcu (kernel/softirq.c:647)
>> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
>> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
>> cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
>> cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
>> call_cpuidle (kernel/sched/idle.c:135)
>> do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
>> cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
>> start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
>> secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
>>
>> value changed: 0xb0000042 -> 0x00000000
>>
>> Reported by Kernel Concurrency Sanitizer on:
>> CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
>> Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> ==================================================================
>>
>> The read side is in
>>
>> drivers/net/ethernet/realtek/r8169_main.c
>> =========================================
>> 4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
>> 4356 int budget)
>> 4357 {
>> 4358 unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
>> 4359 struct sk_buff *skb;
>> 4360
>> 4361 dirty_tx = tp->dirty_tx;
>> 4362
>> 4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) {
>> 4364 unsigned int entry = dirty_tx % NUM_TX_DESC;
>> 4365 u32 status;
>> 4366
>> → 4367 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
>> 4368 if (status & DescOwn)
>> 4369 break;
>> 4370
>> 4371 skb = tp->tx_skb[entry].skb;
>> 4372 rtl8169_unmap_tx_skb(tp, entry);
>> 4373
>> 4374 if (skb) {
>> 4375 pkts_compl++;
>> 4376 bytes_compl += skb->len;
>> 4377 napi_consume_skb(skb, budget);
>> 4378 }
>> 4379 dirty_tx++;
>> 4380 }
>> 4381
>> 4382 if (tp->dirty_tx != dirty_tx) {
>> 4383 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
>> 4384 WRITE_ONCE(tp->dirty_tx, dirty_tx);
>> 4385
>> 4386 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
>> 4387 rtl_tx_slots_avail(tp),
>> 4388 R8169_TX_START_THRS);
>> 4389 /*
>> 4390 * 8168 hack: TxPoll requests are lost when the Tx packets are
>> 4391 * too close. Let's kick an extra TxPoll request when a burst
>> 4392 * of start_xmit activity is detected (if it is not detected,
>> 4393 * it is slow enough). -- FR
>> 4394 * If skb is NULL then we come here again once a tx irq is
>> 4395 * triggered after the last fragment is marked transmitted.
>> 4396 */
>> 4397 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
>> 4398 rtl8169_doorbell(tp);
>> 4399 }
>> 4400 }
>>
>> tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
>> this KCSAN warning.
>>
>> 4366
>> → 4367 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
>> 4368 if (status & DescOwn)
>> 4369 break;
>> 4370
>>
>> Fixes: ^1da177e4c3f4 ("initial git repository build")
>
> Hi Mirsad,
>
> The fixes tag above seems wrong.

Hi, Simon,

The tag is taken directly from "git blame", as you can check for yourself.
It is meant to refer to code that predates the introduction of git.

If you have a better idea of how to denote such commits, I will be happy
to learn, but I have no better clue than what "git blame" gives ...

Best regards,
Mirsad Todorovac

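
For readers without the driver source at hand, the READ_ONCE() change quoted
above can be illustrated with a minimal userspace sketch. Everything in it is
an assumption made for illustration: READ_ONCE_U32(), struct tx_desc,
tx_desc_done() and count_completed() are hypothetical names, the descriptor
layout is simplified, and the le32_to_cpu() conversion is left out (a
little-endian host is assumed). It is not the r8169 code, only the same
pattern: read the descriptor word exactly once, then test the ownership bit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler emit exactly one load and forbids it from caching
 * or repeating the read. (Hypothetical helper, not the kernel macro.)
 */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Ownership bit: set while the device still owns the descriptor. */
#define DESC_OWN (1u << 31)

/* Hypothetical, simplified Tx descriptor. */
struct tx_desc {
	uint32_t opts1;
	uint32_t opts2;
	uint64_t addr;
};

/* True once the device has handed the descriptor back to the driver. */
static bool tx_desc_done(const struct tx_desc *desc)
{
	/* Single marked load of the word the device updates concurrently. */
	uint32_t status = READ_ONCE_U32(desc->opts1);

	return !(status & DESC_OWN);
}

/*
 * Count completed descriptors in [dirty, cur), stopping at the first
 * one the device still owns, as the loop in rtl_tx() does.
 */
static unsigned int count_completed(const struct tx_desc *ring,
				    unsigned int ring_size,
				    unsigned int dirty, unsigned int cur)
{
	unsigned int done = 0;

	while (dirty != cur) {
		if (!tx_desc_done(&ring[dirty % ring_size]))
			break;
		dirty++;
		done++;
	}
	return done;
}
```

With opts1 equal to 0xb0000042 (the "before" value in the KCSAN report) the
ownership bit is set and the loop stops; with 0x00000000 the descriptor counts
as completed.
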
>> Cc: Heiner Kallweit <hkallweit1@...il.com>
>> Cc: nic_swsd@...ltek.com
>> Cc: "David S. Miller" <davem@...emloft.net>
>> Cc: Eric Dumazet <edumazet@...gle.com>
>> Cc: Jakub Kicinski <kuba@...nel.org>
>> Cc: Paolo Abeni <pabeni@...hat.com>
>> Cc: Marco Elver <elver@...gle.com>
>> Cc: netdev@...r.kernel.org
>> Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
>> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
>> Acked-by: Marco Elver <elver@...gle.com>
>
> ...