Message-ID: <38269b10-e3d4-c49e-c240-0d511c6fa185@eikelenboom.it>
Date: Sun, 10 Feb 2019 15:18:22 +0100
From: Sander Eikelenboom <linux@...elenboom.it>
To: Heiner Kallweit <hkallweit1@...il.com>,
Realtek linux nic maintainers <nic_swsd@...ltek.com>,
David Miller <davem@...emloft.net>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net 2/2] Revert "r8169: make use of xmit_more and
__netdev_sent_queue"
On 10/02/2019 14:50, Heiner Kallweit wrote:
> This reverts commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>
> Sander reported a regression [1], therefore let's revert these commits.
> Removal of the barriers doesn't seem to contribute to the issue, the
> patch just overlaps with the problematic one and reverting
> "r8169: make use of xmit_more and __netdev_sent_queue" only wasn't
> tested.
This commit message is incorrect; it is only accurate for the other commit, bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3.
Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 caused the kernel panic on the BUG_ON() in lib/dynamic_queue_limits.c.
It could well be that the smp_wmb() barrier is what is needed (or the construction from the patch Eric proposed).
The splat from the BUG_ON() is below.
The same goes for the cover letter, which makes it seem rather benign, while the regression is actually a kernel panic.
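
For readers unfamiliar with the check that fired: the BUG_ON() in dql_completed() appears to reject a completion count that is larger than the number of bytes still accounted as queued. The following is a minimal userspace sketch of that accounting under that assumption, not the kernel code. struct bql_model and the bql_model_* functions are made-up names for illustration only.

```c
#include <stdbool.h>

/* Simplified userspace model of byte queue limit accounting.
 * Hypothetical names; this is not the kernel's struct dql. */
struct bql_model {
	unsigned int num_queued;    /* total bytes reported as queued */
	unsigned int num_completed; /* total bytes reported as completed */
};

/* TX path: account bytes handed to the device. */
static void bql_model_sent(struct bql_model *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/* Completion path: returns false in the situation where the kernel
 * check would fire, i.e. more bytes are reported completed than are
 * currently outstanding. */
static bool bql_model_completed(struct bql_model *q, unsigned int bytes)
{
	unsigned int outstanding = q->num_queued - q->num_completed;

	if (bytes > outstanding)
		return false;

	q->num_completed += bytes;
	return true;
}
```

In this model, a completion that is processed before the matching bql_model_sent() call has been accounted returns false. That is the kind of ordering problem between the xmit path and rtl_tx that a write barrier, or accounting the bytes before the descriptor is released, would be meant to prevent.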
--
Sander
[ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
[ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
[ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
[ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 6466.611579] RIP: e030:dql_completed+0x126/0x140
[ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
[ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
[ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
[ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
[ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
[ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
[ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
[ 6466.722165] FS: 00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[ 6466.733228] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
[ 6466.758366] Call Trace:
[ 6466.768118] <IRQ>
[ 6466.778214] rtl8169_poll+0x4f4/0x640
[ 6466.789198] net_rx_action+0x23d/0x370
[ 6466.798467] __do_softirq+0xed/0x229
[ 6466.807039] irq_exit+0xb7/0xc0
[ 6466.815471] xen_evtchn_do_upcall+0x27/0x40
[ 6466.826647] xen_do_hypervisor_callback+0x29/0x40
[ 6466.835902] </IRQ>
[ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
[ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
[ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
[ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
[ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
[ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
[ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
[ 6466.926647] ? xen_hypercall_mmu_update+0xa/0x20
[ 6466.938195] ? xen_set_pte_at+0x78/0xe0
[ 6466.947046] ? __handle_mm_fault+0xc43/0x1060
[ 6466.955772] ? do_mmap+0x44b/0x5b0
[ 6466.964410] ? handle_mm_fault+0xf8/0x200
[ 6466.973290] ? __do_page_fault+0x231/0x4a0
[ 6466.981973] ? page_fault+0x8/0x30
[ 6466.990904] ? page_fault+0x1e/0x30
[ 6466.999585] Modules linked in:
[ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
[ 6467.016751] RIP: e030:dql_completed+0x126/0x140
[ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
[ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
[ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
[ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
[ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
[ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
[ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
[ 6467.084916] FS: 00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[ 6467.093352] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
[ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
[ 6467.118166] Kernel Offset: disabled
(XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>
> [1] https://marc.info/?t=154965066400001&r=1&w=2
>
> Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
> ---
> drivers/net/ethernet/realtek/r8169.c | 19 ++++++++++---------
> 1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index bba806ce57d3..6e36b88ca7c9 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -6074,7 +6074,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
> struct device *d = tp_to_dev(tp);
> dma_addr_t mapping;
> u32 opts[2], len;
> - bool stop_queue;
> int frags;
>
> if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
> @@ -6116,6 +6115,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>
> txd->opts2 = cpu_to_le32(opts[1]);
>
> + netdev_sent_queue(dev, skb->len);
> +
> skb_tx_timestamp(skb);
>
> /* Force memory writes to complete before releasing descriptor */
> @@ -6128,16 +6129,16 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>
> tp->cur_tx += frags + 1;
>
> - stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
> - if (unlikely(stop_queue))
> - netif_stop_queue(dev);
> + RTL_W8(tp, TxPoll, NPQ);
>
> - if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
> - RTL_W8(tp, TxPoll, NPQ);
> - mmiowb();
> - }
> + mmiowb();
>
> - if (unlikely(stop_queue)) {
> + if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
> + /* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
> + * not miss a ring update when it notices a stopped queue.
> + */
> + smp_wmb();
> + netif_stop_queue(dev);
> /* Sync with rtl_tx:
> * - publish queue status and cur_tx ring index (write barrier)
> * - refresh dirty_tx ring index (read barrier).
>