Message-ID: <SN6PR11MB35188DA27C2DAF0FEE874A65EB7E9@SN6PR11MB3518.namprd11.prod.outlook.com>
Date: Tue, 6 Sep 2022 19:01:04 +0000
From: "Switzer, David" <david.switzer@...el.com>
To: Martin Zaharinov <micron10@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
"Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>,
"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
"David S . Miller" <davem@...emloft.net>,
"Jakub Kicinski" <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>,
netdev <netdev@...r.kernel.org>,
"Dubel, Helena Anna" <helena.anna.dubel@...el.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
Subject: RE: [Intel-wired-lan] Urgent Kernel Bug NETDEV WATCHDOG ixgbe
transmit queue 2 timed out after kernel 5.19.2 to 5.19.6
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf Of
>Martin Zaharinov
>Sent: Saturday, September 3, 2022 2:54 AM
>To: Eric Dumazet <eric.dumazet@...il.com>; Eric Dumazet
><edumazet@...gle.com>; Fijalkowski, Maciej
><maciej.fijalkowski@...el.com>; Brandeburg, Jesse
><jesse.brandeburg@...el.com>; David S . Miller <davem@...emloft.net>;
>Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>;
>Nguyen, Anthony L <anthony.l.nguyen@...el.com>; netdev
><netdev@...r.kernel.org>; Dubel, Helena Anna
><helena.anna.dubel@...el.com>; intel-wired-lan@...ts.osuosl.org
>Subject: [Intel-wired-lan] Urgent Kernel Bug NETDEV WATCHDOG ixgbe
>transmit queue 2 timed out after kernel 5.19.2 to 5.19.6
>
>Hi All
Hello Martin!
I'm Dave, a driver validation engineer at Intel Corp.
>
>
>After moving to release 5.19.x (5.19.2 up to 5.19.6) I started getting this bug report, and the
>machine reboots automatically after that.
I'm sorry you're having this issue. I'm working on reproducing it so that our developers can look into it.
I will reach out to you if I have any questions.
Have a great day!
Dave
>
>With kernel 5.18 this problem does not happen.
>
>The machine runs with 2x 10G Intel 82599 card in bonding.
>It's a simple router with a 6-core CPU.
>
>Sep 3 10:05:39 [193378.949952][ C10] ------------[ cut here ]------------
>Sep 3 10:05:39 [193378.949965][ C10] NETDEV WATCHDOG: eth1 (ixgbe):
>transmit queue 2 timed out
>Sep 3 10:05:39 [193378.949980][ C10] WARNING: CPU: 10 PID: 0 at
>net/sched/sch_generic.c:529 dev_watchdog+0x167/0x170
>Sep 3 10:05:39 [193378.949992][ C10] Modules linked in:
>nf_conntrack_netlink nft_limit pppoe pppox ppp_generic slhc nft_nat
>nft_chain_nat nf_tables team_mode_loadbalance team netconsole coretemp
>ixgbe mdio_devres libphy mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp
>nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp
>nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
>acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos
>Sep 3 10:05:39 [193378.950023][ C10] CPU: 10 PID: 0 Comm: swapper/10
>Tainted: G O 5.19.4 #1
>Sep 3 10:05:39 [193378.950028][ C10] Hardware name: Supermicro Super
>Server/X10SRD-F, BIOS 3.3 10/28/2020
>Sep 3 10:05:39 [193378.950032][ C10] RIP: 0010:dev_watchdog+0x167/0x170
>Sep 3 10:05:39 [193378.950037][ C10] Code: 28 e9 77 ff ff ff 48 89 df c6 05 95
>3d c4 00 01 e8 9e 5a fb ff 48 89 c2 44 89 e1 48 89 de 48 c7 c7 f0 d0 ec a7 e8 c2 c2
>13 00 <0f> 0b eb 85 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 47 50 4c 8b 28
>Sep 3 10:05:39 [193378.950043][ C10] RSP: 0018:ffff96320030cee8 EFLAGS:
>00010292
>Sep 3 10:05:39 [193378.950048][ C10] RAX: 0000000000000039 RBX:
>ffff898a4da00000 RCX: 0000000000000001
>Sep 3 10:05:39 [193378.950053][ C10] RDX: 00000000ffffffea RSI:
>00000000fffbffff RDI: 00000000fffbffff
>Sep 3 10:05:39 [193378.950057][ C10] RBP: ffff898a4da003c0 R08:
>0000000000000001 R09: 00000000fffbffff
>Sep 3 10:05:39 [193378.950061][ C10] R10: ffff89919d600000 R11:
>0000000000000003 R12: 0000000000000002
>Sep 3 10:05:39 [193378.950065][ C10] R13: 0000000000000000 R14:
>ffff89919fca07a8 R15: 0000000000000082
>Sep 3 10:05:39 [193378.950070][ C10] FS: 0000000000000000(0000)
>GS:ffff89919fc80000(0000) knlGS:0000000000000000
>Sep 3 10:05:39 [193378.950074][ C10] CS: 0010 DS: 0000 ES: 0000 CR0:
>0000000080050033
>Sep 3 10:05:39 [193378.950078][ C10] CR2: 00007fb39f41d000 CR3:
>00000001002fd003 CR4: 00000000003706e0
>Sep 3 10:05:39 [193378.950082][ C10] DR0: 0000000000000000 DR1:
>0000000000000000 DR2: 0000000000000000
>Sep 3 10:05:39 [193378.950086][ C10] DR3: 0000000000000000 DR6:
>00000000fffe0ff0 DR7: 0000000000000400
>Sep 3 10:05:39 [193378.950090][ C10] Call Trace:
>Sep 3 10:05:39 [193378.950094][ C10] <IRQ>
>Sep 3 10:05:39 [193378.950098][ C10] ? pfifo_fast_destroy+0x30/0x30
>Sep 3 10:05:39 [193378.950104][ C10] call_timer_fn.constprop.0+0x14/0x70
>Sep 3 10:05:39 [193378.950110][ C10] __run_timers.part.0+0x164/0x190
>Sep 3 10:05:39 [193378.950116][ C10] ?
>__hrtimer_run_queues+0x143/0x1a0
>Sep 3 10:05:39 [193378.950120][ C10] ? ktime_get+0x30/0x90
>Sep 3 10:05:39 [193378.950125][ C10] run_timer_softirq+0x21/0x50
>Sep 3 10:05:39 [193378.950130][ C10] __do_softirq+0xaf/0x1d7
>Sep 3 10:05:39 [193378.950136][ C10] __irq_exit_rcu+0x9a/0xd0
>Sep 3 10:05:39 [193378.950142][ C10]
>sysvec_apic_timer_interrupt+0x66/0x80
>Sep 3 10:05:39 [193378.950149][ C10] </IRQ>
>Sep 3 10:05:39 [193378.950152][ C10] <TASK>
>Sep 3 10:05:39 [193378.950155][ C10]
>asm_sysvec_apic_timer_interrupt+0x16/0x20
>Sep 3 10:05:39 [193378.950160][ C10] RIP:
>0010:cpuidle_enter_state+0xb3/0x290
>Sep 3 10:05:39 [193378.950167][ C10] Code: e8 d2 0d b0 ff 31 ff 49 89 c5 e8 48
>68 af ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 cf 01 00 00 31 ff e8 81 b4 b3 ff fb 45 85
>f6 <0f> 88 d0 00 00 00 49 63 ce 48 6b f1 68 48 8b 04 24 4c 89 ea 48 29
>Sep 3 10:05:39 [193378.950402][ C10] RSP: 0018:ffff96320014fe98 EFLAGS:
>00000202
>Sep 3 10:05:39 [193378.950411][ C10] RAX: ffff89919fca6800 RBX:
>ffff898a4206c800 RCX: 000000000000001f
>Sep 3 10:05:39 [193378.950418][ C10] RDX: 0000afe08b9e69de RSI:
>00000000238e3b7a RDI: 0000000000000000
>Sep 3 10:05:39 [193378.950424][ C10] RBP: 0000000000000001 R08:
>0000000000000002 R09: ffff89919fca5704
>Sep 3 10:05:39 [193378.950430][ C10] R10: 0000000000000008 R11:
>000000000000010b R12: ffffffffa8222f40
>Sep 3 10:05:39 [193378.950436][ C10] R13: 0000afe08b9e69de R14:
>0000000000000001 R15: 0000000000000000
>Sep 3 10:05:39 [193378.950443][ C10] ? cpuidle_enter_state+0x98/0x290
>Sep 3 10:05:39 [193378.950451][ C10] cpuidle_enter+0x24/0x40
>Sep 3 10:05:39 [193378.950459][ C10] cpuidle_idle_call+0xbb/0x100
>Sep 3 10:05:39 [193378.950468][ C10] do_idle+0x76/0xc0
>Sep 3 10:05:39 [193378.950476][ C10] cpu_startup_entry+0x14/0x20
>Sep 3 10:05:39 [193378.950483][ C10] start_secondary+0xd6/0xe0
>Sep 3 10:05:39 [193378.950491][ C10]
>secondary_startup_64_no_verify+0xd3/0xdb
>Sep 3 10:05:39 [193378.950499][ C10] </TASK>
>Sep 3 10:05:39 [193378.950504][ C10] ---[ end trace 0000000000000000 ]---
>Sep 3 10:05:39 [193378.950513][ C10] ixgbe 0000:02:00.1 eth1: initiating reset
>due to tx timeout
>Sep 3 10:05:39 [193378.950525][T1766094] ixgbe 0000:02:00.1 eth1: Reset
>adapter
>Sep 3 10:10:02 [ 30.021823][ T454] ixgbe 0000:02:00.1 eth1: NIC Link is Up 10
>Gbps, Flow Control: None
>
>
>
>_______________________________________________
>Intel-wired-lan mailing list
>Intel-wired-lan@...osl.org
>https://lists.osuosl.org/mailman/listinfo/intel-wired-lan