Date:   Tue, 21 Jan 2020 22:16:27 -0800
From:   PGNet Dev <pgnet.dev@...il.com>
To:     netdev@...r.kernel.org
Subject: Re: kernel 5.4.13 'NETDEV WATCHDOG' timeout errors -- is it kernel? driver? bios?

Any suggestions on this one?


On 1/20/20 1:26 PM, PGNet Dev wrote:
>   xen-users@...ts.xenproject.org
> 
> I'm bringing a server, running Xen 4.13 + kernel 5.4.13-24.g5cf5394-default, back up after disk changes.
> 
> On boot, I'm seeing these 'NETDEV WATCHDOG' warnings, after which the network is unstable and keeps dropping:
> 
> 	[   35.344678] ------------[ cut here ]------------
> 	[   35.344703] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
> 	[   35.344723] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x248/0x250
> 	[   35.344729] Modules linked in: af_packet br_netfilter bridge stp llc iscsi_ibft iscsi_boot_sysfs rfkill xen_pciback xen_netback xen_blkback xen_gntalloc dmi_sysfs xen_gntdev xen_evtchn nct6775 hwmon_vid msr sch_fq_codel intel_rapl_msr intel_rapl_common mei_wdt snd_hda_codec_hdmi nouveau raid10 mei_hdcp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mxm_wmi snd_hda_codec_realtek snd_hda_codec_generic wmi aesni_intel ledtrig_audio crypto_simd cryptd glue_helper ttm snd_hda_intel snd_intel_nhlt snd_hda_codec drm_kms_helper intel_pch_thermal snd_hda_core i2c_i801 drm snd_hwdep fb_sys_fops mei_me snd_pcm syscopyarea sysfillrect snd_timer sysimgblt snd mei soundcore ie31200_edac button xenfs xen_privcmd hid_generic usbhid raid1 md_mod firewire_ohci crc32c_intel igb firewire_core i2c_algo_bit dca crc_itu_t r8169 realtek libphy xhci_pci xhci_hcd ehci_pci ehci_hcd e1000e usbcore mvsas libsas scsi_transport_sas fan thermal video tcp_bbr sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
> 	[   35.344764]  n_hdlc slhc nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache efivarfs
> 	[   35.344770] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.13-24.g5cf5394-default #1 openSUSE Tumbleweed (unreleased)
> 	[   35.344770] Hardware name: Supermicro X10SAT/X10SAT, BIOS 3.0 05/26/2015
> 	[   35.344772] RIP: 0010:dev_watchdog+0x248/0x250
> 	[   35.344773] Code: 85 c0 75 e5 eb 9f 4c 89 ef c6 05 41 93 b0 00 01 e8 dd f0 fa ff 44 89 e1 4c 89 ee 48 c7 c7 48 4d 16 82 48 89 c2 e8 c6 4d 86 ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55
> 	[   35.344774] RSP: 0018:ffffc90000003e68 EFLAGS: 00010286
> 	[   35.344775] RAX: 0000000000000000 RBX: ffff88815caf7000 RCX: 000000000000083f
> 	[   35.344775] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
> 	[   35.344776] RBP: ffff88815c79c45c R08: ffff888164a19a18 R09: 0000000000000003
> 	[   35.344777] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> 	[   35.344777] R13: ffff88815c79c000 R14: ffff88815c79c480 R15: 0000000000000001
> 	[   35.344778] FS:  0000000000000000(0000) GS:ffff888164a00000(0000) knlGS:0000000000000000
> 	[   35.344779] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 	[   35.344779] CR2: 00007ff3eaeb0090 CR3: 00000001632e6003 CR4: 00000000001606b0
> 	[   35.344782] Call Trace:
> 	[   35.344788]  <IRQ>
> 	[   35.344791]  ? pfifo_fast_enqueue+0x150/0x150
> 	[   35.344793]  call_timer_fn+0x2d/0x130
> 	[   35.344795]  __run_timers.part.0+0x185/0x280
> 	[   35.344797]  ? pfifo_fast_enqueue+0x150/0x150
> 	[   35.344800]  ? handle_irq_event_percpu+0x72/0x80
> 	[   35.344805]  run_timer_softirq+0x26/0x50
> 	[   35.344807]  __do_softirq+0x118/0x33b
> 	[   35.344810]  irq_exit+0xb9/0xc0
> 	[   35.344814]  xen_evtchn_do_upcall+0x2c/0x40
> 	[   35.344819]  xen_hvm_callback_vector+0xf/0x20
> 	[   35.344820]  </IRQ>
> 	[   35.344821] RIP: 0010:native_safe_halt+0xe/0x10
> 	[   35.344822] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d e6 02 4a 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d d6 02 4a 00 fb f4 <c3> 90 0f 1f 44 00 00 41 54 55 53 e8 12 67 7b ff 65 8b 2d 6b 75 6a
> 	[   35.344823] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
> 	[   35.344824] RAX: 0000000080000000 RBX: 0000000000000000 RCX: 0000000000000001
> 	[   35.344825] RDX: 0000000000000001 RSI: 0000000000000083 RDI: 0000000000000000
> 	[   35.344825] RBP: 0000000000000000 R08: 00000015956e8dad R09: 0000000000000000
> 	[   35.344826] R10: 0000000000000000 R11: 0000000000000018 R12: ffffffff82214780
> 	[   35.344826] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82214780
> 	[   35.344828]  default_idle+0x1f/0x140
> 	[   35.344831]  do_idle+0x1ff/0x280
> 	[   35.344832]  cpu_startup_entry+0x19/0x20
> 	[   35.344835]  start_kernel+0x4f2/0x511
> 	[   35.344838]  secondary_startup_64+0xb6/0xc0
> 	[   35.344839] ---[ end trace 7edaffa8e97068ae ]---
> 	[   35.344861] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[   40.096402] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[   45.788752] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[   50.726862] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[   55.783348] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[   64.581820] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[   74.730696] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[   79.628577] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[   84.704205] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[   89.701827] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[   99.551935] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  104.369653] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  109.802904] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  114.600968] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  119.775784] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  124.752690] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  134.622908] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  139.441154] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  144.615671] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  149.612495] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  159.722815] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  164.561039] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 	[  169.696432] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
> 	[  174.753536] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> 
> 
> Here is the current register dump:
> 
> 	ethtool -d eno1
> 		MAC Registers
> 		-------------
> 		0x00000: CTRL (Device control register)  0x40180240
> 		      Endian mode (buffers):             little
> 		      Link reset:                        normal
> 		      Set link up:                       1
> 		      Invert Loss-Of-Signal:             no
> 		      Receive flow control:              disabled
> 		      Transmit flow control:             disabled
> 		      VLAN mode:                         enabled
> 		      Auto speed detect:                 disabled
> 		      Speed select:                      1000Mb/s
> 		      Force speed:                       no
> 		      Force duplex:                      no
> 		0x00008: STATUS (Device status register) 0x00080083
> 		      Duplex:                            full
> 		      Link up:                           link config
> 		      TBI mode:                          disabled
> 		      Link speed:                        1000Mb/s
> 		      Bus type:                          PCI
> 		      Bus speed:                         33MHz
> 		      Bus width:                         32-bit
> 		0x00100: RCTL (Receive control register) 0x04008000
> 		      Receiver:                          disabled
> 		      Store bad packets:                 disabled
> 		      Unicast promiscuous:               disabled
> 		      Multicast promiscuous:             disabled
> 		      Long packet:                       disabled
> 		      Descriptor minimum threshold size: 1/2
> 		      Broadcast accept mode:             accept
> 		      VLAN filter:                       disabled
> 		      Canonical form indicator:          disabled
> 		      Discard pause frames:              filtered
> 		      Pass MAC control frames:           don't pass
> 		      Receive buffer size:               2048
> 		0x02808: RDLEN (Receive desc length)     0x00001000
> 		0x02810: RDH   (Receive desc head)       0x00000001
> 		0x02818: RDT   (Receive desc tail)       0x000000F0
> 		0x02820: RDTR  (Receive delay timer)     0x00000000
> 		0x00400: TCTL (Transmit ctrl register)   0x3103F0F8
> 		      Transmitter:                       disabled
> 		      Pad short packets:                 enabled
> 		      Software XOFF Transmission:        disabled
> 		      Re-transmit on late collision:     enabled
> 		0x03808: TDLEN (Transmit desc length)    0x00001000
> 		0x03810: TDH   (Transmit desc head)      0x00000001
> 		0x03818: TDT   (Transmit desc tail)      0x00000001
> 		0x03820: TIDV  (Transmit delay timer)    0x00000008
> 		PHY type:                                unknown
> 
> Is this a kernel, Xen, e1000e driver, and/or BIOS issue?
> 
> Is there any known fix, workaround, or discussion you can point me to?
> 
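
To put numbers on the reset/link-up cycle in the log above, here is a minimal Python sketch. It assumes dmesg lines in exactly the two e1000e formats quoted above; the function names flap_events and reset_to_linkup_delays are hypothetical helpers made up for illustration, not part of any existing tool.

```python
import re

# Assumed line formats, copied from the dmesg excerpt quoted above:
#   [   35.344861] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
#   [   40.096402] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, ...
RESET_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] e1000e \S+ (\S+): Reset adapter unexpectedly"
)
LINKUP_RE = re.compile(r"\[\s*(\d+\.\d+)\] e1000e: (\S+) NIC Link is Up")


def flap_events(lines, ifname="eno1"):
    """Return [(timestamp_seconds, "reset" | "up"), ...] for one interface."""
    events = []
    for line in lines:
        for regex, kind in ((RESET_RE, "reset"), (LINKUP_RE, "up")):
            m = regex.search(line)
            if m and m.group(2) == ifname:
                events.append((float(m.group(1)), kind))
    return events


def reset_to_linkup_delays(events):
    """Seconds from each reset to the first link-up that follows it."""
    delays = []
    last_reset = None
    for ts, kind in events:
        if kind == "reset":
            last_reset = ts
        elif kind == "up" and last_reset is not None:
            delays.append(ts - last_reset)
            last_reset = None
    return delays
```

Saving the dmesg output to a file and passing the open file object to flap_events (any iterable of lines works) gives the event list; reset_to_linkup_delays then shows how long each recovery takes, which may help when comparing kernels or driver settings.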
