Date:   Mon, 20 Jan 2020 13:26:41 -0800
From:   PGNet Dev <pgnet.dev@...il.com>
To:     netdev@...r.kernel.org
Subject: kernel 5.4.13 'NETDEV WATCHDOG' timeout errors -- is it kernel?
 driver? bios?

 xen-users@...ts.xenproject.org

I'm bringing a server, running Xen 4.13 + kernel 5.4.13-24.g5cf5394-default, back up after disk changes.

On boot, I'm seeing these 'NETDEV WATCHDOG' warnings, after which the network is unstable and keeps dropping:

	[   35.344678] ------------[ cut here ]------------
	[   35.344703] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
	[   35.344723] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x248/0x250
	[   35.344729] Modules linked in: af_packet br_netfilter bridge stp llc iscsi_ibft iscsi_boot_sysfs rfkill xen_pciback xen_netback xen_blkback xen_gntalloc dmi_sysfs xen_gntdev xen_evtchn nct6775 hwmon_vid msr sch_fq_codel intel_rapl_msr intel_rapl_common mei_wdt snd_hda_codec_hdmi nouveau raid10 mei_hdcp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mxm_wmi snd_hda_codec_realtek snd_hda_codec_generic wmi aesni_intel ledtrig_audio crypto_simd cryptd glue_helper ttm snd_hda_intel snd_intel_nhlt snd_hda_codec drm_kms_helper intel_pch_thermal snd_hda_core i2c_i801 drm snd_hwdep fb_sys_fops mei_me snd_pcm syscopyarea sysfillrect snd_timer sysimgblt snd mei soundcore ie31200_edac button xenfs xen_privcmd hid_generic usbhid raid1 md_mod firewire_ohci crc32c_intel igb firewire_core i2c_algo_bit dca crc_itu_t r8169 realtek libphy xhci_pci xhci_hcd ehci_pci ehci_hcd e1000e usbcore mvsas libsas scsi_transport_sas fan thermal video tcp_bbr sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
	[   35.344764]  n_hdlc slhc nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache efivarfs
	[   35.344770] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.13-24.g5cf5394-default #1 openSUSE Tumbleweed (unreleased)
	[   35.344770] Hardware name: Supermicro X10SAT/X10SAT, BIOS 3.0 05/26/2015
	[   35.344772] RIP: 0010:dev_watchdog+0x248/0x250
	[   35.344773] Code: 85 c0 75 e5 eb 9f 4c 89 ef c6 05 41 93 b0 00 01 e8 dd f0 fa ff 44 89 e1 4c 89 ee 48 c7 c7 48 4d 16 82 48 89 c2 e8 c6 4d 86 ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55
	[   35.344774] RSP: 0018:ffffc90000003e68 EFLAGS: 00010286
	[   35.344775] RAX: 0000000000000000 RBX: ffff88815caf7000 RCX: 000000000000083f
	[   35.344775] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
	[   35.344776] RBP: ffff88815c79c45c R08: ffff888164a19a18 R09: 0000000000000003
	[   35.344777] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
	[   35.344777] R13: ffff88815c79c000 R14: ffff88815c79c480 R15: 0000000000000001
	[   35.344778] FS:  0000000000000000(0000) GS:ffff888164a00000(0000) knlGS:0000000000000000
	[   35.344779] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	[   35.344779] CR2: 00007ff3eaeb0090 CR3: 00000001632e6003 CR4: 00000000001606b0
	[   35.344782] Call Trace:
	[   35.344788]  <IRQ>
	[   35.344791]  ? pfifo_fast_enqueue+0x150/0x150
	[   35.344793]  call_timer_fn+0x2d/0x130
	[   35.344795]  __run_timers.part.0+0x185/0x280
	[   35.344797]  ? pfifo_fast_enqueue+0x150/0x150
	[   35.344800]  ? handle_irq_event_percpu+0x72/0x80
	[   35.344805]  run_timer_softirq+0x26/0x50
	[   35.344807]  __do_softirq+0x118/0x33b
	[   35.344810]  irq_exit+0xb9/0xc0
	[   35.344814]  xen_evtchn_do_upcall+0x2c/0x40
	[   35.344819]  xen_hvm_callback_vector+0xf/0x20
	[   35.344820]  </IRQ>
	[   35.344821] RIP: 0010:native_safe_halt+0xe/0x10
	[   35.344822] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d e6 02 4a 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d d6 02 4a 00 fb f4 <c3> 90 0f 1f 44 00 00 41 54 55 53 e8 12 67 7b ff 65 8b 2d 6b 75 6a
	[   35.344823] RSP: 0018:ffffffff82203e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
	[   35.344824] RAX: 0000000080000000 RBX: 0000000000000000 RCX: 0000000000000001
	[   35.344825] RDX: 0000000000000001 RSI: 0000000000000083 RDI: 0000000000000000
	[   35.344825] RBP: 0000000000000000 R08: 00000015956e8dad R09: 0000000000000000
	[   35.344826] R10: 0000000000000000 R11: 0000000000000018 R12: ffffffff82214780
	[   35.344826] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82214780
	[   35.344828]  default_idle+0x1f/0x140
	[   35.344831]  do_idle+0x1ff/0x280
	[   35.344832]  cpu_startup_entry+0x19/0x20
	[   35.344835]  start_kernel+0x4f2/0x511
	[   35.344838]  secondary_startup_64+0xb6/0xc0
	[   35.344839] ---[ end trace 7edaffa8e97068ae ]---
	[   35.344861] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[   40.096402] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[   45.788752] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[   50.726862] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[   55.783348] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[   64.581820] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[   74.730696] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[   79.628577] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[   84.704205] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[   89.701827] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[   99.551935] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  104.369653] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  109.802904] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  114.600968] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  119.775784] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  124.752690] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  134.622908] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  139.441154] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  144.615671] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  149.612495] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  159.722815] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  164.561039] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
	[  169.696432] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
	[  174.753536] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
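To put numbers on the reset/link-up cycle in the log above, here is a minimal Python sketch that pairs each "Reset adapter unexpectedly" line with the next "NIC Link is Up" line and reports the gap in seconds. The function name `link_flap_cycles` and the sample constant `LOG` are made up for illustration; the sample lines are copied from the log above, and the parsing assumes the usual `[ seconds.micros ]` dmesg timestamp prefix.

```python
import re

# Sample lines copied from the log above (timestamps are seconds since boot).
LOG = """\
[   35.344861] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[   40.096402] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   45.788752] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[   50.726862] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   55.783348] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[   64.581820] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
"""

# Matches the "[   35.344861] message" dmesg layout.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def link_flap_cycles(text):
    """Return a list of (reset_time, seconds_until_link_up) pairs."""
    cycles = []
    pending_reset = None
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        message = match.group(2)
        if "Reset adapter unexpectedly" in message:
            pending_reset = timestamp
        elif "NIC Link is Up" in message and pending_reset is not None:
            cycles.append((pending_reset, round(timestamp - pending_reset, 3)))
            pending_reset = None
    return cycles


if __name__ == "__main__":
    for reset_time, gap in link_flap_cycles(LOG):
        print(f"reset at {reset_time:.3f}s, link up {gap:.3f}s later")
```

Feeding it the full `dmesg` output instead of the three sample cycles would show whether the recovery time stays constant or drifts across the boot.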


Here is the current register dump:

	ethtool -d eno1
		MAC Registers
		-------------
		0x00000: CTRL (Device control register)  0x40180240
		      Endian mode (buffers):             little
		      Link reset:                        normal
		      Set link up:                       1
		      Invert Loss-Of-Signal:             no
		      Receive flow control:              disabled
		      Transmit flow control:             disabled
		      VLAN mode:                         enabled
		      Auto speed detect:                 disabled
		      Speed select:                      1000Mb/s
		      Force speed:                       no
		      Force duplex:                      no
		0x00008: STATUS (Device status register) 0x00080083
		      Duplex:                            full
		      Link up:                           link config
		      TBI mode:                          disabled
		      Link speed:                        1000Mb/s
		      Bus type:                          PCI
		      Bus speed:                         33MHz
		      Bus width:                         32-bit
		0x00100: RCTL (Receive control register) 0x04008000
		      Receiver:                          disabled
		      Store bad packets:                 disabled
		      Unicast promiscuous:               disabled
		      Multicast promiscuous:             disabled
		      Long packet:                       disabled
		      Descriptor minimum threshold size: 1/2
		      Broadcast accept mode:             accept
		      VLAN filter:                       disabled
		      Canonical form indicator:          disabled
		      Discard pause frames:              filtered
		      Pass MAC control frames:           don't pass
		      Receive buffer size:               2048
		0x02808: RDLEN (Receive desc length)     0x00001000
		0x02810: RDH   (Receive desc head)       0x00000001
		0x02818: RDT   (Receive desc tail)       0x000000F0
		0x02820: RDTR  (Receive delay timer)     0x00000000
		0x00400: TCTL (Transmit ctrl register)   0x3103F0F8
		      Transmitter:                       disabled
		      Pad short packets:                 enabled
		      Software XOFF Transmission:        disabled
		      Re-transmit on late collision:     enabled
		0x03808: TDLEN (Transmit desc length)    0x00001000
		0x03810: TDH   (Transmit desc head)      0x00000001
		0x03818: TDT   (Transmit desc tail)      0x00000001
		0x03820: TIDV  (Transmit delay timer)    0x00000008
		PHY type:                                unknown
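As a cross-check on ethtool's decoding, the raw register values above can be tested against the relevant bits directly. This is only a sketch: the mask values are what I believe the e1000e driver headers define (full duplex = bit 0, link up = bit 1, speed in bits 7:6 of STATUS; enable = bit 1 of RCTL and TCTL), so treat them as assumptions and verify against the driver source or the Intel datasheet. The function name `decode` and the speed-mask constant name are made up for illustration.

```python
# Assumed bit masks (believed to match the e1000e driver headers; verify before relying on them).
STATUS_FD = 0x00000001          # full duplex
STATUS_LU = 0x00000002          # link up
STATUS_SPEED_MASK = 0x000000C0  # hypothetical name for the two speed bits
STATUS_SPEED = {0x00: "10Mb/s", 0x40: "100Mb/s", 0x80: "1000Mb/s", 0xC0: "1000Mb/s"}
RCTL_EN = 0x00000002            # receiver enable
TCTL_EN = 0x00000002            # transmitter enable

# Raw values copied from the ethtool -d output above.
REGS = {
    "STATUS": 0x00080083,
    "RCTL": 0x04008000,
    "TCTL": 0x3103F0F8,
    "TDH": 0x00000001,
    "TDT": 0x00000001,
}


def decode(regs):
    """Summarise link and rx/tx state from raw register values."""
    status = regs["STATUS"]
    return {
        "full_duplex": bool(status & STATUS_FD),
        "link_up": bool(status & STATUS_LU),
        "speed": STATUS_SPEED[status & STATUS_SPEED_MASK],
        "rx_enabled": bool(regs["RCTL"] & RCTL_EN),
        "tx_enabled": bool(regs["TCTL"] & TCTL_EN),
        # Head == tail means the hardware has no descriptors left to send.
        "tx_ring_empty": regs["TDH"] == regs["TDT"],
    }


if __name__ == "__main__":
    for key, value in decode(REGS).items():
        print(f"{key}: {value}")
```

Under those assumptions, the result agrees with the dump: link up at 1000Mb/s full duplex, with both receiver and transmitter disabled and an empty transmit ring at the moment the registers were read.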

Is this a kernel, Xen, e1000e driver, and/or BIOS issue?

Is there any known fix or workaround, or an existing discussion, that someone can point me to?
