[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3ad8a76d-5da1-eb62-689e-44ea0534907f@gmail.com>
Date: Sat, 15 Feb 2020 23:35:28 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: Vincas Dargis <vindrg@...il.com>,
Salvatore Bonaccorso <carnil@...ian.org>
Cc: netdev@...r.kernel.org
Subject: Re: About r8169 regression 5.4
On 15.02.2020 23:07, Vincas Dargis wrote:
> 2020-02-15 18:12, Salvatore Bonaccorso rašė:
>> You can generate the a7a92cf81589 revert patch, and then for simple
>> testing of a patch and build have a look at the Simple patching and
>> building[1] section of the kernel handbook.
>>
>> Hope this helps,
>>
>> Regards,
>> Salvatore
>>
>> [1] https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
>>
>
> Sadly, after running for an hour, I still got this:
>
> Feb 15 23:49:21 vinco kernel: [ 3670.779254] ------------[ cut here ]------------
> Feb 15 23:49:21 vinco kernel: [ 3670.779275] NETDEV WATCHDOG: enp5s0f1 (r8169): transmit queue 0 timed out
> Feb 15 23:49:21 vinco kernel: [ 3670.779299] WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x248/0x250
> Feb 15 23:49:21 vinco kernel: [ 3670.779300] Modules linked in: rfcomm(E) xt_recent(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_multiport(E) xt_conntrack(E) xt_hashlimit(E) xt_addrtype(E) xt_iface(OE) xt_mark(E) nft_chain_nat(E) xt_comment(E) xt_CT(E) xt_owner(E) xt_tcpudp(E) nft_compat(E) nft_counter(E) xt_NFLOG(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) nf_nat_tftp(E) nf_nat_snmp_basic(E) nf_conntrack_snmp(E) nf_nat_sip(E) nf_nat_pptp(E) nf_nat_irc(E) nf_nat_h323(E) nf_nat_ftp(E) nf_nat_amanda(E) ts_kmp(E) nf_conntrack_amanda(E) nf_nat(E) nf_conntrack_sane(E) nf_conntrack_tftp(E) nf_conntrack_sip(E) nf_conntrack_pptp(E) nf_conntrack_netlink(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nf_conntrack_irc(E) nf_conntrack_h323(E) nf_conntrack_ftp(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) vboxnetadp(OE) vboxnetflt(OE) xfrm_user(E) xfrm_algo(E) vboxdrv(OE) l2tp_ppp(E) l2tp_netlink(E) l2tp_core(E) ip6_udp_tunnel(E) udp_tunnel(E) pppox(E)
> ppp_generic(E) slhc(E) nfnetlink_log(E) bnep(E)
> Feb 15 23:49:21 vinco kernel: [ 3670.779353] nfnetlink(E) bbswitch(OE) intel_rapl_msr(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) ghash_clmulni_intel(E) binfmt_misc(E) btusb(E) btrtl(E) btbcm(E) btintel(E) nls_ascii(E) nls_cp437(E) bluetooth(E) snd_hda_codec_realtek(E) aesni_intel(E) uvcvideo(E) vfat(E) crypto_simd(E) videobuf2_vmalloc(E) fat(E) snd_hda_codec_generic(E) cryptd(E) videobuf2_memops(E) glue_helper(E) ledtrig_audio(E) videobuf2_v4l2(E) iwlmvm(E) intel_cstate(E) drbg(E) snd_hda_codec_hdmi(E) intel_uncore(E) videobuf2_common(E) mac80211(E) ansi_cprng(E) libarc4(E) videodev(E) efi_pstore(E) joydev(E) snd_hda_intel(E) mc(E) snd_intel_dspcfg(E) intel_rapl_perf(E) ecdh_generic(E) pcspkr(E) ecc(E) serio_raw(E) snd_hda_codec(E) asus_nb_wmi(E) iwlwifi(E) asus_wmi(E) snd_hda_core(E) efivars(E) sparse_keymap(E) snd_hwdep(E) sg(E) snd_pcm(E) cfg80211(E) snd_timer(E) iTCO_wdt(E)
> iTCO_vendor_support(E) snd(E) watchdog(E) rfkill(E)
> Feb 15 23:49:21 vinco kernel: [ 3670.779386] soundcore(E) ie31200_edac(E) evdev(E) asus_wireless(E) ac(E) parport_pc(E) ppdev(E) lp(E) parport(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) xor(E) zstd_decompress(E) zstd_compress(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) sr_mod(E) sd_mod(E) cdrom(E) hid_logitech_hidpp(E) hid_logitech_dj(E) hid_generic(E) usbhid(E) hid(E) i915(E) rtsx_pci_sdmmc(E) i2c_algo_bit(E) mmc_core(E) xhci_pci(E) drm_kms_helper(E) ehci_pci(E) ahci(E) lpc_ich(E) rtsx_pci(E) ehci_hcd(E) mfd_core(E) drm(E) libahci(E) xhci_hcd(E) crc32_pclmul(E) mxm_wmi(E) libata(E) crc32c_intel(E) r8169(E) realtek(E) psmouse(E) usbcore(E) i2c_i801(E) scsi_mod(E) libphy(E) usb_common(E) wmi(E) battery(E) video(E) button(E)
> Feb 15 23:49:21 vinco kernel: [ 3670.779418] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G OE 5.5.0-rc5-amd64 #1 Debian 5.5~rc5-1~exp1
> Feb 15 23:49:21 vinco kernel: [ 3670.779419] Hardware name: ASUSTeK COMPUTER INC. N551JM/N551JM, BIOS N551JM.205 02/13/2015
> Feb 15 23:49:21 vinco kernel: [ 3670.779422] RIP: 0010:dev_watchdog+0x248/0x250
> Feb 15 23:49:21 vinco kernel: [ 3670.779425] Code: 85 c0 75 e5 eb 9f 4c 89 ef c6 05 a8 8b a6 00 01 e8 1d cc fa ff 44 89 e1 4c 89 ee 48 c7 c7 68 ca 55 a9 48 89 c2 e8 1a 67 9e ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55
> Feb 15 23:49:21 vinco kernel: [ 3670.779426] RSP: 0018:ffffbf5dc01e0e68 EFLAGS: 00010286
> Feb 15 23:49:21 vinco kernel: [ 3670.779428] RAX: 0000000000000000 RBX: ffffa0e11c031400 RCX: 000000000000083f
> Feb 15 23:49:21 vinco kernel: [ 3670.779429] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
> Feb 15 23:49:21 vinco kernel: [ 3670.779430] RBP: ffffa0e11caee45c R08: 0000000000000471 R09: 0000000000000004
> Feb 15 23:49:21 vinco kernel: [ 3670.779431] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> Feb 15 23:49:21 vinco kernel: [ 3670.779432] R13: ffffa0e11caee000 R14: ffffa0e11caee480 R15: 0000000000000001
> Feb 15 23:49:21 vinco kernel: [ 3670.779433] FS: 0000000000000000(0000) GS:ffffa0e11ef80000(0000) knlGS:0000000000000000
> Feb 15 23:49:21 vinco kernel: [ 3670.779434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Feb 15 23:49:21 vinco kernel: [ 3670.779435] CR2: 000004c3f29aba30 CR3: 000000020e80a005 CR4: 00000000001626e0
> Feb 15 23:49:21 vinco kernel: [ 3670.779436] Call Trace:
> Feb 15 23:49:21 vinco kernel: [ 3670.779439] <IRQ>
> Feb 15 23:49:21 vinco kernel: [ 3670.779443] ? pfifo_fast_enqueue+0x150/0x150
> Feb 15 23:49:21 vinco kernel: [ 3670.779446] call_timer_fn+0x2d/0x130
> Feb 15 23:49:21 vinco kernel: [ 3670.779448] __run_timers.part.0+0x16f/0x260
> Feb 15 23:49:21 vinco kernel: [ 3670.779452] ? tick_sched_handle+0x22/0x60
> Feb 15 23:49:21 vinco kernel: [ 3670.779455] ? tick_sched_timer+0x38/0x80
> Feb 15 23:49:21 vinco kernel: [ 3670.779457] ? tick_sched_do_timer+0x60/0x60
> Feb 15 23:49:21 vinco kernel: [ 3670.779460] run_timer_softirq+0x26/0x50
> Feb 15 23:49:21 vinco kernel: [ 3670.779464] __do_softirq+0xe6/0x2e9
> Feb 15 23:49:21 vinco kernel: [ 3670.779469] irq_exit+0xa6/0xb0
> Feb 15 23:49:21 vinco kernel: [ 3670.779471] smp_apic_timer_interrupt+0x76/0x130
> Feb 15 23:49:21 vinco kernel: [ 3670.779474] apic_timer_interrupt+0xf/0x20
> Feb 15 23:49:21 vinco kernel: [ 3670.779475] </IRQ>
> Feb 15 23:49:21 vinco kernel: [ 3670.779479] RIP: 0010:cpuidle_enter_state+0xc9/0x3e0
> Feb 15 23:49:21 vinco kernel: [ 3670.779481] Code: e8 5c ad ab ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 ea 02 00 00 31 ff e8 9e dc b1 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 40 02 00 00 49 63 d5 4c 2b 64 24 10 48 8d 04 52 48
> Feb 15 23:49:21 vinco kernel: [ 3670.779482] RSP: 0018:ffffbf5dc00c7e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> Feb 15 23:49:21 vinco kernel: [ 3670.779483] RAX: ffffa0e11efacac0 RBX: ffffdf5dbfd9e0f0 RCX: 000000000000001f
> Feb 15 23:49:21 vinco kernel: [ 3670.779484] RDX: 0000000000000000 RSI: 0000000033518eeb RDI: 0000000000000000
> Feb 15 23:49:21 vinco kernel: [ 3670.779485] RBP: ffffffffa96bdaa0 R08: 00000356ab7df88b R09: 000000000002c3e0
> Feb 15 23:49:21 vinco kernel: [ 3670.779486] R10: 0000000000001592 R11: ffffa0e11efab9a4 R12: 00000356ab7df88b
> Feb 15 23:49:21 vinco kernel: [ 3670.779487] R13: 0000000000000005 R14: 0000000000000005 R15: ffffa0e11ca98000
> Feb 15 23:49:21 vinco kernel: [ 3670.779490] ? cpuidle_enter_state+0xa4/0x3e0
> Feb 15 23:49:21 vinco kernel: [ 3670.779493] cpuidle_enter+0x29/0x40
> Feb 15 23:49:21 vinco kernel: [ 3670.779496] do_idle+0x1e4/0x280
> Feb 15 23:49:21 vinco kernel: [ 3670.779499] cpu_startup_entry+0x19/0x20
> Feb 15 23:49:21 vinco kernel: [ 3670.779502] start_secondary+0x15f/0x1b0
> Feb 15 23:49:21 vinco kernel: [ 3670.779506] secondary_startup_64+0xa4/0xb0
> Feb 15 23:49:21 vinco kernel: [ 3670.779508] ---[ end trace a87faacfee854ba7 ]---
>
> Though what is strange that network does seems to be usable! I don't have to reboot to make browser and other application to continue working. Maybe other changes up to 5.5-rc5 helped?
In case of a tx timeout NIC and driver parts are reset, see rtl8169_tx_timeout().
Depending on the root cause this may often be sufficient to make it work again.
It's likely that the root cause for the timeout is in the driver, however we don't
know for sure yet. Reason could also be a net core regression. So still the best
would be a bisect.
5.4 has been out for more than two months now, and this report is the first one
I see. Therefore I'd assume that the issue affects special cases (e.g. specific
chip versions) only.
Helpful would be a full dmesg log of the boot.
And just to be on the safe side: You could try to disable EEE (ethtool --set-eee <if> eee off)
and see whether this helps.
Powered by blists - more mailing lists