lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4765029.31r3eYUQgx@saruman>
Date:   Fri, 05 Aug 2022 12:25:05 +0100
From:   James Hogan <jhogan@...nel.org>
To:     Vinicius Costa Gomes <vinicius.gomes@...el.com>,
        intel-wired-lan@...ts.osuosl.org,
        Sasha Neftin <sasha.neftin@...el.com>,
        Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Cc:     Paul Menzel <pmenzel@...gen.mpg.de>,
        Tony Nguyen <anthony.l.nguyen@...el.com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>,
        netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] I225-V (igc driver) hangs after resume in igc_resume/igc_tsn_reset

On Thursday, 4 August 2022 23:07:34 BST James Hogan wrote:
> And I just found this patch from December which may have been masked by the
> PTM issues:
> https://lore.kernel.org/netdev/20211201185731.236130-1-vinicius.gomes@intel.
> com/
> 
> I'll build and run with that for a few days and see how it goes.

I gave it a good hammering yesterday evening with suspend/resume cycles, and
it didn't lock up, however it did still fail to bring the network up a couple
of times, requiring me to unload and reload the driver.

The only kernel log splats I saw were an assert that RTNL mutex wasn't taken
in the igc_runtime_resume path, and a suspicious RCU usage warning, both
pasted below.

I'll keep running with that patch and lockdep enabled (based on
5.18.16-arch1-1) and report back any further issues.

Cheers
James

------------[ cut here ]------------
RTNL: assertion failed at net/core/dev.c (2886)
WARNING: CPU: 0 PID: 7752 at net/core/dev.c:2886 netif_set_real_num_tx_queues+0x1f0/0x210
Modules linked in: rfcomm intel_rapl_msr ee1004 spi_nor iTCO_wdt intel_pmc_bxt mtd iTCO_vendor_support mei_pxp mei_hdcp cmac algif_hash algif_skcipher af_alg bnep pmt_telemetry pmt_class wmi_bmof mxm_wmi intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass rapl intel_cstate intel_uncore pcspkr snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_hda_codec_realtek snd_sof_utils snd_soc_hdac_hda snd_hda_codec_generic snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus ledtrig_audio uvcvideo snd_usb_audio snd_soc_core videobuf2_vmalloc videobuf2_memops snd_usbmidi_lib videobuf2_v4l2 snd_compress i2c_i801 snd_rawmidi spi_intel_pci ac97_bus igc(-) spi_intel videobuf2_common snd_pcm_dmaengine i2c_smbus snd_seq_device mei_me snd_hda_codec_hdmi mei cdc_acm videodev snd_hda_intel snd_intel_dspcfg
 snd_intel_sdw_acpi mc amdgpu mousedev i915 snd_hda_codec snd_hda_core btusb snd_hwdep btrtl snd_pcm btbcm gpu_sched drm_buddy joydev btintel snd_timer drm_ttm_helper btmtk snd ttm intel_vsec drm_dp_helper soundcore intel_gtt serial_multi_instantiate wmi video bluetooth ecdh_generic acpi_tad rfkill coretemp acpi_pad nls_iso8859_1 vfat fat mac_hid ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter i2c_dev sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_microsoft ff_memless dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core dm_mod usbhid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd nvme cryptd sr_mod xhci_pci nvme_core cdrom xhci_pci_renesas
CPU: 0 PID: 7752 Comm: kworker/0:1 Not tainted 5.18.16-arch1-1 #3 2927cbed739f932be66f137e6808a2714da26c25
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.40 05/17/2022
Workqueue: pm pm_runtime_work
RIP: 0010:netif_set_real_num_tx_queues+0x1f0/0x210
Code: f8 f7 5f 01 00 0f 85 90 fe ff ff ba 46 0b 00 00 48 c7 c6 6e 6e 6f 82 48 c7 c7 70 a5 72 82 c6 05 d8 f7 5f 01 01 e8 1f d6 25 00 <0f> 0b e9 6a fe ff ff b8 ea ff ff ff e9 46 fe ff ff 66 66 2e 0f 1f
RSP: 0018:ffffa25e823dbc98 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff95a5668fa000 RCX: 0000000000000027
RDX: ffff95accfc21a28 RSI: 0000000000000001 RDI: ffff95accfc21a20
RBP: 0000000000000004 R08: 0000000000000000 R09: ffffa25e823dbaa0
R10: 0000000000000003 R11: ffff95acf07ac2e8 R12: 0000000000000001
R13: 0000000000000004 R14: ffff95a5668fa000 R15: ffff95a5691bc1e8
FS:  0000000000000000(0000) GS:ffff95accfc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feec86e5c90 CR3: 0000000252fdc001 CR4: 0000000000f70ef0
PKRU: 55555554
Call Trace:
 <TASK>
 __igc_open+0x40a/0x660 [igc 73e11f9f5110389b26a5a274cff80c9ddf9bab7a]
 __igc_resume+0x133/0x240 [igc 73e11f9f5110389b26a5a274cff80c9ddf9bab7a]
 ? pci_pme_active+0xa5/0x1a0
 pci_pm_runtime_resume+0xab/0xd0
 ? pci_pm_freeze_noirq+0xf0/0xf0
 __rpm_callback+0x41/0x170
 rpm_callback+0x35/0x70
 ? pci_pm_freeze_noirq+0xf0/0xf0
 rpm_resume+0x5ee/0x820
 pm_runtime_work+0x7c/0xb0
 process_one_work+0x276/0x570
 worker_thread+0x53/0x390
 ? _raw_spin_unlock_irqrestore+0x34/0x60
 ? process_one_work+0x570/0x570
 kthread+0xdb/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>
irq event stamp: 128847
hardirqs last  enabled at (128853): [<ffffffff81132dde>] __up_console_sem+0x5e/0x70
hardirqs last disabled at (128858): [<ffffffff81132dc3>] __up_console_sem+0x43/0x70
softirqs last  enabled at (125404): [<ffffffff810a5533>] __irq_exit_rcu+0xa3/0xd0
softirqs last disabled at (125395): [<ffffffff810a5533>] __irq_exit_rcu+0xa3/0xd0
---[ end trace 0000000000000000 ]---

=============================
WARNING: suspicious RCU usage
5.18.16-arch1-1 #3 Tainted: G        W   
-----------------------------
net/sched/sch_generic.c:1389 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1 
2 locks held by kworker/0:1/7752:
 #0: ffff95a540ba8b38 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x1f5/0x570
 #1: ffffa25e823dbe78 ((work_completion)(&dev->power.work)){+.+.}-{0:0}, at: process_one_work+0x1f5/0x570

stack backtrace:
CPU: 0 PID: 7752 Comm: kworker/0:1 Tainted: G        W         5.18.16-arch1-1 #3 2927cbed739f932be66f137e6808a2714da26c25
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.40 05/17/2022
Workqueue: pm pm_runtime_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x5f/0x7b
 dev_qdisc_change_real_num_tx+0x68/0x80
 netif_set_real_num_tx_queues+0x8d/0x210
 __igc_open+0x40a/0x660 [igc 73e11f9f5110389b26a5a274cff80c9ddf9bab7a]
 __igc_resume+0x133/0x240 [igc 73e11f9f5110389b26a5a274cff80c9ddf9bab7a]
 ? pci_pme_active+0xa5/0x1a0
 pci_pm_runtime_resume+0xab/0xd0
 ? pci_pm_freeze_noirq+0xf0/0xf0
 __rpm_callback+0x41/0x170
 rpm_callback+0x35/0x70
 ? pci_pm_freeze_noirq+0xf0/0xf0
 rpm_resume+0x5ee/0x820
 pm_runtime_work+0x7c/0xb0
 process_one_work+0x276/0x570
 worker_thread+0x53/0x390
 ? _raw_spin_unlock_irqrestore+0x34/0x60
 ? process_one_work+0x570/0x570
 kthread+0xdb/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ