Message-ID: <c307ba94-0d8c-3cbf-19da-44ee31751428@amd.com>
Date:   Wed, 31 May 2023 17:44:57 +0530
From:   K Prateek Nayak <kprateek.nayak@....com>
To:     Sandeep Dhavale <dhavale@...gle.com>, Tejun Heo <tj@...nel.org>
Cc:     jiangshanlai@...il.com, torvalds@...ux-foundation.org,
        peterz@...radead.org, linux-kernel@...r.kernel.org,
        kernel-team@...a.com, joshdon@...gle.com, brho@...gle.com,
        briannorris@...omium.org, nhuck@...gle.com, agk@...hat.com,
        snitzer@...nel.org, void@...ifault.com, kernel-team@...roid.com
Subject: Re: [PATCH 14/24] workqueue: Generalize unbound CPU pods

Hello Sandeep,

I too am seeing a similar crash with the same call stack, albeit with a
different error, a little while after the kernel boots. I'll inline
the details below.

On 5/31/2023 2:48 AM, Sandeep Dhavale wrote:
> Hi Tejun,
> 
>> @@ -6234,6 +6256,7 @@ static inline void wq_watchdog_init(void) { }
>>   */
>>  void __init workqueue_init_early(void)
>>  {
>> +       struct wq_pod_type *pt = &wq_pod_types[WQ_AFFN_SYSTEM];
>>         int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
>>         int i, cpu;
>>
>> @@ -6248,6 +6271,22 @@ void __init workqueue_init_early(void)
>>         wq_update_pod_attrs_buf = alloc_workqueue_attrs();
>>         BUG_ON(!wq_update_pod_attrs_buf);
>>
>> +       /* initialize WQ_AFFN_SYSTEM pods */
>> +       pt->pod_cpus = kcalloc(1, sizeof(pt->pod_cpus[0]), GFP_KERNEL);
>> +       pt->pod_node = kcalloc(1, sizeof(pt->pod_node[0]), GFP_KERNEL);
>> +       pt->cpu_pod = kcalloc(nr_cpu_ids, sizeof(pt->cpu_pod[0]), GFP_KERNEL);
>> +       BUG_ON(!pt->pod_cpus || !pt->pod_node || !pt->cpu_pod);
>> +
>> +       BUG_ON(!zalloc_cpumask_var_node(&pt->pod_cpus[0], GFP_KERNEL, NUMA_NO_NODE));
>> +
>> +       wq_update_pod_attrs_buf = alloc_workqueue_attrs();
>> +       BUG_ON(!wq_update_pod_attrs_buf);
>> +
> 
> Looks like allocation for wq_update_pod_attrs_buf is already being
> done in the preceding code block.
> 
> I am trying to evaluate this series to see if it helps with the
> scheduling delays we have seen in EROFS.
> In addition to the panic and fix reported by Prateek [0], I am having
> stability issues only with the series applied.
> I am testing with Pixel 6 and android-mainline kernel [1]
> 
> The panic seems to be in the context of kworker for events_unbound wq.
> The only significant change directly to events_unbound wq was in patch [2]
> 
> @@ -6399,7 +6335,7 @@ void __init workqueue_init_early(void)
>   system_highpri_wq = alloc_workqueue("events_highpri", WQ_HIGHPRI, 0);
>   system_long_wq = alloc_workqueue("events_long", 0, 0);
>   system_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND,
> -    WQ_UNBOUND_MAX_ACTIVE);
> +    WQ_MAX_ACTIVE);
>   system_freezable_wq = alloc_workqueue("events_freezable",
>        WQ_FREEZABLE, 0);
>   system_power_efficient_wq = alloc_workqueue("events_power_efficient",
> 
> Panic log:
> [  316.386684][  T115] Unable to handle kernel paging request at
> virtual address ffffffd2745a0160
> [  316.386936][  T115] Mem abort info:
> [  316.387027][  T115]   ESR = 0x0000000096000007
> [  316.387137][  T115]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  316.387284][  T115]   SET = 0, FnV = 0
> [  316.387378][  T115]   EA = 0, S1PTW = 0
> [  316.387475][  T115]   FSC = 0x07: level 3 translation fault
> [  316.387606][  T115] Data abort info:
> [  316.387694][  T115]   ISV = 0, ISS = 0x00000007
> [  316.387804][  T115]   CM = 0, WnR = 0
> [  316.387897][  T115] swapper pgtable: 4k pages, 39-bit VAs,
> pgdp=0000000081dec000
> [  316.388071][  T115] [ffffffd2745a0160] pgd=10000009d83ff003,
> p4d=10000009d83ff003, pud=10000009d83ff003, pmd=10000009d83fb003,
> pte=0000000000000000
> [  316.388491][  T115] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP
> [  316.388765][  T115] debug-snapshot dss: core register saved(CPU:2)
> [  316.388993][  T115] debug-snapshot dss: ECC error check erridr_el1.num = 0x2
> [  316.389260][  T115] debug-snapshot dss: ERRSELR_EL1.SEL = 0, NOT
> Error, ERXSTATUS_EL1 = 0x0
> [  316.389578][  T115] debug-snapshot dss: ERRSELR_EL1.SEL = 1, NOT
> Error, ERXSTATUS_EL1 = 0x0
> [  316.389898][  T115] debug-snapshot dss: context saved(CPU:2)
> [  316.390112][  T115] item - log_kevents is disabled
> [  316.390300][  T115] Modules linked in: sec_touch(OE) ftm5(OE)
> bcmdhd4389(OE) goog_touch_interface(OE) snd_soc_cs40l2x(OE)
> haptics_cs40l2x(OE) google_dock(OE) lwis(OE) panel_boe_nt37290(OE)
> panel_samsung_s6e3hc4(OE) panel_samsung_s6e3hc3_c10(OE)
> panel_samsung_s6e3fc3_p10(OE) stmvl53l1(OE) slg51000_core(OE)
> slg51000_regulator(OE) pinctrl_slg51000(OE) nfc mac802154
> ieee802154_socket ieee802154_6lowpan ieee802154 nhc_udp nhc_routing
> nhc_mobility nhc_ipv6 nhc_hop nhc_fragment nhc_dest 6lowpan diag tipc
> mac80211 l2tp_ppp l2tp_core hidp rfcomm can_gw can_bcm can_raw can
> cfg80211 8021q btsdio hci_uart btqca btbcm bluetooth ftdi_sio
> usbserial cdc_acm r8153_ecm aqc111 cdc_ncm cdc_eem cdc_ether
> ax88179_178a asix usbnet r8152 rtl8150 pptp pppox ppp_mppe ppp_deflate
> bsd_comp ppp_generic slhc slcan vcan can_dev mii libarc4 bigocean(OE)
> st33spi(OE) st54spi(OE) st21nfc(OE) nitrous(OE) rfkill
> exynos_reboot(OE) heatmap(OE) touch_bus_negotiator(OE)
> touch_offload(OE) aoc_alsa_dev(OE) aoc_alsa_dev_util(OE)
> aoc_uwb_platform_drv(OE)
> [  316.390708][  T115]  aoc_uwb_service_dev(OE) aoc_channel_dev(OE)
> aoc_control_dev(OE) aoc_char_dev(OE) aoc_core(OE) mailbox_wc(OE)
> audiometrics(OE) snd_soc_cs35l41_i2c(OE) snd_soc_cs35l41_spi(OE)
> snd_soc_cs35l41(OE) snd_soc_wm_adsp(OE) max20339(OE) pca9468(OE)
> p9221(OE) max77759_charger(OE) max77729_charger(OE) max77729_uic(OE)
> max77729_pmic(OE) max1720x_battery(OE) overheat_mitigation(OE)
> google_cpm(OE) google_dual_batt_gauge(OE) google_charger(OE)
> google_battery(OE) google_bms(OE) abrolhos(OE) mali_kbase(OE)
> mali_pixel(OE) panel_samsung_s6e3hc3(OE) panel_samsung_sofef01(OE)
> panel_samsung_s6e3fc3(OE) panel_samsung_s6e3hc2(OE)
> panel_samsung_emul(OE) panel_samsung_drv(OE) exynos_drm(OE)
> arm_memlat_mon(OE) governor_memlat(OE) memlat_devfreq(OE)
> exynos_acme(OE) s3c2410_wdt(OE) trusty_virtio(OE) trusty_test(OE)
> trusty_log(OE) trusty_irq(OE) gs101_spmic_thermal(OE) gpu_cooling(OE)
> debug_reboot(OE) smfc(OE) exynos_mfc(OE) i2c_exynos5(OE)
> rtc_s2mpg10(OE) keycombo(OE) goodixfp(OE) usbc_cooling_dev(OE)
> tcpci_max77759(OE)
> [  316.393987][  T115]  max77759_contaminant(OE) bc_max77759(OE)
> max77759_helper(OE) tcpci_fusb307(OE) slg46826(OE) usb_psy(OE)
> usb_f_dm1(OE) usb_f_dm(OE) xhci_exynos(OE) ufs_exynos_gs(OE)
> s2mpg1x_gpio(OE) bcm47765(OE) sscoredump(OE) sbb_mux(OE) gsc_spi(OE)
> g2d(OE) samsung_iommu(OE) samsung_iommu_group(OE) exyswd_rng(OE)
> exynos_tty(OE) max77826_gs_regulator(OE) boot_control_sysfs(OE)
> exynos_seclog(OE) dbgcore_dump(OE) pixel_stat_mm(OE)
> pixel_stat_sysfs(OE) sysrq_hook(OE) hardlockup_debug(OE) eh(OE)
> cp_thermal_zone(OE) cpif(OE) bts(OE) exynos_dit(OE) cpif_page(OE)
> boot_device_spi(OE) bcm_dbg(OE) exynos_bcm_dbg_dump(OE) gsa_gsc(OE)
> slc_acpm(OE) slc_pmon(OE) slc_dummy(OE) acpm_mbox_test(OE)
> exynos_devfreq(OE) exynos_dm(OE) slc_pt(OE) power_stats(OE)
> exynos_pd_dbg(OE) pixel_em(OE) gs_thermal(OE) google_bcl(OE)
> i2c_acpm(OE) s2mpg11_regulator(OE) s2mpg10_regulator(OE) odpm(OE)
> s2mpg10_powermeter(OE) s2mpg10_mfd(OE) s2mpg11_powermeter(OE)
> pmic_class(OE) s2mpg11_mfd(OE) exynos_cpuhp(OE) pixel_boot_metrics(OE)
> exynos_adv_tracer_s2d(OE)
> [  316.397483][  T115]  keydebug(OE) exynos_coresight_etm(OE)
> exynos_ecc_handler(OE) exynos_coresight(OE) exynos_debug_test(OE)
> pixel_debug_test(OE) ehld(OE) sjtag_driver(OE) exynos_adv_tracer(OE)
> gsa(OE) trusty_ipc(OE) samsung_dma_heap(OE) trusty_core(OE)
> samsung_secure_iova(OE) deferred_free_helper(OE) page_pool(OE)
> hardlockup_watchdog(OE) debug_snapshot_debug_kinfo(OE)
> debug_snapshot_qd(OE) debug_snapshot_sfrdump(OE) exynos_pd(OE)
> dwc3_exynos_usb(OE) gvotable(OE) clk_exynos_gs(OE) pcie_exynos_gs(OE)
> exynos_pm(OE) acpm_flexpmu_dbg(OE) pcie_exynos_gs101_rc_cal(OE)
> shm_ipc(OE) spi_s3c64xx(OE) samsung_dma(OE) pl330(OE) s2mpu(OE)
> logbuffer(OE) itmon(OE) exynos_cpupm(OE) exynos_mct(OE) cmupmucal(OE)
> exynos_pm_qos(OE) gs_acpm(OE) kernel_top(OE) dss(OE)
> pixel_suspend_diag(OE) systrace(OE) ect_parser(OE) gs_chipid(OE)
> pinctrl_exynos_gs(OE) phy_exynos_mipi(OE) phy_exynos_mipi_dsim(OE)
> exynos_pmu_if(OE) phy_exynos_usbdrd_super(OE) exynos_pd_el3(OE)
> arm_dsu_pmu(E) softdog(E) pps_gpio(E) i2c_dev(E) spidev(E) sg(E)
> at24(E) zram zsmalloc
> [  316.404101][  T115] CPU: 2 PID: 115 Comm: kworker/u24:2 Tainted: G
>       W  OE      6.3.0-mainline-maybe-dirty #1
> [  316.404491][  T115] Hardware name: Oriole DVT (DT)
> [  316.404678][  T115] Workqueue: events_unbound idle_cull_fn
> [  316.404882][  T115] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT
> -SSBS BTYPE=--)
> [  316.405176][  T115] pc : available_idle_cpu+0x20/0x60
> [  316.405368][  T115] lr : select_task_rq_fair+0x1d0/0x17d8
> [  316.405574][  T115] sp : ffffffc008dfbb40
> [  316.405728][  T115] x29: ffffffc008dfbc10 x28: 0000000000000000
> x27: 0000000000000008
> [  316.406028][  T115] x26: 0000000000000000 x25: 0000000000000001
> x24: 0000000000000008
> [  316.406323][  T115] x23: 0000000000000000 x22: 0000000000000400
> x21: 0000000000000000
> [  316.406623][  T115] x20: 0000000000000008 x19: ffffff8800812380
> x18: ffffffc008cdf040
> [  316.406925][  T115] x17: 00000000aa3494c0 x16: 00000000aa3494c0
> x15: 0000000000019ed5
> [  316.407221][  T115] x14: 0000000000000001 x13: 000000000001a2d5
> x12: 0000000000000010
> [  316.407521][  T115] x11: 0000000000000400 x10: de8448a6b7c5d500 x9
> : ffffffd27459f6c0
> [  316.407822][  T115] x8 : ffffffd27459f6c0 x7 : 0000000000008080 x6
> : 0000000000000000
> [  316.408118][  T115] x5 : ffffff894f35c590 x4 : 0000646e756f626e x3
> : 0000000000000008
> [  316.408418][  T115] x2 : 0000000000000001 x1 : ffffff8800812380 x0
> : 0000000000000008
> [  316.408724][  T115] Call trace:
> [  316.408842][  T115]  available_idle_cpu+0x20/0x60
> [  316.409020][  T115]  try_to_wake_up+0x4ec/0x85c
> [  316.409190][  T115]  wake_up_process+0x18/0x28
> [  316.409359][  T115]  wake_dying_workers+0x5c/0xe8
> [  316.409539][  T115]  idle_cull_fn+0xdc/0x11c
> [  316.409705][  T115]  process_scheduled_works+0x208/0x45c
> [  316.409905][  T115]  worker_thread+0x22c/0x31c
> [  316.410074][  T115]  kthread+0x114/0x1c0
> [  316.410229][  T115]  ret_from_fork+0x10/0x20
> [  316.410399][  T115] Code: b00105c9 911b0129 f8605908 8b090108 (f9455109)
> [  316.410651][  T115] ---[ end trace 0000000000000000 ]---
> [  316.410853][  T115] Kernel panic - not syncing: Oops: Fatal exception
> [  316.411097][  T115] SMP: stopping secondary CPUs
> 
> Do you think the change in patch [2] could be related?

I have hit the following two errors, both at the exact same RIP:

1) General Protection Fault

    [  320.476222] general protection fault, probably for non-canonical address 0xfbcb2fe8ef894d01: 0000 [#1] PREEMPT SMP NOPTI
    [  320.487110] CPU: 16 PID: 1553 Comm: kworker/u512:1 Not tainted 6.4.0-rc1-tj-wq-please-boot+ #457
    [  320.495289] Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.7.3 03/30/2022
    [  320.502855] Workqueue: events_unbound idle_cull_fn
    [  320.507663] RIP: 0010:select_task_rq_fair+0x9bd/0x2570
    [  320.512812] Code: ff 0f 1f 44 00 00 49 c7 c6 28 15 02 00 48 81 bd 60 ff ff ff ff 1f 00 00 0f 87 dc 17 00 00 4d 01 f5 49 8b 45 00 48 85 c0 74 0b <8b> 40 08 85 c0 0f 85 36 11 00 00 8b 75 98 8b 7d a8 e8 7d 01 ff ff
    [  320.531559] RSP: 0018:ffffb7ba505c3c58 EFLAGS: 00010086
    [  320.536784] RAX: fbcb2fe8ef894cf9 RBX: ffffffffa5454538 RCX: 0000000000000010
    [  320.543916] RDX: 542058454d4f4400 RSI: 0000000000000100 RDI: 0000000000000080
    [  320.551050] RBP: ffffb7ba505c3db8 R08: 0000000000000000 R09: 0000000000000012
    [  320.558182] R10: ffff9db1c0159620 R11: ffffffffffffffff R12: ffff9df03d633840
    [  320.565315] R13: ffffffffa5454528 R14: 0000000000021528 R15: ffff9db1cb1b8000
    [  320.572447] FS:  0000000000000000(0000) GS:ffff9df03d600000(0000) knlGS:0000000000000000
    [  320.580535] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  320.586280] CR2: 000055c6dc75d008 CR3: 000000807d43c004 CR4: 0000000000770ee0
    [  320.593414] PKRU: 55555554
    [  320.596126] Call Trace:
    [  320.598581]  <TASK>
    [  320.600687]  ? raw_spin_rq_unlock+0x14/0x40
    [  320.604877]  ? affine_move_task+0x29c/0x580
    [  320.609065]  ? update_load_avg+0x82/0x790
    [  320.613079]  ? __set_cpus_allowed_ptr_locked+0x146/0x1c0
    [  320.618390]  try_to_wake_up+0x121/0x690
    [  320.622230]  wake_up_process+0x19/0x20
    [  320.625983]  idle_cull_fn+0x9d/0x130
    [  320.629560]  process_one_work+0x190/0x360
    [  320.633576]  worker_thread+0x2c7/0x440
    [  320.637326]  ? __pfx_worker_thread+0x10/0x10
    [  320.641600]  kthread+0xfb/0x130
    [  320.644755]  ? __pfx_kthread+0x10/0x10
    [  320.648507]  ret_from_fork+0x2c/0x50
    [  320.652097]  </TASK>
    [  320.654288] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge
    stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio overlay binfmt_misc ipmi_ssif nls_iso8859_1 intel_rapl_msr intel_rapl_common amd64_edac kvm_amd kvm rapl dell_smbios dcdbas dell_wmi_descriptor wmi_bmof ccp ptdma
    k10temp acpi_ipmi ipmi_si acpi_power_meter mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4
    btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel
    drm_shmem_helper sha512_ssse3 drm_kms_helper syscopyarea sysfillrect aesni_intel sysimgblt crypto_simd cryptd tg3 xhci_pci drm
    [  320.654405]  xhci_pci_renesas megaraid_sas wmi
    [  320.748401] ---[ end trace 0000000000000000 ]---

2) NULL Pointer Dereference

    [  320.700972] BUG: kernel NULL pointer dereference, address: 0000000000000007
    [  320.707942] #PF: supervisor read access in kernel mode
    [  320.713079] #PF: error_code(0x0000) - not-present page
    [  320.718220] PGD 0 P4D 0
    [  320.720758] Oops: 0000 [#1] PREEMPT SMP NOPTI
    [  320.725118] CPU: 200 PID: 3718 Comm: kworker/u522:2 Not tainted 6.4.0-rc1-tj-wq-test+ #470
    [  320.733376] Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.7.3 03/30/2022
    [  320.740942] Workqueue: events_unbound idle_cull_fn
    [  320.745744] RIP: 0010:select_task_rq_fair+0x9bd/0x2570
    [  320.750883] Code: ff 0f 1f 44 00 00 49 c7 c6 28 15 02 00 48 81 bd 60 ff ff ff ff 1f 00 00 0f 87 dc 17 00 00 4d 01 f5 49 8b 45 00 48 85 c0 74 0b <8b> 40 08 85 c0 0f 85 36 11 00 00 8b 75 98 8b 7d a8 e8 7d 01 ff ff
    [  320.769628] RSP: 0018:ffff9d9bd663fc58 EFLAGS: 00010086
    [  320.774856] RAX: ffffffffffffffff RBX: ffffffffafc54538 RCX: 00000000000000c8
    [  320.781989] RDX: cccccccccccccccc RSI: 0000000000000100 RDI: 0000000000000000
    [  320.789122] RBP: ffff9d9bd663fdb8 R08: 0000000000000000 R09: 0000000000000001
    [  320.796254] R10: ffff8f73801599c0 R11: ffffffffffffffff R12: ffff8ff1f3e33840
    [  320.803388] R13: ffffffffafc54528 R14: 0000000000021528 R15: ffff8fb306fe4d40
    [  320.810519] FS:  0000000000000000(0000) GS:ffff8ff1f3e00000(0000) knlGS:0000000000000000
    [  320.818606] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  320.824353] CR2: 0000000000000007 CR3: 000000807d43c003 CR4: 0000000000770ee0
    [  320.831484] PKRU: 55555554
    [  320.834197] Call Trace:
    [  320.836651]  <TASK>
    [  320.838760]  ? raw_spin_rq_unlock+0x14/0x40
    [  320.842944]  ? affine_move_task+0x29c/0x580
    [  320.847129]  ? update_load_avg+0x82/0x790
    [  320.851144]  ? __set_cpus_allowed_ptr_locked+0x146/0x1c0
    [  320.856453]  try_to_wake_up+0x121/0x690
    [  320.860295]  wake_up_process+0x19/0x20
    [  320.864046]  idle_cull_fn+0x9d/0x130
    [  320.867625]  process_one_work+0x190/0x360
    [  320.871638]  ? __pfx_worker_thread+0x10/0x10
    [  320.875912]  worker_thread+0x2c7/0x440
    [  320.879665]  ? __pfx_worker_thread+0x10/0x10
    [  320.883935]  kthread+0xfb/0x130
    [  320.887083]  ? __pfx_kthread+0x10/0x10
    [  320.890837]  ret_from_fork+0x2c/0x50
    [  320.894414]  </TASK>
    [  320.896608] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge
    stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio overlay binfmt_misc ipmi_ssif nls_iso8859_1 intel_rapl_msr intel_rapl_common amd64_edac kvm_amd kvm rapl dell_smbios dcdbas dell_wmi_descriptor wmi_bmof ccp ptdma
    k10temp acpi_ipmi ipmi_si acpi_power_meter mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables
    autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper syscopyarea
    crct10dif_pclmul crc32_pclmul sysfillrect ghash_clmulni_intel sha512_ssse3 sysimgblt aesni_intel crypto_simd cryptd tg3 drm xhci_pci
    [  320.896686]  xhci_pci_renesas megaraid_sas wmi
    [  320.990684] CR2: 0000000000000007
    [  320.994006] ---[ end trace 0000000000000000 ]---

The RIP points to the dereference of sd_llc_shared->has_idle_cores:

    $ scripts/faddr2line vmlinux select_task_rq_fair+0x9bd
    select_task_rq_fair+0x9bd/0x2570:
    test_idle_cores at kernel/sched/fair.c:6830
    (inlined by) select_idle_sibling at kernel/sched/fair.c:7189
    (inlined by) select_task_rq_fair at kernel/sched/fair.c:7710
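
As a cross-check of the two reports above: the faulting bytes marked in
both "Code:" lines, <8b> 40 08, decode to "mov 0x8(%rax),%eax", so the
fault address should be RAX + 8. That would be consistent with
has_idle_cores sitting at offset 8 of the structure sd_llc_shared
points to, although the offset is an assumption not verified against
this tree. Below is a minimal C sketch of the arithmetic. The helper
names are hypothetical, and 48-bit virtual addresses (4-level paging)
are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, i.e. 48-bit virtual addresses. */
#define VA_BITS 48

/* Hypothetical helper: address read by "mov 0x8(%rax),%eax". */
static inline uint64_t fault_address(uint64_t rax)
{
        /* Unsigned arithmetic wraps modulo 2^64, as the CPU does. */
        return rax + 8;
}

/*
 * Hypothetical helper: an address is canonical when bits 63..47 are
 * all zeroes or all ones (sign extension of bit 47).
 */
static inline bool is_canonical(uint64_t addr)
{
        uint64_t upper = addr >> (VA_BITS - 1);

        assert(VA_BITS > 1 && VA_BITS <= 64);
        return upper == 0 || upper == (UINT64_MAX >> (VA_BITS - 1));
}
```

Feeding in RAX from the first report (0xfbcb2fe8ef894cf9) gives the
non-canonical 0xfbcb2fe8ef894d01 named in the general protection fault,
and RAX from the second report (0xffffffffffffffff) wraps around to
0x7, which matches CR2. In both cases the pointer loaded into RAX looks
like garbage rather than a valid sd_llc_shared.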

My kernel is somewhat stable (I have not seen a panic for ~45 min, but I
was not stress testing the system during that time either) with the
following change:

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b2e914655f05..a279cc9c2248 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2247,7 +2247,7 @@ static void unbind_worker(struct worker *worker)
        if (cpumask_intersects(wq_unbound_cpumask, cpu_active_mask))
                WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, wq_unbound_cpumask) < 0);
        else
-               WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
+               WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_active_mask) < 0);
 }

 static void wake_dying_workers(struct list_head *cull_list)
--

However, the bits above were not directly changed by this patch and have
been in workqueue.c since commit 46a4d679ef88 ("workqueue: Avoid a false
warning in unbind_workers()"). I can only suspect something else changed
that has uncovered another issue in my case. You can give it a try and
see if it helps your case too.

I'll wait for Tejun's response, however, since I have no explanation as
to why the above workaround improves system stability in my case :)

> 
> Thanks,
> Sandeep.
> 
> [0] https://lore.kernel.org/all/30625cdd-4d61-594b-8db9-6816b017dde3@amd.com/
> [1] https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline
> [2] https://lore.kernel.org/all/20230519001709.2563-10-tj@kernel.org/

--
Thanks and Regards,
Prateek
