Message-ID: <64deddc1-661f-4a29-a79e-e3e08af0d789@linaro.org>
Date: Wed, 22 May 2024 12:00:31 +0200
From: Neil Armstrong <neil.armstrong@...aro.org>
To: Nícolas F. R. A. Prado <nfraprado@...labora.com>,
 regressions@...ts.linux.dev
Cc: linux-kernel@...r.kernel.org, kernel@...labora.com
Subject: Re: [REGRESSION] boot regression on linux-next on sc7180 platforms -
 null pointer dereference on iommu_dma_sync_sg_for_device

Hi,

On 14/05/2024 18:41, Nícolas F. R. A. Prado wrote:
> Hi,
> 
> KernelCI has identified a new boot regression on linux-next. It affects the
> following platforms:
> * sc7180-trogdor-kingoftown
> * sc7180-trogdor-lazor-limozeen

I also see the regression on:
- SM8550-QRD
- SM8650-QRD

Reverting commit 8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") makes the issue go away.
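
For context, here is a minimal user-space sketch of why a dropped orig_nents check could produce this fault. It assumes, from the commit subject only, that the SPI core used to skip the scatterlist sync when orig_nents was zero. The types and helpers (model_sg, model_sg_table, flags_load_address, model_sync_guarded) are hypothetical stand-ins, not the kernel's definitions, and placing the flags field at offset 0x1c is an assumption chosen to match the faulting address in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a scatterlist entry (assumed 64-bit layout). */
struct model_sg {
	uint64_t page_link;
	uint32_t offset;
	uint32_t length;
	uint64_t dma_address;
	uint32_t dma_length;
	uint32_t dma_flags;	/* lands at offset 0x1c in this model */
};

/* Simplified stand-in for a scatter-gather table. */
struct model_sg_table {
	struct model_sg *sgl;	/* NULL when nothing was mapped */
	unsigned int nents;
	unsigned int orig_nents;
};

/* Address that a load of sgl->dma_flags would touch. */
static uintptr_t flags_load_address(const struct model_sg *sgl)
{
	return (uintptr_t)sgl + offsetof(struct model_sg, dma_flags);
}

/*
 * Guarded sync: skip tables that were never populated.
 * Returns 0 if skipped, 1 if the first entry was read.
 */
static int model_sync_guarded(const struct model_sg_table *sgt,
			      uint32_t *flags_out)
{
	if (!sgt->orig_nents)
		return 0;	/* without this check, sgl may be NULL below */

	*flags_out = sgt->sgl->dma_flags;
	return 1;
}
```

In this model, flags_load_address(NULL) evaluates to 0x1c, which is the virtual address reported in the oops, and the guard is what keeps an empty table from reaching the load.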

Thanks for reporting this,
Neil

[    6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[    6.413685] Mem abort info:
[    6.416574]   ESR = 0x0000000096000006
[    6.420436]   EC = 0x25: DABT (current EL), IL = 32 bits
[    6.425901]   SET = 0, FnV = 0
[    6.429046]   EA = 0, S1PTW = 0
[    6.432293]   FSC = 0x06: level 2 translation fault
[    6.437320] Data abort info:
[    6.440289]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[    6.445927]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    6.451121]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    6.456585] user pgtable: 4k pages, 48-bit VAs, pgdp=000000088f68b000
[    6.463208] [000000000000001c] pgd=080000088f68d003, p4d=080000088f68d003, pud=080000088f68e003, pmd=0000000000000000
[    6.474108] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
[    6.480542] Modules linked in: ucsi_glink pmic_glink_altmode goodix_berlin_spi(+) nb7vpq904m wcd939x_usbss qcom_battmgr typec_ucsi aux_hpd_bridge goodix_berlin_core crct10dif_ce hci_uart rtc_pm8xxx leds_qcom_lpg led_class_multicolor qcom_pon nvmem_qcom_spmi_sdam sm3_ce qcom_pbs btqca snd_soc_wcd939x snd_soc_sc8280xp snd_soc_wcd939x_sdw phy_qcom_eusb2_repeater snd_soc_qcom_sdw regmap_sdw qcom_spmi_temp_alarm snd_soc_qcom_common btbcm snd_soc_wcd_mbhc sm3 qcom_stats snd_soc_wcd_classh drm_dp_aux_bus sha3_ce gpu_sched sha512_ce sha512_arm64 drm_exec bluetooth qcom_q6v5_pas phy_qcom_qmp_combo qcrypto soundwire_qcom qcom_pil_info snd_soc_lpass_va_macro pinctrl_sm8650_lpass_lpi authenc snd_soc_lpass_tx_macro aux_bridge cfg80211 spi_geni_qcom i2c_qcom_geni snd_soc_lpass_rx_macro rfkill phy_qcom_snps_eusb2 dispcc_sm8650 drm_display_helper pinctrl_lpass_lpi gpi snd_soc_lpass_wsa_macro snd_soc_lpass_macro_common slimbus drm_kms_helper gpucc_sm8650 ipa qcom_q6v5 qrtr libdes phy_qcom_qmp_ufs qcom_sysmon qcom_common
[    6.480602]  qcom_glink_smem
[    6.571649]  soundwire_bus mdt_loader pmic_glink qcom_rng phy_qcom_qmp_pcie llcc_qcom ufs_qcom icc_bwmon typec rmtfs_mem pdr_interface qmi_helpers nvmem_reboot_mode socinfo fuse drm backlight ipv6
[    6.597201] CPU: 4 PID: 241 Comm: (udev-worker) Tainted: G S                 6.9.0-next-20240521 #1
[    6.606488] Hardware name: Qualcomm Technologies, Inc. SM8650 QRD (DT)
[    6.613189] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[    6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
[    6.646632] sp : ffff800081bf3260
[    6.660650] x26: ffff59520fbd1c80 x25: 0000000000000000 x24: ffffb46fccd24988
[    6.660653] x23: ffff595201628410 x22: 0000000000000002 x21: 0000000000000000
[    6.660655] x20: ffff800081bf33f0 x19: 0000000000000000 x18: 0000000000000001
[    6.660656] x17: 0000000000000018 x16: 0000000000000100 x15: 0000000000000002
[    6.688275] x14: 0000000000000001 x13: ffff595200995180 x12: 000000000025a5c8
[    6.688277] x11: 0000000000000820 x10: 0000000000000001 x9 : ffff59520fbd1c69
[    6.688279] x8 : ffff595202169704 x7 : 00000000ffffffff x6 : 0000000000000001
[    6.688281] x5 : fffffdffbf7a8cc0 x4 : ffffb46fcc0232a4 x3 : 0000000000000002
[    6.688283] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff595201628410
[    6.688286] Call trace:
[    6.688287]  iommu_dma_sync_sg_for_device+0x28/0x100
[    6.717582]  __dma_sync_sg_for_device+0x3c/0x40
[    6.717585]  spi_transfer_one_message+0x358/0x680
[    6.732229]  __spi_pump_transfer_message+0x188/0x494
[    6.732232]  __spi_sync+0x2a8/0x3c4
[    6.732234]  spi_sync+0x30/0x54
[    6.732236]  goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
[    6.739854]  _regmap_raw_write_impl+0x538/0x674
[    6.750053]  _regmap_raw_write+0xb4/0x144
[    6.750056]  regmap_raw_write+0x7c/0xc0
[    6.750058]  goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
[    6.765520]  goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
[    6.765522]  goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
[    6.772339]  spi_probe+0x84/0xe4
[    6.772342]  really_probe+0xbc/0x29c
[    6.784313]  __driver_probe_device+0x78/0x12c
[    6.784316]  driver_probe_device+0x3c/0x15c
[    6.784319]  __driver_attach+0x90/0x19c
[    6.784322]  bus_for_each_dev+0x7c/0xdc
[    6.794520]  driver_attach+0x24/0x30
[    6.794523]  bus_add_driver+0xe4/0x208
[    6.794526]  driver_register+0x5c/0x124
[    6.802586]  __spi_register_driver+0xa4/0xe4
[    6.802589]  goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
[    6.802591]  do_one_initcall+0x80/0x1c8
[    6.902310]  do_init_module+0x60/0x218
[    6.921988]  load_module+0x1bcc/0x1d8c
[    6.925847]  init_module_from_file+0x88/0xcc
[    6.930238]  __arm64_sys_finit_module+0x1dc/0x2e4
[    6.935074]  invoke_syscall+0x48/0x114
[    6.938944]  el0_svc_common.constprop.0+0xc0/0xe0
[    6.943781]  do_el0_svc+0x1c/0x28
[    6.947195]  el0_svc+0x34/0xd8
[    6.950348]  el0t_64_sync_handler+0x120/0x12c
[    6.954833]  el0t_64_sync+0x190/0x194
[    6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[    6.964859] ---[ end trace 0000000000000000 ]---
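
As a cross-check of the log above, here is a small sketch that decodes the ESR value and the faulting instruction word from the "Code:" line. The helper names (decode_esr, decode_ldr32_uimm) are made up for illustration. The bit positions follow the Arm architecture's ESR_ELx layout and the A64 "LDR (immediate, unsigned offset)" 32-bit encoding as I understand them, so treat this as a sketch rather than a reference decoder.

```c
#include <stdint.h>

struct esr_fields {
	unsigned int ec;	/* exception class, bits [31:26] */
	unsigned int il;	/* instruction length bit, bit 25 */
	uint32_t iss;		/* instruction specific syndrome, bits [24:0] */
	unsigned int fsc;	/* fault status code, ISS bits [5:0] for data aborts */
};

static struct esr_fields decode_esr(uint64_t esr)
{
	struct esr_fields f;

	f.ec = (unsigned int)((esr >> 26) & 0x3f);
	f.il = (unsigned int)((esr >> 25) & 0x1);
	f.iss = (uint32_t)(esr & 0x1ffffff);
	f.fsc = f.iss & 0x3f;
	return f;
}

struct ldr_uimm {
	int is_match;		/* 1 if the word is a 32-bit LDR, unsigned offset */
	unsigned int rt;	/* destination register number (w<rt>) */
	unsigned int rn;	/* base register number (x<rn>) */
	unsigned int byte_offset;	/* imm12 scaled by the 4-byte access size */
};

static struct ldr_uimm decode_ldr32_uimm(uint32_t insn)
{
	struct ldr_uimm d = { 0, 0, 0, 0 };

	/* Bits [31:22] == 0b1011100101 select LDR Wt, [Xn, #imm]. */
	if ((insn >> 22) != 0x2e5)
		return d;

	d.is_match = 1;
	d.rt = insn & 0x1f;
	d.rn = (insn >> 5) & 0x1f;
	d.byte_offset = ((insn >> 10) & 0xfff) * 4;
	return d;
}
```

Fed with 0x96000006 and 0xb9401c20, this gives EC = 0x25, FSC = 0x06, and a load of w0 from [x1, #0x1c]. With x1 = 0 in the register dump, that matches the reported fault address 0x1c.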

> 
> The regression was introduced in next-20240509, and still affects today's
> (next-20240514) release.
> 
> The config used was the upstream arm64 defconfig with a config fragment on top
> [1].
> 
> The following stack traces are produced during boot and a usable shell is never
> reached:
> 
> [    0.381981] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> [    0.381989] Mem abort info:
> [    0.381991]   ESR = 0x0000000096000004
> [    0.381994]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.381997]   SET = 0, FnV = 0
> [    0.382000]   EA = 0, S1PTW = 0
> [    0.382003]   FSC = 0x04: level 0 translation fault
> [    0.382006] Data abort info:
> [    0.382008]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [    0.382011]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    0.382014]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    0.382017] [000000000000001c] user address but active_mm is swapper
> [    0.382021] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [    0.382025] Modules linked in:
> [    0.382032] CPU: 4 PID: 68 Comm: kworker/u32:2 Not tainted 6.9.0-next-20240514-dirty #380
> [    0.382038] Hardware name: Google Kingoftown (DT)
> [    0.382042] Workqueue: async async_run_entry_fn
> [    0.382055] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.382061] pc : iommu_dma_sync_sg_for_device+0x28/0x100
> [    0.382070] lr : __dma_sync_sg_for_device+0x28/0x4c
> [    0.382080] sp : ffff800080943740
> [    0.382082] x29: ffff800080943740 x28: ffff36ee44280000 x27: ffff36ee40bd7810
> [    0.382092] x26: ffff800080943998 x25: ffff36ee44280480 x24: ffffb54600bcf0e8
> [    0.382101] x23: ffff36ee40bd7810 x22: 0000000000000001 x21: 0000000000000000
> [    0.382110] x20: ffffb54600f3d098 x19: 0000000000000000 x18: ffffb54601c1a210
> [    0.382118] x17: 000000040044ffff x16: 0000000000000000 x15: ffff36efb6d95580
> [    0.382126] x14: ffff36ee409156c0 x13: 0000000000001797 x12: 0000000000000002
> [    0.382134] x11: 0000000000000004 x10: ffff36ee4308b3d8 x9 : ffff36ee44280469
> [    0.382143] x8 : ffff36ee4308b304 x7 : 00000000ffffffff x6 : 0000000000000001
> [    0.382151] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
> [    0.382159] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40bd7810
> [    0.382167] Call trace:
> [    0.382170]  iommu_dma_sync_sg_for_device+0x28/0x100
> [    0.382176]  __dma_sync_sg_for_device+0x28/0x4c
> [    0.382183]  spi_transfer_one_message+0x378/0x6e4
> [    0.382193]  __spi_pump_transfer_message+0x190/0x4a4
> [    0.382199]  __spi_sync+0x2a0/0x3c4
> [    0.382205]  spi_sync_locked+0x10/0x1c
> [    0.382211]  tpm_tis_spi_transfer_full+0x160/0x2fc
> [    0.382217]  tpm_tis_spi_transfer+0x34/0x40
> [    0.382221]  tpm_tis_spi_cr50_read_bytes+0x5c/0x90
> [    0.382226]  tpm_tis_core_init+0xfc/0x7e0
> [    0.382231]  tpm_tis_spi_init+0x54/0x70
> [    0.382236]  cr50_spi_probe+0xf4/0x27c
> [    0.382241]  tpm_tis_spi_driver_probe+0x34/0x64
> [    0.382245]  spi_probe+0x84/0xe4
> [    0.382251]  really_probe+0xbc/0x2a0
> [    0.382258]  __driver_probe_device+0x78/0x12c
> [    0.382264]  driver_probe_device+0x40/0x160
> [    0.382269]  __device_attach_driver+0xb8/0x134
> [    0.382275]  bus_for_each_drv+0x84/0xe0
> [    0.382280]  __device_attach_async_helper+0xac/0xd0
> [    0.382286]  async_run_entry_fn+0x34/0xe0
> [    0.382291]  process_one_work+0x154/0x298
> [    0.382300]  worker_thread+0x304/0x408
> [    0.382307]  kthread+0x118/0x11c
> [    0.382313]  ret_from_fork+0x10/0x20
> [    0.382324] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
> [    0.382328] ---[ end trace 0000000000000000 ]---
> 
> [    0.393379] spi_master spi6: will run message pump with realtime priority
> [    0.393896] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> [    0.393903] Mem abort info:
> [    0.393905]   ESR = 0x0000000096000004
> [    0.393908]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.393912]   SET = 0, FnV = 0
> [    0.393915]   EA = 0, S1PTW = 0
> [    0.393917]   FSC = 0x04: level 0 translation fault
> [    0.393920] Data abort info:
> [    0.393922]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [    0.393925]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    0.393928]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    0.393931] [000000000000001c] user address but active_mm is swapper
> [    0.393935] Internal error: Oops: 0000000096000004 [#2] PREEMPT SMP
> [    0.393939] Modules linked in:
> [    0.393946] CPU: 2 PID: 103 Comm: cros_ec_spi_hig Tainted: G      D            6.9.0-next-20240514-dirty #380
> [    0.393953] Hardware name: Google Kingoftown (DT)
> [    0.393956] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.393962] pc : iommu_dma_sync_sg_for_device+0x28/0x100
> [    0.393975] lr : __dma_sync_sg_for_device+0x28/0x4c
> [    0.393985] sp : ffff800080de3aa0
> [    0.393988] x29: ffff800080de3aa0 x28: ffff36ee44281800 x27: ffff36ee40ff8010
> [    0.393997] x26: ffff800080de3cf8 x25: ffff36ee44281c80 x24: ffffb54600bcf0e8
> [    0.394006] x23: ffff36ee40ff8010 x22: 0000000000000001 x21: 0000000000000000
> [    0.394014] x20: ffffb54600f3d3d8 x19: 0000000000000000 x18: ffffb54601c1a210
> [    0.394023] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
> [    0.394031] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
> [    0.394039] x11: 0000000000000000 x10: 0000000000003fff x9 : ffff36ee44281c69
> [    0.394047] x8 : ffff36ee4103e704 x7 : 00000000ffffffff x6 : 0000000000000001
> [    0.394055] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
> [    0.394063] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40ff8010
> [    0.394071] Call trace:
> [    0.394074]  iommu_dma_sync_sg_for_device+0x28/0x100
> [    0.394081]  __dma_sync_sg_for_device+0x28/0x4c
> [    0.394088]  spi_transfer_one_message+0x378/0x6e4
> [    0.394096]  __spi_pump_transfer_message+0x190/0x4a4
> [    0.394103]  __spi_sync+0x2a0/0x3c4
> [    0.394109]  spi_sync_locked+0x10/0x1c
> [    0.394115]  do_cros_ec_pkt_xfer_spi+0x108/0x530
> [    0.394122]  cros_ec_xfer_high_pri_work+0x20/0x34
> [    0.394127]  kthread_worker_fn+0xcc/0x184
> [    0.394134]  kthread+0x118/0x11c
> [    0.394140]  ret_from_fork+0x10/0x20
> [    0.394150] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
> [    0.394154] ---[ end trace 0000000000000000 ]---
> 
> [    3.654117] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> [    3.663154] Mem abort info:
> [    3.666032]   ESR = 0x0000000096000004
> [    3.669943]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    3.675417]   SET = 0, FnV = 0
> [    3.678563]   EA = 0, S1PTW = 0
> [    3.681792]   FSC = 0x04: level 0 translation fault
> [    3.686808] Data abort info:
> [    3.689765]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [    3.695399]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    3.700592]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    3.706050] [000000000000001c] user address but active_mm is swapper
> [    3.712576] Internal error: Oops: 0000000096000004 [#3] PREEMPT SMP
> [    3.719017] Modules linked in:
> [    3.722162] CPU: 6 PID: 11 Comm: kworker/u32:0 Tainted: G      D            6.9.0-next-20240514-dirty #380
> [    3.732067] Hardware name: Google Kingoftown (DT)
> [    3.736904] Workqueue: events_unbound deferred_probe_work_func
> [    3.742907] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    3.750052] pc : iommu_dma_sync_sg_for_device+0x28/0x100
> [    3.755526] lr : __dma_sync_sg_for_device+0x28/0x4c
> [    3.760548] sp : ffff8000800ab0b0
> [    3.763953] x29: ffff8000800ab0b0 x28: ffff36ee43a6a000 x27: ffff36ee41012010
> [    3.771279] x26: ffff8000800ab2e8 x25: ffff36ee43a6a480 x24: ffffb54600bcf0e8
> [    3.778604] x23: ffff36ee41012010 x22: 0000000000000001 x21: 0000000000000000
> [    3.785928] x20: ffffb54600f3d718 x19: 0000000000000000 x18: ffffb54601c19c48
> [    3.793258] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
> [    3.800589] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
> [    3.807921] x11: 071c71c71c71c71c x10: 0000000000003fff x9 : ffff36ee43a6a469
> [    3.815254] x8 : ffff36ee4101cf04 x7 : 00000000ffffffff x6 : 0000000000000001
> [    3.822587] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
> [    3.829910] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee41012010
> [    3.837234] Call trace:
> [    3.839750]  iommu_dma_sync_sg_for_device+0x28/0x100
> [    3.844853]  __dma_sync_sg_for_device+0x28/0x4c
> [    3.849517]  spi_transfer_one_message+0x378/0x6e4
> [    3.854360]  __spi_pump_transfer_message+0x190/0x4a4
> [    3.859462]  __spi_sync+0x2a0/0x3c4
> [    3.863048]  spi_sync+0x30/0x54
> [    3.866283]  spi_mem_exec_op+0x26c/0x41c
> [    3.870321]  spi_nor_read_id+0x7c/0xc4
> [    3.874180]  spi_nor_detect+0x34/0x158
> [    3.878039]  spi_nor_scan+0x1f0/0xef8
> [    3.881813]  spi_nor_probe+0x94/0x2ec
> [    3.885587]  spi_mem_probe+0x6c/0xac
> [    3.889262]  spi_probe+0x84/0xe4
> [    3.892579]  really_probe+0xbc/0x2a0
> [    3.896262]  __driver_probe_device+0x78/0x12c
> [    3.900747]  driver_probe_device+0x40/0x160
> [    3.905046]  __device_attach_driver+0xb8/0x134
> [    3.909619]  bus_for_each_drv+0x84/0xe0
> [    3.913568]  __device_attach+0xa8/0x1b0
> [    3.917515]  device_initial_probe+0x14/0x20
> [    3.921814]  bus_probe_device+0xa8/0xac
> [    3.925761]  device_add+0x590/0x750
> [    3.929351]  __spi_add_device+0x138/0x208
> [    3.933476]  of_register_spi_device+0x394/0x57c
> [    3.938139]  spi_register_controller+0x394/0x760
> [    3.942888]  qcom_qspi_probe+0x328/0x390
> [    3.946928]  platform_probe+0x68/0xd8
> [    3.950701]  really_probe+0xbc/0x2a0
> [    3.954384]  __driver_probe_device+0x78/0x12c
> [    3.958869]  driver_probe_device+0x40/0x160
> [    3.963169]  __device_attach_driver+0xb8/0x134
> [    3.967734]  bus_for_each_drv+0x84/0xe0
> [    3.971682]  __device_attach+0xa8/0x1b0
> [    3.975628]  device_initial_probe+0x14/0x20
> [    3.979927]  bus_probe_device+0xa8/0xac
> [    3.983873]  deferred_probe_work_func+0x88/0xc0
> [    3.988536]  process_one_work+0x154/0x298
> [    3.992663]  worker_thread+0x304/0x408
> [    3.996525]  kthread+0x118/0x11c
> [    3.999847]  ret_from_fork+0x10/0x20
> [    4.003534] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
> [    4.009788] ---[ end trace 0000000000000000 ]---
> 
> Searching on lore I could only find the following series that caused another
> regression, and its subsequent fix:
> https://lore.kernel.org/lkml/20240507112026.1803778-1-aleksander.lobakin@intel.com/
> https://lore.kernel.org/all/20240509144616.938519-1-aleksander.lobakin@intel.com/
> 
> But even after reverting both the issue was still there, so I've concluded
> that's unrelated.
> 
> Thanks,
> Nícolas
> 
> #regzbot introduced: next-20240509
> 
> [1] https://pastebin.com/raw/sx4bPAa6
> 
