lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Tue, 14 May 2024 12:41:26 -0400
From: NĂ­colas F. R. A. Prado <nfraprado@...labora.com>
To: regressions@...ts.linux.dev
Cc: linux-kernel@...r.kernel.org, kernel@...labora.com
Subject: [REGRESSION] boot regression on linux-next on sc7180 platforms -
 null pointer dereference on iommu_dma_sync_sg_for_device

Hi,

KernelCI has identified a new boot regression on linux-next. It affects the
following platforms:
* sc7180-trogdor-kingoftown
* sc7180-trogdor-lazor-limozeen

The regression was introduced in next-20240509, and still affects today's
(next-20240514) release.

The config used was the upstream arm64 defconfig with a config fragment on top
[1].

The following stack traces are produced during boot and a usable shell is never
reached:

[    0.381981] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[    0.381989] Mem abort info:
[    0.381991]   ESR = 0x0000000096000004
[    0.381994]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.381997]   SET = 0, FnV = 0
[    0.382000]   EA = 0, S1PTW = 0
[    0.382003]   FSC = 0x04: level 0 translation fault
[    0.382006] Data abort info:
[    0.382008]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    0.382011]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.382014]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.382017] [000000000000001c] user address but active_mm is swapper
[    0.382021] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[    0.382025] Modules linked in:
[    0.382032] CPU: 4 PID: 68 Comm: kworker/u32:2 Not tainted 6.9.0-next-20240514-dirty #380
[    0.382038] Hardware name: Google Kingoftown (DT)
[    0.382042] Workqueue: async async_run_entry_fn
[    0.382055] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.382061] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[    0.382070] lr : __dma_sync_sg_for_device+0x28/0x4c
[    0.382080] sp : ffff800080943740
[    0.382082] x29: ffff800080943740 x28: ffff36ee44280000 x27: ffff36ee40bd7810
[    0.382092] x26: ffff800080943998 x25: ffff36ee44280480 x24: ffffb54600bcf0e8
[    0.382101] x23: ffff36ee40bd7810 x22: 0000000000000001 x21: 0000000000000000
[    0.382110] x20: ffffb54600f3d098 x19: 0000000000000000 x18: ffffb54601c1a210
[    0.382118] x17: 000000040044ffff x16: 0000000000000000 x15: ffff36efb6d95580
[    0.382126] x14: ffff36ee409156c0 x13: 0000000000001797 x12: 0000000000000002
[    0.382134] x11: 0000000000000004 x10: ffff36ee4308b3d8 x9 : ffff36ee44280469
[    0.382143] x8 : ffff36ee4308b304 x7 : 00000000ffffffff x6 : 0000000000000001
[    0.382151] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[    0.382159] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40bd7810
[    0.382167] Call trace:
[    0.382170]  iommu_dma_sync_sg_for_device+0x28/0x100
[    0.382176]  __dma_sync_sg_for_device+0x28/0x4c
[    0.382183]  spi_transfer_one_message+0x378/0x6e4
[    0.382193]  __spi_pump_transfer_message+0x190/0x4a4
[    0.382199]  __spi_sync+0x2a0/0x3c4
[    0.382205]  spi_sync_locked+0x10/0x1c
[    0.382211]  tpm_tis_spi_transfer_full+0x160/0x2fc
[    0.382217]  tpm_tis_spi_transfer+0x34/0x40
[    0.382221]  tpm_tis_spi_cr50_read_bytes+0x5c/0x90
[    0.382226]  tpm_tis_core_init+0xfc/0x7e0
[    0.382231]  tpm_tis_spi_init+0x54/0x70
[    0.382236]  cr50_spi_probe+0xf4/0x27c
[    0.382241]  tpm_tis_spi_driver_probe+0x34/0x64
[    0.382245]  spi_probe+0x84/0xe4
[    0.382251]  really_probe+0xbc/0x2a0
[    0.382258]  __driver_probe_device+0x78/0x12c
[    0.382264]  driver_probe_device+0x40/0x160
[    0.382269]  __device_attach_driver+0xb8/0x134
[    0.382275]  bus_for_each_drv+0x84/0xe0
[    0.382280]  __device_attach_async_helper+0xac/0xd0
[    0.382286]  async_run_entry_fn+0x34/0xe0
[    0.382291]  process_one_work+0x154/0x298
[    0.382300]  worker_thread+0x304/0x408
[    0.382307]  kthread+0x118/0x11c
[    0.382313]  ret_from_fork+0x10/0x20
[    0.382324] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[    0.382328] ---[ end trace 0000000000000000 ]---

[    0.393379] spi_master spi6: will run message pump with realtime priority
[    0.393896] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[    0.393903] Mem abort info:
[    0.393905]   ESR = 0x0000000096000004
[    0.393908]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.393912]   SET = 0, FnV = 0
[    0.393915]   EA = 0, S1PTW = 0
[    0.393917]   FSC = 0x04: level 0 translation fault
[    0.393920] Data abort info:
[    0.393922]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    0.393925]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.393928]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.393931] [000000000000001c] user address but active_mm is swapper
[    0.393935] Internal error: Oops: 0000000096000004 [#2] PREEMPT SMP
[    0.393939] Modules linked in:
[    0.393946] CPU: 2 PID: 103 Comm: cros_ec_spi_hig Tainted: G      D            6.9.0-next-20240514-dirty #380
[    0.393953] Hardware name: Google Kingoftown (DT)
[    0.393956] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.393962] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[    0.393975] lr : __dma_sync_sg_for_device+0x28/0x4c
[    0.393985] sp : ffff800080de3aa0
[    0.393988] x29: ffff800080de3aa0 x28: ffff36ee44281800 x27: ffff36ee40ff8010
[    0.393997] x26: ffff800080de3cf8 x25: ffff36ee44281c80 x24: ffffb54600bcf0e8
[    0.394006] x23: ffff36ee40ff8010 x22: 0000000000000001 x21: 0000000000000000
[    0.394014] x20: ffffb54600f3d3d8 x19: 0000000000000000 x18: ffffb54601c1a210
[    0.394023] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
[    0.394031] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
[    0.394039] x11: 0000000000000000 x10: 0000000000003fff x9 : ffff36ee44281c69
[    0.394047] x8 : ffff36ee4103e704 x7 : 00000000ffffffff x6 : 0000000000000001
[    0.394055] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[    0.394063] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40ff8010
[    0.394071] Call trace:
[    0.394074]  iommu_dma_sync_sg_for_device+0x28/0x100
[    0.394081]  __dma_sync_sg_for_device+0x28/0x4c
[    0.394088]  spi_transfer_one_message+0x378/0x6e4
[    0.394096]  __spi_pump_transfer_message+0x190/0x4a4
[    0.394103]  __spi_sync+0x2a0/0x3c4
[    0.394109]  spi_sync_locked+0x10/0x1c
[    0.394115]  do_cros_ec_pkt_xfer_spi+0x108/0x530
[    0.394122]  cros_ec_xfer_high_pri_work+0x20/0x34
[    0.394127]  kthread_worker_fn+0xcc/0x184
[    0.394134]  kthread+0x118/0x11c
[    0.394140]  ret_from_fork+0x10/0x20
[    0.394150] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[    0.394154] ---[ end trace 0000000000000000 ]---

[    3.654117] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[    3.663154] Mem abort info:
[    3.666032]   ESR = 0x0000000096000004
[    3.669943]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.675417]   SET = 0, FnV = 0
[    3.678563]   EA = 0, S1PTW = 0
[    3.681792]   FSC = 0x04: level 0 translation fault
[    3.686808] Data abort info:
[    3.689765]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    3.695399]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    3.700592]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    3.706050] [000000000000001c] user address but active_mm is swapper
[    3.712576] Internal error: Oops: 0000000096000004 [#3] PREEMPT SMP
[    3.719017] Modules linked in:
[    3.722162] CPU: 6 PID: 11 Comm: kworker/u32:0 Tainted: G      D            6.9.0-next-20240514-dirty #380
[    3.732067] Hardware name: Google Kingoftown (DT)
[    3.736904] Workqueue: events_unbound deferred_probe_work_func
[    3.742907] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    3.750052] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[    3.755526] lr : __dma_sync_sg_for_device+0x28/0x4c
[    3.760548] sp : ffff8000800ab0b0
[    3.763953] x29: ffff8000800ab0b0 x28: ffff36ee43a6a000 x27: ffff36ee41012010
[    3.771279] x26: ffff8000800ab2e8 x25: ffff36ee43a6a480 x24: ffffb54600bcf0e8
[    3.778604] x23: ffff36ee41012010 x22: 0000000000000001 x21: 0000000000000000
[    3.785928] x20: ffffb54600f3d718 x19: 0000000000000000 x18: ffffb54601c19c48
[    3.793258] x17: 0000000000010108 x16: 0000000000000000 x15: 000000000000000c
[    3.800589] x14: 0000000000000000 x13: ffff36ee40b962b0 x12: 0000000000000000
[    3.807921] x11: 071c71c71c71c71c x10: 0000000000003fff x9 : ffff36ee43a6a469
[    3.815254] x8 : ffff36ee4101cf04 x7 : 00000000ffffffff x6 : 0000000000000001
[    3.822587] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
[    3.829910] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee41012010
[    3.837234] Call trace:
[    3.839750]  iommu_dma_sync_sg_for_device+0x28/0x100
[    3.844853]  __dma_sync_sg_for_device+0x28/0x4c
[    3.849517]  spi_transfer_one_message+0x378/0x6e4
[    3.854360]  __spi_pump_transfer_message+0x190/0x4a4
[    3.859462]  __spi_sync+0x2a0/0x3c4
[    3.863048]  spi_sync+0x30/0x54
[    3.866283]  spi_mem_exec_op+0x26c/0x41c
[    3.870321]  spi_nor_read_id+0x7c/0xc4
[    3.874180]  spi_nor_detect+0x34/0x158
[    3.878039]  spi_nor_scan+0x1f0/0xef8
[    3.881813]  spi_nor_probe+0x94/0x2ec
[    3.885587]  spi_mem_probe+0x6c/0xac
[    3.889262]  spi_probe+0x84/0xe4
[    3.892579]  really_probe+0xbc/0x2a0
[    3.896262]  __driver_probe_device+0x78/0x12c
[    3.900747]  driver_probe_device+0x40/0x160
[    3.905046]  __device_attach_driver+0xb8/0x134
[    3.909619]  bus_for_each_drv+0x84/0xe0
[    3.913568]  __device_attach+0xa8/0x1b0
[    3.917515]  device_initial_probe+0x14/0x20
[    3.921814]  bus_probe_device+0xa8/0xac
[    3.925761]  device_add+0x590/0x750
[    3.929351]  __spi_add_device+0x138/0x208
[    3.933476]  of_register_spi_device+0x394/0x57c
[    3.938139]  spi_register_controller+0x394/0x760
[    3.942888]  qcom_qspi_probe+0x328/0x390
[    3.946928]  platform_probe+0x68/0xd8
[    3.950701]  really_probe+0xbc/0x2a0
[    3.954384]  __driver_probe_device+0x78/0x12c
[    3.958869]  driver_probe_device+0x40/0x160
[    3.963169]  __device_attach_driver+0xb8/0x134
[    3.967734]  bus_for_each_drv+0x84/0xe0
[    3.971682]  __device_attach+0xa8/0x1b0
[    3.975628]  device_initial_probe+0x14/0x20
[    3.979927]  bus_probe_device+0xa8/0xac
[    3.983873]  deferred_probe_work_func+0x88/0xc0
[    3.988536]  process_one_work+0x154/0x298
[    3.992663]  worker_thread+0x304/0x408
[    3.996525]  kthread+0x118/0x11c
[    3.999847]  ret_from_fork+0x10/0x20
[    4.003534] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[    4.009788] ---[ end trace 0000000000000000 ]---

Searching on lore I could only find the following series that caused another
regression, and its subsequent fix:
https://lore.kernel.org/lkml/20240507112026.1803778-1-aleksander.lobakin@intel.com/
https://lore.kernel.org/all/20240509144616.938519-1-aleksander.lobakin@intel.com/

But even after reverting both the issue was still there, so I've concluded
that's unrelated.

Thanks,
NĂ­colas

#regzbot introduced: next-20240509

[1] https://pastebin.com/raw/sx4bPAa6

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ