[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXv+5HJpTYmQ2h-GD7GjyeYT7bL9EBCvu0mz5LgpzJZtzfW0w@mail.gmail.com>
Date: Mon, 7 Apr 2025 13:17:31 +0800
From: Chen-Yu Tsai <wenst@...omium.org>
To: Joerg Roedel <joro@...tes.org>
Cc: Louis-Alexis Eyraud <louisalexis.eyraud@...labora.com>, Yong Wu <yong.wu@...iatek.com>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
"Rob Herring (Arm)" <robh@...nel.org>, kernel@...labora.com, Joerg Roedel <jroedel@...e.de>,
Jason Gunthorpe <jgg@...pe.ca>, iommu@...ts.linux.dev, linux-mediatek@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2] iommu/mediatek: Fix NULL pointer deference in mtk_iommu_device_group
Hi,
On Thu, Apr 3, 2025 at 6:24 PM Louis-Alexis Eyraud
<louisalexis.eyraud@...labora.com> wrote:
>
> Currently, mtk_iommu calls during probe iommu_device_register before
> the hw_list from driver data is initialized. Since iommu probing issue
> fix, it leads to NULL pointer dereference in mtk_iommu_device_group when
> hw_list is accessed with list_first_entry (not null safe).
>
> So, change the call order to ensure iommu_device_register is called
> after the driver data are initialized.
>
> Fixes: 9e3a2a643653 ("iommu/mediatek: Adapt sharing and non-sharing pgtable case")
> Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
> Reviewed-by: Yong Wu <yong.wu@...iatek.com>
> Tested-by: Chen-Yu Tsai <wenst@...omium.org> # MT8183 Juniper, MT8186 Tentacruel
> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
> Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@...labora.com>
> ---
> This patch fixes a NULL pointer dereference that occurs during the
> mtk_iommu driver probe and observed at least on several Mediatek Genio boards:
> ```
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
This is a reminder to please land this and send to Linus ASAP.
This fixes the v6.15-rc1 kernel on all the MediaTek Chromebook platforms,
except for MT8188, which seems to have another issue in iommu_get_dma_domain()
used from the DRM driver:
Disabling lock debugging due to kernel taint
Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000158
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000001185ab000
[0000000000000158] pgd=0000000000000000, p4d=0000000000000000,
pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] SMP
Modules linked in: mtk_vcodec_dec_hw mtk_vcodec_dec mtk_vcodec_enc
v4l2_vp9 v4l2_h264 mtk_vcodec_dbgfs mtk_jpeg mtk_vcodec_common
cros_ec_sensors mtk_jpeg_enc_hw cros_ec_sensors_core mtk_jpeg_dec_hw
mtk_vpu v4l2_mem2mem videobuf2_v4l2 snd_sof_mt8186
videobuf2_dma_contig snd_sof_xtensa_dsp sha1_ce videobuf2_memops
mtk_adsp_common mtk_scp videobuf2_common snd_sof_of mtk_rpmsg snd_sof
rpmsg_core cros_ec_sensorhub hid_google_hammer hid_vivaldi_common
snd_sof_utils mtk_scp_ipi fuse
CPU: 6 UID: 0 PID: 12 Comm: kworker/u32:0 Tainted: G B
6.15.0-rc1-00001-gfed05d98b726 #628 PREEMPT
06e695da6360d22824958738f9ba1c9b2416be19
Tainted: [B]=BAD_PAGE
Hardware name: Google Ciri sku2 board (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : iommu_get_dma_domain+0x30/0x58
lr : iommu_get_dma_domain+0x30/0x58
sp : ffffffc080186ee0
x29: ffffffc080186ee0 x28: ffffff80c4c59028 x27: 0000000000000000
x26: 0000000000000014 x25: 0068000000000f0b x24: ffffff80c4c5ca28
x23: 00000000008ca000 x22: 1ffffff810030df8 x21: ffffff80c20ea010
x20: ffffff80c20ea010 x19: 0000000000000000 x18: 0000000000000000
x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 0720072007200720
x14: 0720072007200720 x13: ffffff80c0a1bb80 x12: ffffffbae2ca45e9
x11: 1ffffffae2ca45e8 x10: ffffffbae2ca45e8 x9 : dfffffc000000000
x8 : 000000451d35ba18 x7 : ffffffd716522f47 x6 : 0000000000000001
x5 : ffffffd716522f40 x4 : ffffffbae2ca45e9 x3 : ffffffd713717e6c
x2 : 0000000000000001 x1 : ffffff80c0941dc0 x0 : 0000000000000001
Call trace:
iommu_get_dma_domain+0x30/0x58 (P)
__iommu_dma_alloc_noncontiguous+0x34/0x498
iommu_dma_alloc+0x2f0/0x3e0
dma_alloc_attrs+0x1b4/0x3b8
mtk_gem_create+0x124/0x170
mtk_gem_dumb_create+0x84/0x180
drm_mode_create_dumb+0xf8/0x128
drm_client_framebuffer_create+0x11c/0x240
drm_fbdev_dma_driver_fbdev_probe+0x154/0x5a0
__drm_fb_helper_initial_config_and_unlock+0x4f0/0x928
drm_fb_helper_initial_config+0x50/0x68
drm_fbdev_client_hotplug+0xc0/0x120
drm_client_register+0xa0/0x100
drm_fbdev_client_setup+0xc8/0x260
drm_client_setup+0x60/0xd0
mtk_drm_bind+0x4f0/0xaf0
try_to_bring_up_aggregate_device+0x258/0x2f0
__component_add+0x104/0x240
component_add+0x1c/0x38
mtk_disp_rdma_probe+0x180/0x260
platform_probe+0x98/0x128
really_probe+0x118/0x3c0
__driver_probe_device+0xc0/0x198
driver_probe_device+0x64/0x1f8
__device_attach_driver+0xf0/0x1b0
bus_for_each_drv+0xf4/0x178
__device_attach+0x120/0x240
device_initial_probe+0x1c/0x30
bus_probe_device+0xdc/0xe8
deferred_probe_work_func+0xec/0x140
process_one_work+0x428/0xa80
worker_thread+0x2c0/0x538
kthread+0x258/0x380
ret_from_fork+0x10/0x20
Code: 97e8f12c f9422673 91056260 97e8f129 (f940ae60)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: 0x1693600000 from 0xffffffc080000000
PHYS_OFFSET: 0x40000000
CPU features: 0x0e00,000002e0,01202650,8200720b
Memory Limit: none
Rebooting in 30 seconds..
Thanks
ChenYu
> Mem abort info:
> ESR = 0x0000000096000004
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x04: level 0 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101380000
> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 0000000096000004 [#1] SMP
> Modules linked in: btusb btrtl mt6315_regulator btintel mtk_vcodec_dec
> btmtk v4l2_vp9 btbcm mtk_vcodec_enc v4l2_h264 mtk_vcodec_dbgfs
> bluetooth mtk_jpeg ecdh_generic mtk_vcodec_common mtk_jpeg_enc_hw
> mtk_jpeg_dec_hw ecc v4l2_mem2mem rfkill videobuf2_dma_contig
> videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc
> mt6360_charger mcp251xfd it5205 goodix_ts can_dev panfrost
> drm_shmem_helper gpu_sched lvts_thermal mtk_svs mtk_adsp_mailbox
> snd_soc_dmic mtk_scp mtk_rpmsg mtk_scp_ipi mtk_cmdq_mailbox
> mediatek_cpufreq_hw spmi_mtk_pmif fuse dm_mod ip_tables x_tables
> ipv6 mediatek_drm tcpci_mt6360 drm_display_helper drm_client_lib
> tcpci drm_dma_helper drm_kms_helper tcpm drm mtk_mutex mtk_mmsys
> typec rtc_mt6397 mtk_cmdq_helper phy_mtk_pcie pcie_mediatek_gen3
> dwmac_mediatek stmmac_platform stmmac pwm_mtk_disp pcs_xpcs pwm_bl
> backlight
> CPU: 5 UID: 0 PID: 12 Comm: kworker/u32:0 Not tainted 6.14.0-next-20250326 #4 PREEMPT
> Hardware name: MediaTek Genio 1200 EVK-P1V2-EMMC (DT)
> Workqueue: events_unbound deferred_probe_work_func
> pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : mtk_iommu_device_group+0x2c/0xe0
> lr : __iommu_probe_device+0x130/0x490
> sp : ffff8000827b3970
> x29: ffff8000827b3970 x28: ffff0000c0028000 x27: ffff80008164f2b8
> x26: ffff0000cbc47130 x25: ffff80008096a9f0 x24: ffff80008164f5a8
> x23: ffff800082684b60 x22: ffff8000827b3a80 x21: ffff0000cbc47130
> x20: ffffffffffffffed x19: ffff0000c117c010 x18: 0000000000000000
> x17: 6f702d616d642d64 x16: 6574636972747365 x15: 0000000000000002
> x14: 0000000000000000 x13: 0000000000128d55 x12: 6f632d616d642e30
> x11: 0000000000000100 x10: 0000000000000001 x9 : 0000000000000220
> x8 : 0101010101010101 x7 : ffff0000c117c010 x6 : 306c766f2c727461
> x5 : 0000000000000000 x4 : ffff0000c10a2de8 x3 : ffff0000c10a2e70
> x2 : ffff0000c01a4600 x1 : 0000000000000000 x0 : ffff0000c65470c0
> Call trace:
> mtk_iommu_device_group+0x2c/0xe0 (P)
> __iommu_probe_device+0x130/0x490
> probe_iommu_group+0x3c/0x70
> bus_for_each_dev+0x7c/0xe0
> iommu_device_register+0xd8/0x22c
> mtk_iommu_probe+0x270/0x53c
> platform_probe+0x68/0xd8
> really_probe+0xbc/0x2c0
> __driver_probe_device+0x78/0x120
> driver_probe_device+0x3c/0x154
> __device_attach_driver+0xb8/0x140
> bus_for_each_drv+0x88/0xe8
> __device_attach+0xa0/0x190
> device_initial_probe+0x14/0x20
> bus_probe_device+0xb4/0xc0
> deferred_probe_work_func+0x90/0xc8
> process_one_work+0x148/0x284
> worker_thread+0x2cc/0x3cc
> kthread+0x12c/0x204
> ret_from_fork+0x10/0x20
> Code: b4000500 f9401c01 92800254 f9409821 (f9400035)
> ---[ end trace 0000000000000000 ]---
> ```
>
> I've tested this patch on Mediatek Genio 510-EVK and 1200-EVK boards
> with a kernel based on linux-next (tag: next-20250327).
> ---
> Changes in v2:
> - Fix goto label usage in device registration error case
> - Add review and test trailers
> - Link to v1: https://lore.kernel.org/r/20250327-fix-mtk-iommu-error-v1-1-df969158e752@collabora.com
> ---
> drivers/iommu/mtk_iommu.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 034b0e670384a24df10130cbbff95ce8e0bc092d..df98d0c65f5469c6803cd9d151c85ad855558cf5 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -1372,15 +1372,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> platform_set_drvdata(pdev, data);
> mutex_init(&data->mutex);
>
> - ret = iommu_device_sysfs_add(&data->iommu, dev, NULL,
> - "mtk-iommu.%pa", &ioaddr);
> - if (ret)
> - goto out_link_remove;
> -
> - ret = iommu_device_register(&data->iommu, &mtk_iommu_ops, dev);
> - if (ret)
> - goto out_sysfs_remove;
> -
> if (MTK_IOMMU_HAS_FLAG(data->plat_data, SHARE_PGTABLE)) {
> list_add_tail(&data->list, data->plat_data->hw_list);
> data->hw_list = data->plat_data->hw_list;
> @@ -1390,19 +1381,28 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> data->hw_list = &data->hw_list_head;
> }
>
> + ret = iommu_device_sysfs_add(&data->iommu, dev, NULL,
> + "mtk-iommu.%pa", &ioaddr);
> + if (ret)
> + goto out_list_del;
> +
> + ret = iommu_device_register(&data->iommu, &mtk_iommu_ops, dev);
> + if (ret)
> + goto out_sysfs_remove;
> +
> if (MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM)) {
> ret = component_master_add_with_match(dev, &mtk_iommu_com_ops, match);
> if (ret)
> - goto out_list_del;
> + goto out_device_unregister;
> }
> return ret;
>
> -out_list_del:
> - list_del(&data->list);
> +out_device_unregister:
> iommu_device_unregister(&data->iommu);
> out_sysfs_remove:
> iommu_device_sysfs_remove(&data->iommu);
> -out_link_remove:
> +out_list_del:
> + list_del(&data->list);
> if (MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM))
> device_link_remove(data->smicomm_dev, dev);
> out_runtime_disable:
>
> ---
> base-commit: bc960e3e32c8b940c10b14557271355f66ae4db1
> change-id: 20250327-fix-mtk-iommu-error-af6ec347d057
>
> Best regards,
> --
> Louis-Alexis Eyraud <louisalexis.eyraud@...labora.com>
>
Powered by blists - more mailing lists