lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5d8101c2c9f6c4b965641dadbaf837e8@kernel.org>
Date:   Mon, 22 Jun 2020 16:07:52 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Heiko Stübner <heiko@...ech.de>
Cc:     Enric Balletbo i Serra <enric.balletbo@...labora.com>,
        MyungJoo Ham <myungjoo.ham@...sung.com>,
        Kyungmin Park <kyungmin.park@...sung.com>,
        Chanwoo Choi <cw00.choi@...sung.com>, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PM / devfreq: rk3399_dmc: Fix kernel oops when
 rockchip,pmu is absent

Hi Heiko,

On 2020-06-22 14:54, Heiko Stübner wrote:
> Hi Marc,
> 
> Am Montag, 22. Juni 2020, 15:31:55 CEST schrieb Marc Zyngier:
>> On Sat, 13 Jun 2020 11:24:35 +0100
>> Marc Zyngier <maz@...nel.org> wrote:
>> 
>> > Booting a recent kernel on a rk3399-based system (nanopc-t4),
>> > equipped with a recent u-boot and ATF results in the following:
>> >
>> > [    5.607431] Unable to handle kernel NULL pointer dereference at virtual address 00000000000001e4
>> > [    5.608219] Mem abort info:
>> > [    5.608469]   ESR = 0x96000004
>> > [    5.608749]   EC = 0x25: DABT (current EL), IL = 32 bits
>> > [    5.609223]   SET = 0, FnV = 0
>> > [    5.609600]   EA = 0, S1PTW = 0
>> > [    5.609891] Data abort info:
>> > [    5.610149]   ISV = 0, ISS = 0x00000004
>> > [    5.610489]   CM = 0, WnR = 0
>> > [    5.610757] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e62fb000
>> > [    5.611326] [00000000000001e4] pgd=0000000000000000, p4d=0000000000000000
>> > [    5.611931] Internal error: Oops: 96000004 [#1] SMP
>> > [    5.612363] Modules linked in: rockchip_thermal(E+) rk3399_dmc(E+) soundcore(E) dw_wdt(E) rockchip_dfi(E) nvmem_rockchip_efuse(E) pwm_rockchip(E) cfg80211(E+) rockchip_saradc(E) industrialio(E) rfkill(E) cpufreq_dt(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) realtek(E) nvme(E) nvme_core(E) t10_pi(E) xhci_plat_hcd(E) xhci_hcd(E) rtc_rk808(E) rk808_regulator(E) clk_rk808(E) dwc3(E) udc_core(E) roles(E) ulpi(E) rk808(E)
>> fan53555(E) rockchipdrm(E) analogix_dp(E) dw_hdmi(E) cec(E)
>> dw_mipi_dsi(E) fixed(E) dwc3_of_simple(E) phy_rockchip_emmc(E)
>> gpio_keys(E) drm_kms_helper(E) phy_rockchip_inno_usb2(E)
>> ehci_platform(E) dwmac_rk(E) stmmac_platform(E) phy_rockchip_pcie(E)
>> ohci_platform(E) ohci_hcd(E) rockchip_io_domain(E) stmmac(E)
>> phy_rockchip_typec(E) ehci_hcd(E) sdhci_of_arasan(E) mdio_xpcs(E)
>> sdhci_pltfm(E) cqhci(E) drm(E) sdhci(E) phylink(E) of_mdio(E)
>> usbcore(E) i2c_rk3x(E) dw_mmc_rockchip(E) dw_mmc_pltfm(E) dw_mmc(E)
>> fixed_phy(E) libphy(E)
>> > [    5.612454]  pl330(E)
>> > [    5.620255] CPU: 1 PID: 270 Comm: systemd-udevd Tainted: G            E     5.7.0-13692-g83ae758d8b22 #1157
>> > [    5.621110] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2020.07-rc4-00023-g10d4cafe0f 06/10/2020
>> > [    5.621947] pstate: 40000005 (nZcv daif -PAN -UAO BTYPE=--)
>> > [    5.622446] pc : regmap_read+0x1c/0x80
>> > [    5.622787] lr : rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
>> > [    5.623299] sp : ffff8000126cb8a0
>> > [    5.623594] x29: ffff8000126cb8a0 x28: ffff8000126cbdb0
>> > [    5.624063] x27: ffff0000f22dac40 x26: ffff0000f6779800
>> > [    5.624533] x25: ffff0000f6779810 x24: 00000000ffffffea
>> > [    5.625002] x23: 00000000ffffffea x22: ffff0000f65b74c8
>> > [    5.625471] x21: ffff0000f783ca08 x20: ffff0000f65b7480
>> > [    5.625941] x19: 0000000000000000 x18: 0000000000000001
>> > [    5.626410] x17: 0000000000000000 x16: 0000000000000000
>> > [    5.626878] x15: ffff0000f22db138 x14: ffffffffffffffff
>> > [    5.627347] x13: 0000000000000018 x12: ffff80001106a8c7
>> > [    5.627817] x11: 0000000000000003 x10: 0101010101010101
>> > [    5.627861] systemd[1]: Found device SPCC M.2 PCIE SSD 3.
>> > [    5.628286] x9 : ffff800008d7c89c x8 : 7f7f7f7f7f7f7f7f
>> > [    5.629238] x7 : fefefeff646c606d x6 : 1c0e0e0ee3e8e9f0
>> > [    5.629709] x5 : 706968630e0e0e1c x4 : 8080808000000000
>> > [    5.630178] x3 : 937b1b5b1b434b80 x2 : ffff8000126cb944
>> > [    5.630648] x1 : 0000000000000308 x0 : 0000000000000000
>> > [    5.631119] Call trace:
>> > [    5.631346]  regmap_read+0x1c/0x80
>> > [    5.631654]  rk3399_dmcfreq_probe+0x6a4/0x8c0 [rk3399_dmc]
>> > [    5.632142]  platform_drv_probe+0x5c/0xb0
>> > [    5.632500]  really_probe+0xe4/0x448
>> > [    5.632819]  driver_probe_device+0xfc/0x168
>> > [    5.633191]  device_driver_attach+0x7c/0x88
>> > [    5.633567]  __driver_attach+0xac/0x178
>> > [    5.633914]  bus_for_each_dev+0x78/0xc8
>> > [    5.634261]  driver_attach+0x2c/0x38
>> > [    5.634582]  bus_add_driver+0x14c/0x230
>> > [    5.634925]  driver_register+0x6c/0x128
>> > [    5.635269]  __platform_driver_register+0x50/0x60
>> > [    5.635692]  rk3399_dmcfreq_driver_init+0x2c/0x1000 [rk3399_dmc]
>> > [    5.636226]  do_one_initcall+0x50/0x230
>> > [    5.636569]  do_init_module+0x60/0x248
>> > [    5.636902]  load_module+0x21f8/0x28d8
>> > [    5.637237]  __do_sys_finit_module+0xb0/0x118
>> > [    5.637627]  __arm64_sys_finit_module+0x28/0x38
>> > [    5.638031]  el0_svc_common.constprop.0+0x7c/0x1f8
>> > [    5.638456]  do_el0_svc+0x2c/0x98
>> > [    5.638754]  el0_svc+0x18/0x48
>> > [    5.639029]  el0_sync_handler+0x8c/0x2d4
>> > [    5.639378]  el0_sync+0x158/0x180
>> > [    5.639680] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (b941e400)
>> > [    5.640221] ---[ end trace 63675fe5d0021970 ]---
>> >
>> > This turns out to be due to the rk3399-dmc driver looking for
>> > an *undocumented* property (rockchip,pmu), and happily using
>> > a NULL pointer when the property isn't there.
>> >
>> > The very existence of this driver in the kernel is highly doubtful
>> > (I'd expect firmware to deal with this directly), but in the meantime
>> > let's prevent it from oopsing the kernel at probe time if this
>> > property isn't present.
> 
> TF-A is handling the actual frequency scaling, this driver is like a
> glorified wrapper around the TF-A interface ... and the dmc_clock it
> calls is just this firmware-interface (see 
> drivers/clk/rockchip/clk-ddr.c)
> 
> And I guess it also works around some missing Coreboot functionality.
> 
> On u-boot we have ddr-timings in the uboot-devicetree which
> _now_ after so many years finally also gets passed on to TF-A
> but coreboot uses a completely different system, so I guess ChromeOS
> used this to also tell TF-A about the actual ram configuration.

It is all very convoluted... :-/

>> > Signed-off-by: Marc Zyngier <maz@...nel.org>
>> > ---
>> >  drivers/devfreq/rk3399_dmc.c | 17 ++++++++++-------
>> >  1 file changed, 10 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
>> > index 24f04f78285b..bee233a2e0ce 100644
>> > --- a/drivers/devfreq/rk3399_dmc.c
>> > +++ b/drivers/devfreq/rk3399_dmc.c
>> > @@ -371,13 +371,16 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
>> >  	}
>> >
>> >  	node = of_parse_phandle(np, "rockchip,pmu", 0);
>> > -	if (node) {
>> > -		data->regmap_pmu = syscon_node_to_regmap(node);
>> > -		of_node_put(node);
>> > -		if (IS_ERR(data->regmap_pmu)) {
>> > -			ret = PTR_ERR(data->regmap_pmu);
>> > -			goto err_edev;
>> > -		}
>> > +	if (!node) {
>> > +		ret = -ENODEV;
>> > +		goto err_edev;
>> > +	}
>> > +
>> > +	data->regmap_pmu = syscon_node_to_regmap(node);
>> > +	of_node_put(node);
>> > +	if (IS_ERR(data->regmap_pmu)) {
>> > +		ret = PTR_ERR(data->regmap_pmu);
>> > +		goto err_edev;
>> >  	}
>> >
>> >  	regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, &val);
>> 
>> 
>> Any opinion on this patch? I can't believe I'm the only one hitting
>> this.
> 
> Looking at my test-kernel-config, I don't seem to actually build this 
> driver
> though I'm also not using boards productively.
> 
> But looking deeper, I'm either blind or nothing really carries a 
> dt-node
> with a "rockchip,rk3399-dmc" compatible at all, so I'm wondering how 
> you
> could actually hit it yourself :-) .

maz@...e-girl:~$ sudo dtc -I dtb /sys/firmware/fdt 2>/dev/null | grep -A 
5 dmc
	dmc {
		u-boot,dm-pre-reloc;
		compatible = "rockchip,rk3399-dmc";
		devfreq-events = <0xc8>;

[followed by a ton of timings...]

It is definitely coming from u-boot (I don't provide any DTB otherwise,
and you can find the corresponding node and timings in the u-boot tree).

> The change was introduced only last year with
> commit 9173c5ceb035 ("PM / devfreq: rk3399_dmc: Pass ODT and auto
> power down parameters to TF-A.")
> 
> So adding that rockchip,pmu property should be optional to not break
> dt-bindings, so I guess instead of erroring out, the one regmap_read 
> below
> that (only one using the regmap) should actually just check for
> regmap_pmu != NULL?

But what does it mean then? Not finding the regmap means not finding
the DDR type, in which case the best you can do is to hit the default
case which errors out.

So I went ahead and made the whole of 9173c5ceb035 depend on the
regmap being available. I now get:

[    6.618548] rk3399-dmc-freq ffa80000.dmc: Invalid operating-points in 
device tree.
[    6.619256] rk3399-dmc-freq: probe of ffa80000.dmc failed with error 
-22

which suits me perfectly.

I'll post a new revision in a few minutes.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ