lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7nce4if7gowtbvenqhwzw6bazgfcgml6enwufomqxs4uruj3vs@sgagkj3zpx4t>
Date: Sat, 11 Jan 2025 13:22:12 -0600
From: Bjorn Andersson <andersson@...nel.org>
To: Krzysztof Kozlowski <krzk@...nel.org>
Cc: Abel Vesa <abel.vesa@...aro.org>, 
	Konrad Dybcio <konradybcio@...nel.org>, Caleb Connolly <caleb.connolly@...aro.org>, 
	Dmitry Baryshkov <dmitry.baryshkov@...aro.org>, Johan Hovold <johan@...nel.org>, 
	Neil Armstrong <neil.armstrong@...aro.org>, linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	stable@...r.kernel.org
Subject: Re: [PATCH RFC] soc: qcom: pmic_glink: Fix device access from worker
 during suspend

On Sat, Jan 11, 2025 at 04:35:09PM +0100, Krzysztof Kozlowski wrote:
> On 10/01/2025 16:29, Abel Vesa wrote:
> > The pmic_glink_altmode_worker() currently gets scheduled on the system_wq.
> > When the system is suspended (s2idle), the fact that the worker can be
> > scheduled to run while devices are still suspended provesto be a problem
> > when a Type-C retimer, switch or mux that is controlled over a bus like
> > I2C, because the I2C controller is suspended.
> > 
> > This has been proven to be the case on the X Elite boards where such
> > retimers (ParadeTech PS8830) are used in order to handle Type-C
> > orientation and altmode configuration. The following warning is thrown:
> > 
> > [   35.134876] i2c i2c-4: Transfer while suspended
> > [   35.143865] WARNING: CPU: 0 PID: 99 at drivers/i2c/i2c-core.h:56 __i2c_transfer+0xb4/0x57c [i2c_core]
> > [   35.352879] Workqueue: events pmic_glink_altmode_worker [pmic_glink_altmode]
> > [   35.360179] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > [   35.455242] Call trace:
> > [   35.457826]  __i2c_transfer+0xb4/0x57c [i2c_core] (P)
> > [   35.463086]  i2c_transfer+0x98/0xf0 [i2c_core]
> > [   35.467713]  i2c_transfer_buffer_flags+0x54/0x88 [i2c_core]
> > [   35.473502]  regmap_i2c_write+0x20/0x48 [regmap_i2c]
> > [   35.478659]  _regmap_raw_write_impl+0x780/0x944
> > [   35.483401]  _regmap_bus_raw_write+0x60/0x7c
> > [   35.487848]  _regmap_write+0x134/0x184
> > [   35.491773]  regmap_write+0x54/0x78
> > [   35.495418]  ps883x_set+0x58/0xec [ps883x]
> > [   35.499688]  ps883x_sw_set+0x60/0x84 [ps883x]
> > [   35.504223]  typec_switch_set+0x48/0x74 [typec]
> > [   35.508952]  pmic_glink_altmode_worker+0x44/0x1fc [pmic_glink_altmode]
> > [   35.515712]  process_scheduled_works+0x1a0/0x2d0
> > [   35.520525]  worker_thread+0x2a8/0x3c8
> > [   35.524449]  kthread+0xfc/0x184
> > [   35.527749]  ret_from_fork+0x10/0x20
> > 
> > The solution here is to schedule the altmode worker on the system_freezable_wq
> > instead of the system_wq. This will result in the altmode worker not being
> > scheduled to run until the devices are resumed first, which will give the
> > controllers like I2C a chance to resume before the transfer is requested.
> > 
> > Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support")
> > Cc: stable@...r.kernel.org    # 6.3
> > Signed-off-by: Abel Vesa <abel.vesa@...aro.org>
> 
> This is an incomplete fix, I think. You fix one case but several other
> possibilities are still there:
> 

I agree, this whacks only one mole, but it's reasonable to expect that
there are more hidden here.

> 1. Maybe the driver just lacks proper suspend/resume handling?
> I assume all this happens during system suspend, so what certainty you
> have that your second work - pmic_glink_altmode_pdr_notify() - is not
> executed as well?
> 
> 2. Follow up: all other drivers and all other future use cases will be
> affected as well. Basically what this patch is admitting is that driver
> can be executed anytime, even during suspend, so each call of
> pmic_glink_send() has to be audited. Now and in the future, because what
> stops some developer of adding one more path calling pmic_glink_send(),
> which also turns out to be executed during suspend?
> 
> 3. So qcom_battmgr.c is buggy as well?
> 
> 4. ucsi_glink.c? I don't see handling suspend, either...
> 
> Maybe the entire problem is how pmic glink was designed: not as proper
> bus driver which handles both child-parent relationship and system suspend.

The underlying problem is that GLINK register its interrupt as
IRQF_NO_SUSPEND (for historical reasons) and as such incoming messages
will be delivered in late suspend and early resume. In this specific
case, a specific message is handled by pmic_glink_altmode_callback(), by
invoking schedule_work() which in this case happens to schedule
pmic_glink_altmode_worker before we've resumed the I2C controller. 

I presume with your suggestion about a pmic_glink bus driver we'd come
up with some mechanism for pmic_glink to defer these messages until
resume has happened?

As you suggest, I too suspect that we have more of these hidden in other
rpmsg client drivers.


In the discussions leading up to this patch we agreed that a better
solution would be to change GLINK (SMEM) to not deliver messages when
the system is suspended. But as this has an impact on how GLINK may or
may not wake up the system, Abel's fix is a reasonable stop-gap solution
while we work that out.

That said, this backstory, the description of the actual underlying
problem, the planned longevity (shortgevity?) of this fix are missing
from the commit message. As written, we could expect a good Samaritan to
come in and replicate this fix across all those other use cases,
contrary to the agreed plans.


@Abel, can you please make sure that your commit message captures those
aspects as well?

Regards,
Bjorn

> Best regards,
> Krzysztof
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ