Message-ID: <Zad1K5V8mhNiiMWl@cae.in-ulm.de>
Date: Wed, 17 Jan 2024 07:35:23 +0100
From: "Christian A. Ehrhardt" <lk@...e.de>
To: Mario Limonciello <mario.limonciello@....com>
Cc: Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
	linux-usb@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Neil Armstrong <neil.armstrong@...aro.org>,
	Hans de Goede <hdegoede@...hat.com>,
	Saranya Gopal <saranya.gopal@...el.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC] Fix stuck UCSI controller on DELL


Hi Mario,

On Tue, Jan 16, 2024 at 09:00:03PM -0600, Mario Limonciello wrote:
> On 1/15/2024 12:55, Christian A. Ehrhardt wrote:
> > 
> > Hi Heikki,
> > 
> > sorry to bother you again with this but I'm afraid there's
> > a misunderstanding wrt. the nature of the quirk. See below:
> > 
> > On Thu, Jan 04, 2024 at 01:59:02PM +0200, Heikki Krogerus wrote:
> > > Hi Christian,
> > > 
> > > On Wed, Jan 03, 2024 at 11:06:35AM +0100, Christian A. Ehrhardt wrote:
> > > > I have a DELL Latitude 5431 where typec only works somewhat.
> > > > After the first plug/unplug event the PPM seems to be stuck and
> > > > commands end with a timeout (GET_CONNECTOR_STATUS failed (-110)).
> > > > 
> > > > This patch fixes it for me but according to my reading it is in
> > > > violation of the UCSI spec. On the other hand searching through
> > > > the net it appears that many DELL models seem to have timeout problems
> > > > with UCSI.
> > > > 
> > > > Do we want some kind of quirk here? There does not seem to be a quirk
> > > > framework for this part of the code, yet. Or is it ok to just send the
> > > > additional ACK in all cases and hope that the PPM will do the right
> > > > thing?
> > > 
> > > We can use DMI quirks. Something like the attached diff (not tested).
> > > 
> > > thanks,
> > > 
> > > -- 
> > > heikki
> > 
> > > diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c
> > > index 6bbf490ac401..7e8b1fcfa024 100644
> > > --- a/drivers/usb/typec/ucsi/ucsi_acpi.c
> > > +++ b/drivers/usb/typec/ucsi/ucsi_acpi.c
> > > @@ -113,18 +113,44 @@ ucsi_zenbook_read(struct ucsi *ucsi, unsigned int offset, void *val, size_t val_
> > >   	return 0;
> > >   }
> > > -static const struct ucsi_operations ucsi_zenbook_ops = {
> > > -	.read = ucsi_zenbook_read,
> > > -	.sync_write = ucsi_acpi_sync_write,
> > > -	.async_write = ucsi_acpi_async_write
> > > -};
> > > +static int ucsi_dell_sync_write(struct ucsi *ucsi, unsigned int offset,
> > > +				const void *val, size_t val_len)
> > > +{
> > > +	u64 ctrl = *(u64 *)val;
> > > +	int ret;
> > > +
> > > +	ret = ucsi_acpi_sync_write(ucsi, offset, val, val_len);
> > > +	if (ret && (ctrl & (UCSI_ACK_CC_CI | UCSI_ACK_CONNECTOR_CHANGE))) {
> > > +		ctrl = UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE;
> > > +
> > > +		dev_dbg(ucsi->dev->parent, "%s: ACK failed\n", __func__);
> > > +		ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &ctrl, sizeof(ctrl));
> > > +	}
> > 
> > Unfortunately, this has the logic reversed. The quirk (i.e. the
> > additional UCSI_ACK_COMMAND_COMPLETE) is required after a _successful_
> > UCSI_ACK_CONNECTOR_CHANGE. Otherwise, _subsequent_ commands will timeout
> > (usually the next GET_CONNECTOR_STATUS).
> > 
> > This means the quirk must be applied _before_ we detect any failure.
> > Consequently, the quirk has the potential to break working systems.
> > 
> > Sorry, if that wasn't clear from my original mail. Please let me know
> > if this changes how you want the quirks handled.
> > 
> >       Thanks    Christian
> > 
> 
> For the problematic scenario have you tried to play with it a bit to see if
> it's too short of a timeout (raise timeout) or to output the response bits
> to see if anything else surprising is sent?

It is not a problem with the timeout: waiting forever in this case
doesn't help. IMHO this is actually a bug in the PPM, i.e. in Dell's
BIOS.

Sending an ack after the timeout fixes things, though.

> Does it always fail on the same command, or does it happen to a bunch of
> them?

It always fails on the first command after UCSI_ACK_CC_CI for a
connector change. However, there might be no such command if the
next event is a notification.
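
To make the ordering concrete, here is a minimal user-space sketch of
the behaviour as I understand it. It is a toy model, not driver code:
every name and value in it (toy_ppm, the TOY_* flags, the helper
functions) is made up for illustration, and it assumes that acks
themselves always succeed and that only the command following a
connector change ack times out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names and values below are hypothetical; none come from the kernel. */
#define TOY_ACK_CONNECTOR_CHANGE (1u << 0)
#define TOY_ACK_COMMAND_COMPLETE (1u << 1)
#define TOY_ETIMEDOUT            110

/* Toy model of a PPM with the behaviour described above. */
struct toy_ppm {
	bool stuck; /* true: the next command will time out */
};

/* Acks always succeed in this model; a connector-change ack wedges the PPM. */
static int toy_ppm_ack(struct toy_ppm *ppm, uint32_t flags)
{
	assert(ppm != NULL);
	if (flags & TOY_ACK_CONNECTOR_CHANGE)
		ppm->stuck = true;
	else if (flags & TOY_ACK_COMMAND_COMPLETE)
		ppm->stuck = false;
	return 0;
}

/* Any ordinary command times out while the PPM is wedged. */
static int toy_ppm_command(struct toy_ppm *ppm)
{
	assert(ppm != NULL);
	return ppm->stuck ? -TOY_ETIMEDOUT : 0;
}

/* Ack wrapper: the extra ack follows a *successful* connector-change ack. */
static int toy_ack(struct toy_ppm *ppm, uint32_t flags, bool quirk)
{
	int ret = toy_ppm_ack(ppm, flags);

	if (ret == 0 && quirk && (flags & TOY_ACK_CONNECTOR_CHANGE))
		ret = toy_ppm_ack(ppm, TOY_ACK_COMMAND_COMPLETE);
	return ret;
}

/* Result of the first command issued after a connector-change ack. */
static int toy_next_command_result(bool quirk)
{
	struct toy_ppm ppm = { .stuck = false };
	int ret = toy_ack(&ppm, TOY_ACK_CONNECTOR_CHANGE, quirk);

	return ret ? ret : toy_ppm_command(&ppm);
}
```

In this model toy_next_command_result(false) returns -110 and
toy_next_command_result(true) returns 0. The point is only that the
extra ack is keyed on the ack succeeding, not on it failing.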

I did play around with it a bit more and came up with a way to
probe for the issue:

    https://lore.kernel.org/all/20240116224041.220740-1-lk@c--e.de/

regards    Christian


