lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <339ffb6f02457f7a293f19dbfed2aaf724f5c3d9.camel@redhat.com>
Date:   Wed, 19 May 2021 13:50:31 +0200
From:   Benjamin Berg <bberg@...hat.com>
To:     Heikki Krogerus <heikki.krogerus@...ux.intel.com>
Cc:     Bjorn Andersson <bjorn.andersson@...aro.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] usb: typec: ucsi: Clear pending after acking connector
 change

On Wed, 2021-05-19 at 14:33 +0300, Heikki Krogerus wrote:
> On Tue, May 18, 2021 at 08:04:14PM +0200, Benjamin Berg wrote:
> > On Tue, 2021-05-18 at 16:29 +0300, Heikki Krogerus wrote:
> > > On Mon, May 17, 2021 at 02:57:28PM +0200, Benjamin Berg wrote:
> > > > 
> > > > [SNIP]
> > > > Unfortunately, I don't feel it'll work. The problem that I was
> > > > seeing
> > > > looked like a race condition in the PPM itself, where the
> > > > window is
> > > > the
> > > > time between the UCSI_GET_CONNECTOR_STATUS command and the
> > > > subsequent
> > > > ACK.
> > > > For such a firmware level bug in the PPM, we need a way to
> > > > detect
> > > > the
> > > > race condition when it happens (or get a fix for the firmware).
> > > 
> > > OK. Let me know does the patch bring the issue back for you.
> > 
> > So, I just tried the patch, and I can occasionally reproduce the
> > issue
> > where "online" for the ucsi power adapter is stuck at "1" after
> > unplugging with the patch applied.
> 
> Thanks for testing it.
> 
> I'm still not sure that the PPM is the culprit here. I have a feeling
> that the problem you are seeing is caused by the workaround (bad
> workaround) that we have for the issue where the EC firmware does not
> return with the BUSY bit set in the CCI when it should in many cases.
> The UCSI ACPI driver has one minute timeout value for command
> completion because of that, which is way too long. So if the EC
> firmware decides to take its time before acknowledging command, the
> driver is stuck, and we start loosing the events...

Hmm, interesting, yes. If the PPM is sending a second change event
before or after we ACK'ed the first one, and we loose this event, then
that would absolutely explain the issue I am seeing.

In my case, I was able to considerably increase the probability of the
bug by inserting a sleep just before acknowledging the connector
change. Not sure whether this is meaningful, but maybe that tells us
something about who is at fault here.

Benjamin

> Well, I guess
> technically the PPM would be the culprit in the end in any case, but
> I'm just not sure that there is any race like you suspected.
> 
> But this is off topic. I'll send you an RFC proposal what I think we
> could do about that.
> 
> 
> thanks,
> 


Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ