Message-ID: <5B8DA87D05A7694D9FA63FD143655C1B542F6305@hasmsx108.ger.corp.intel.com>
Date:   Fri, 7 Oct 2016 20:10:39 +0000
From:   "Winkler, Tomas" <tomas.winkler@...el.com>
To:     Jason Gunthorpe <jgunthorpe@...idianresearch.com>
CC:     Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>,
        "tpmdd-devel@...ts.sourceforge.net" 
        <tpmdd-devel@...ts.sourceforge.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] tpm: don't destroy chip device prematurely

> On Fri, Oct 07, 2016 at 02:24:59PM +0000, Winkler, Tomas wrote:
> 
> > I'm sorry for the misleading report. After all the doubts I went back
> > to debugging and traces. For some reason moving the device_del really
> > did make the problem go away, but the real problem was an unbalanced
> > runtime_pm PUT/GET in the tpm_crb probe function.
> 
> Oh this is very good news, I'm glad this was resolved in crb!
> 
> Presumably the unbalanced put made the ref count go negative and the
> balanced get caused it to go to zero, so pm locking was basically totally
> broken? That would explain how an idle callback could run concurrently with
> transmit_cmd.

This is not due to locking or refcounting as such, but it is similar. The usage_count went negative, so the idle callback kicked in from the pm work queue and suspended the device.
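
To make the failure mode concrete, here is a toy userspace model of the counter behaviour described above. It is only a sketch and not kernel code: ToyDevice, get, put, idle_check and transmit are hypothetical names, and the rule "suspend when the count is <= 0" is an assumption made for illustration, not a claim about the exact check in the runtime PM core.

```python
class ToyDevice:
    """Toy model of a runtime-PM usage counter (illustrative, not kernel code)."""

    def __init__(self):
        self.usage_count = 0
        self.suspended = False

    def get(self):
        # Stand-in for a runtime-PM "get": take a reference and resume.
        self.usage_count += 1
        self.suspended = False

    def put(self):
        # Stand-in for a runtime-PM "put": drop a reference, no underflow check.
        self.usage_count -= 1

    def idle_check(self):
        # Stand-in for the pm worker: suspend if no reference appears to be held.
        # Assumption: "<= 0" is used here purely for illustration.
        if self.usage_count <= 0:
            self.suspended = True
        return self.suspended


def transmit(dev, worker_runs_mid_command):
    """Return True if the device stayed awake between send and receive."""
    dev.get()
    # ... send command ...
    if worker_runs_mid_command:
        dev.idle_check()
    awake = not dev.suspended
    # ... receive response ...
    dev.put()
    return awake


balanced = ToyDevice()
print(transmit(balanced, True))   # True: count is 1 during the command

broken = ToyDevice()
broken.put()                      # stray put, e.g. from probe: count is now -1
print(transmit(broken, True))     # False: count is only 0 during the command
print(transmit(broken, False))    # True: same bug, worker missed the window
```

In this model the stray put alone is enough to let the worker suspend the device in the middle of a command, and whether that actually happens depends only on when the worker runs.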

> 
> Though it is a bit of a mystery why device_del had any impact. I'm still
> very unclear exactly how the child device affects the parent - and that
> seems like pretty important information going forward.

Yes, there is some dependency: if device_del is not called, the idle callback doesn't kick in between send and receive, and that was misleading. This could be due to the scheduling of the pm worker, but I'm not sure. In any case we hit the issue even w/o device_del if the device is exercised enough. I will dig into that later.

Thanks
Tomas


