Message-ID: <20180430211740.GG95643@bhelgaas-glaptop.roam.corp.google.com>
Date:   Mon, 30 Apr 2018 16:17:40 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Sinan Kaya <okaya@...eaurora.org>
Cc:     Paul Menzel <pmenzel+linux-pci@...gen.mpg.de>,
        Dave Young <dyoung@...hat.com>, linux-pci@...r.kernel.org,
        kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Lukas Wunner <lukas@...ner.de>,
        Eric Biederman <ebiederm@...ssion.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Vivek Goyal <vgoyal@...hat.com>
Subject: Re: pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038
 (issued 65284 msec ago)

On Mon, Apr 30, 2018 at 04:48:15PM -0400, Sinan Kaya wrote:
> Bjorn,
> 
> On 4/28/2018 9:03 AM, okaya@...eaurora.org wrote:
> >> Hmm, if it is the remove() method then kexec does not use it.  kexec uses
> >> the shutdown() method instead.  I missed this detail when I replied.
> > 
> > Portdrv hooks up the remove handler to shutdown. That's why remove is getting called.
> 
> What should we do about this?
> 
> Since there is an actual HW errata involved, should we quirk this
> root port and not wait as if remove/shutdown doesn't exist?

I was hoping to avoid a quirk because AFAIK all Intel parts have this
issue so it will be an ongoing maintenance issue.  I tried to avoid
the timeout delays, e.g., with 40b960831cfa ("PCI: pciehp: Compute
timeout from hotplug command start time").
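
To illustrate the idea named in that commit title, here is a minimal sketch. It is not the code from 40b960831cfa: the helper name, the millisecond units, and the timeout value in the usage note are all assumptions. The point is only that the wait is measured from when the command was issued, so time that has already passed is not waited for again.

```c
/*
 * Hypothetical helper (not a kernel function): return how many
 * milliseconds are still worth waiting for a hotplug command,
 * counting the timeout from the moment the command was issued
 * rather than from the moment we start waiting.
 */
static long remaining_wait_ms(long cmd_started_ms, long now_ms, long timeout_ms)
{
	long elapsed_ms = now_ms - cmd_started_ms;

	/* The deadline has already passed: do not wait at all. */
	if (elapsed_ms >= timeout_ms)
		return 0;

	/* Otherwise wait only for what is left of the timeout. */
	return timeout_ms - elapsed_ms;
}
```

With an assumed timeout of 1000 msec, a command issued 65284 msec ago (as in the subject line) leaves nothing to wait for, so the delay is skipped even though the message is still printed.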

But we still see the alarming messages, so we should probably add a
quirk to get rid of those.

But I haven't given up on the idea of getting rid of the
pciehp_remove() path.  I'm not convinced yet that we actually need to
do anything to shut this device down.  I don't like the assumption
that kexec requires this.  A kexec is fundamentally just a branch,
and anything we do before the branch (i.e., in the old kernel), we
should also be able to do after the branch (i.e., in the kexec-ed
kernel).

> Paul,
> You might want to file a bugzilla so that we can keep our debug
> efforts out of this list.
