[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210916143257.GA1608462@bjorn-Precision-5520>
Date: Thu, 16 Sep 2021 09:32:57 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Pali Rohár <pali@...nel.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marek Behún <kabel@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: aardvark: Implement re-issuing config requests on
CRS response
On Wed, Sep 15, 2021 at 12:55:53PM +0200, Pali Rohár wrote:
> On Tuesday 14 September 2021 15:55:26 Bjorn Helgaas wrote:
> > It is illegal for a device to return CRS after it has returned a
> > successful completion unless an intervening reset has occurred, so
> > drivers and other code should never see it.
> >
> > > And issue is there also with write requests. Is somebody checking return
> > > value of pci_bus_write_config function?
> >
> > Similar case here. The enumeration and wait-after-reset paths always
> > do *reads* until we get a successful completion, so I don't think we
> > ever issue a write that can get CRS.
>
> Yes, in normal conditions we should not see it.
>
> But for testing purposes (that emulated bridge works fine) I'm using
> setpci for changing some configuration.
>
> And via setpci it is possible to turn off CRSSVE bit in which case then
> Root Complex should re-issue request again.
>
> I'm not sure how "legal" it is if userspace / setpci changes some of
> these bits. At least on a hardware with a real Root Port device it
> should be fully transparent. As hardware handles this re-issue and
> kernel then would see (reissued) response.
If setpci changes bits like these, all bets are off. We can't tell
what happened, so we can't rely on any configuration Linux did. I
think we really should taint the kernel when this happens.
> Test case: Initialize device, then unbind it from sysfs, reset it (hot
> reset or warm reset) and then rescan / reinit it again. Here device is
> permitted to send CRS response.
>
> We know that more PCIe cards are buggy and sometimes firmware on cards
> crashes or resets card logic. Which may put card into initialization
> state when it is again permitted to send CRS response.
Yep. That's a buggy device and normally we would work around it with
a quirk. This particular kind of bug would be hard to work around,
but a host bridge driver doesn't seem like the right place to do it
because we'd have to do it in *every* such driver.
Bjorn
Powered by blists - more mailing lists