Open Source and information security mailing list archives
 
Date:   Fri, 4 Aug 2017 20:04:20 +0530
From:   Oza Oza <oza.oza@...adcom.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>, Ray Jui <rjui@...adcom.com>,
        Scott Branden <sbranden@...adcom.com>,
        Jon Mason <jonmason@...adcom.com>,
        BCM Kernel Feedback <bcm-kernel-feedback-list@...adcom.com>,
        Andy Gospodarek <gospo@...adcom.com>,
        linux-pci <linux-pci@...r.kernel.org>,
        linux-kernel@...r.kernel.org,
        Oza Pawandeep <oza.pawandeep@...il.com>
Subject: Re: [PATCH v5 1/2] PCI: iproc: Retry request when CRS returned from EP

On Fri, Aug 4, 2017 at 7:07 PM, Bjorn Helgaas <helgaas@...nel.org> wrote:
> On Fri, Aug 04, 2017 at 11:29:29AM +0530, Oza Oza wrote:
>> On Fri, Aug 4, 2017 at 12:27 AM, Bjorn Helgaas <helgaas@...nel.org> wrote:
>> > On Thu, Aug 03, 2017 at 01:50:29PM +0530, Oza Oza wrote:
>> >> On Thu, Aug 3, 2017 at 2:34 AM, Bjorn Helgaas <helgaas@...nel.org> wrote:
>> >> > On Thu, Jul 06, 2017 at 08:39:41AM +0530, Oza Pawandeep wrote:
>
>> >> This is documented in our PCIe core hardware manual:
>> >> >>
>> >> 4.7.3.3. Retry Status On Configuration Cycle
>> >> Endpoints are allowed to generate retry status on configuration
>> >> cycles.  In this case, the RC needs to re-issue the request.  The IP
>> >> does not handle this because the number of configuration cycles needed
>> >> will probably be less
>> >> than the total number of non-posted operations needed.   When a retry status
>> >> is received on the User RX interface for a configuration request that
>> >> was sent on the User TX interface, it will be indicated with a
>> >> completion with the CMPL_STATUS field set to 2=CRS, and the user will
>> >> have to find the address and data values and send a new transaction on
>> >> the User TX interface.
>> >> When the internal configuration space returns a retry status during a
>> >> configuration cycle (user_cscfg = 1) on the Command/Status interface,
>> >> the pcie_cscrs will assert with the pcie_csack signal to indicate the
>> >> CRS status.
>> >> When the CRS Software Visibility Enable register in the Root Control
>> >> register is enabled, the IP will return the data value to 0x0001 for
>> >> the Vendor ID value and 0xffff  (all 1’s) for the rest of the data in
>> >> the request for reads of offset 0 that return with CRS status.  This
>> >> is true for both the User RX Interface and for the Command/Status
>> >> interface.  When CRS Software Visibility is enabled, the CMPL_STATUS
>> >> field of the completion on the User RX Interface will not be 2=CRS and
>> >> the pcie_cscrs signal will not assert on the Command/Status interface.
>> >> >>
>> >> Broadcom does not sell this PCIe core IP, so the above information
>> >> is not publicly available as a hardware erratum or any similar note.
>> >>
>> >>
>> >> >
>> >> >> As a result, the PCIe RC driver (software) should take care of CRS.
>> >> >> This patch fixes the problem by attempting to read the config space
>> >> >> again when the PCIe core forwards the CRS back to the CPU.
>> >> >> It implements iproc_pcie_config_read(), which is called for Stingray;
>> >> >> otherwise it falls back to the generic PCI APIs.
> ...
>
>> >> >> +static int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
>> >> >> +{
>> >> >> +     int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
>> >> >> +     unsigned int ret;
>> >> >> +
>> >> >> +     /*
>> >> >> +      * As per PCI spec, CRS Software Visibility only
>> >> >> +      * affects config read of the Vendor ID.
>> >> >> +      * For config write or any other config read the Root must
>> >> >> +      * automatically re-issue configuration request again as a
>> >> >> +      * new request. Iproc based PCIe RC (hw) does not retry
>> >> >> +      * request on its own, so handle it here.
>> >> >> +      */
>> >> >> +     do {
>> >> >> +             ret = readl(cfg_data_p);
>> >> >> +             if (ret == CFG_RETRY_STATUS)
>> >> >> +                     udelay(1);
>> >> >> +             else
>> >> >> +                     return PCIBIOS_SUCCESSFUL;
>> >> >> +     } while (timeout--);
>> >> >
>> >> > Shouldn't this check *where* in config space we're reading?
>> >> >
>> >> No, we do not need to, because with SPDK it was reading BAR space
>> >> that points to the MSI-X table.
>> >> What I mean is, it could be anywhere.
>> >
>> > The documentation above says the IP returns data of 0x0001 for *reads
>> > of offset 0*.  Your current code reissues the read if *any* read
>> > anywhere returns data that happens to match CFG_RETRY_STATUS.  That
>> > may be a perfectly valid register value for some device's config
>> > space.  In that case, you do not want to reissue the read and you do
>> > not want to timeout.
>>
>> OK, so the documentation is about the PCIe core IP, not about the
>> host bridge (the PCI-to-AXI bridge).
>>
>> The PCIe core forwards the 0x0001 return code to the host bridge,
>> which converts it into 0xffff0001; that conversion is hard-wired.
>> We have another hardware layer on top of the PCIe core that
>> implements a lot of logic (the host bridge).
>>
>> I agree with you that this could be valid data, but our ASIC team
>> conveyed that they chose the code such that this data is unlikely
>> to be valid.
>
> Hehe.  I think it's more likely that they chose this value because
> it's mentioned in the spec for CRS.  I doubt it has anything to do
> with it being "unlikely".  Or maybe it really is just a coincidence :)
>
> In any event, your documentation says this data is only fabricated for
> reads at offset 0.  So why do you check for it for *all* reads?

Sorry, I missed clarifying your question:

The documentation mentions offset 0, but that is not the case in
practice; it is true for every single configuration access.
So yes, the iproc PCIe core will not re-issue a request for any offset
in configuration space.
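Since the core never reissues a request on its own, the software retry
has to cover every config read. The loop under discussion can be
modelled as the following self-contained sketch (the constants, the
`fake_cfg_read()` stub, and the sample register value 0x000712d8 are
all hypothetical stand-ins for the real readl() on the config-data
window, chosen only so the logic can be exercised outside the kernel):

```c
#include <stdint.h>

/* Hypothetical values mirroring the driver's constants, not the
 * exact iproc definitions. */
#define CFG_RETRY_STATUS          0xffff0001u
#define CFG_RETRY_STATUS_TIMEOUT  100

/* Stub standing in for readl() on the config-data window: returns
 * the CRS sentinel a few times before the real register value. */
static unsigned int remaining_retries = 3;

static uint32_t fake_cfg_read(void)
{
	if (remaining_retries > 0) {
		remaining_retries--;
		return CFG_RETRY_STATUS;
	}
	return 0x000712d8;	/* made-up vendor/device word */
}

/* The retry loop as discussed: reissue the read while the bridge
 * keeps returning the CRS sentinel, up to a bounded timeout.  (In
 * the real driver each iteration also waits with udelay(1).) */
static int cfg_retry_read(uint32_t *val)
{
	int timeout = CFG_RETRY_STATUS_TIMEOUT;

	do {
		*val = fake_cfg_read();
		if (*val != CFG_RETRY_STATUS)
			return 0;	/* got real data */
	} while (timeout--);

	return -1;			/* gave up: treat as error */
}
```

This also makes Bjorn's objection concrete: the loop keys purely on the
data value, so a device register that legitimately contains 0xffff0001
would be retried until the timeout expires.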

>
>> However, I have found one case where this data is valid: a BAR
>> exposing 64-bit I/O memory, which seems legacy and is rare as well.
>>
>> However, in our next chip revision the ASIC will give us a separate
>> CRS register in our host bridge.
>> We will retry on that register instead of matching this code, but
>> that will be a minor code change.
>
> It would be a lot better if your next chip revision handled CRS in
> hardware.  It's not clear this can be done correctly in software and
> even if it can, you're always going to be the exception, which means
> it's likely to continue to be a source of bugs.
>
>> > The PCIe spec says that if CRS software visibility is enabled and we
>> > get a CRS completion for a config read of the Vendor ID at offset 0,
>> > the read should not be retried, and the read should return 0x0001.
>> > For example, pci_bus_read_dev_vendor_id() should see that 0x0001
>> > value.
>> >
>> > That's not what this code does.  If the readl() returns
>> > CFG_RETRY_STATUS, you do the delay and reissue the read.  You never
>> > return 0x0001 to the caller of pci_bus_read_config_word().
>> >
>>
>> As far as I understand, the PCI config access APIs do not check for
>> retry, and their callers, such as __pci_read_base(), do not check
>> for retry either.
>
> Of course not.  PCIe r3.1, sec 2.3.2, says the CRS status is only
> visible to software if (1) CRS software visibility is enabled and
> (2) we're reading the Vendor ID.  In all other cases, the Root
> Complex must reissue the request, which is invisible to software.
>
> Bjorn
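
For reference, the behavior Bjorn describes from PCIe r3.1, sec 2.3.2
can be sketched as a small decision helper (hypothetical function and
names, illustrative only, not the iproc driver's actual code):

```c
#include <stdint.h>

/* Value the spec says to return for a Vendor ID read that completes
 * with CRS while CRS Software Visibility is enabled. */
#define CRS_SV_VENDOR_ID 0x0001u

/* Per PCIe r3.1 sec 2.3.2: a CRS completion is visible to software
 * only for a config read of the Vendor ID (offset 0) with CRS
 * Software Visibility enabled; in every other case the Root Complex
 * must reissue the request, invisibly to software. */
static int handle_crs_completion(int crs_sv_enabled, int where,
				 uint32_t *val)
{
	if (crs_sv_enabled && where == 0) {
		/* Do NOT retry: hand 0x0001 to the caller so code
		 * like pci_bus_read_dev_vendor_id() can poll. */
		*val = CRS_SV_VENDOR_ID;
		return 0;	/* completed with the special value */
	}
	return 1;		/* RC must reissue the request */
}
```

The gap in the patch under review is exactly the first branch: it
retries unconditionally and never returns 0x0001 to the caller.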
