Message-ID: <a18b7027-5627-7827-ce28-059cc5819d08@rock-chips.com>
Date: Thu, 23 Jun 2016 10:15:04 +0800
From: Shawn Lin <shawn.lin@...k-chips.com>
To: Brian Norris <briannorris@...omium.org>
Cc: shawn.lin@...k-chips.com, Bjorn Helgaas <bhelgaas@...gle.com>,
devicetree@...r.kernel.org, Heiko Stuebner <heiko@...ech.de>,
Arnd Bergmann <arnd@...db.de>,
Marc Zyngier <marc.zyngier@....com>, linux-pci@...r.kernel.org,
Wenrui Li <wenrui.li@...k-chips.com>,
linux-kernel@...r.kernel.org,
Doug Anderson <dianders@...omium.org>,
linux-rockchip@...ts.infradead.org,
Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH v3 2/2] PCI: Rockchip: Add Rockchip PCIe controller
support
On 2016/6/23 9:35, Brian Norris wrote:
> Hi,
>
> On Thu, Jun 23, 2016 at 09:09:46AM +0800, Shawn Lin wrote:
>> On 2016/6/23 8:29, Brian Norris wrote:
>>> On Thu, Jun 16, 2016 at 09:50:35AM +0800, Shawn Lin wrote:
>
> [...]
>
>>>> + /* 500ms timeout value should be enough for gen1/2 training */
>>>> + timeout = jiffies + msecs_to_jiffies(500);
>>>> +
>>>> + err = -ETIMEDOUT;
>>>> + while (time_before(jiffies, timeout)) {
>>>> + status = pcie_read(port, PCIE_CLIENT_BASIC_STATUS1);
>>>> + if (((status >> PCIE_CLIENT_LINK_STATUS_SHIFT) &
>>>> + PCIE_CLIENT_LINK_STATUS_MASK) ==
>>>> + PCIE_CLIENT_LINK_STATUS_UP) {
>>>> + dev_dbg(port->dev, "pcie link training gen1 pass!\n");
>>>> + err = 0;
>>>> + break;
>>>> + }
>>>> + msleep(20);
>>>> + }
>>>
>>> Technically, the above timeout loop is not quite correct. Possible error
>>> case: we can fail with a timeout after 500 ms if the training completes
>>> between the 480-500 ms time window. This can happen because you're
>>> doing:
>>>
>>> (1) read register: if complete, then terminate successfully
>>> (2) delay
>>> (3) check for timeout: if time is up, return error
>>>
>>> You actually need:
>>>
>>> (1) read register: if complete, then terminate successfully
>>> (2) delay
>>> (3) check for timeout: if time is up, repeat (1), and then report error
>>>
>>> You can examine the logic for readx_poll_timeout() in
>>> include/linux/iopoll.h to see an example of a proper timeout loop. You
>>> could even try to use one of the helpers there, except that your
>>> pcie_read() takes 2 args.
>>
>> I see, thanks.
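
A minimal userspace sketch of the two loop shapes discussed above, for illustration only. The simulated clock (fake_now_ms, fake_sleep_ms) and link_is_up() are hypothetical stand-ins for jiffies, msleep() and the pcie_read() status check; this is not the driver code and not the iopoll.h helper itself. It only shows why the status needs one more read once the deadline has passed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated clock and link state (stand-ins for jiffies
 * and the status register; not part of the driver). */
static uint64_t fake_now_ms;
static uint64_t link_up_at_ms;

static void fake_sleep_ms(uint64_t ms) { fake_now_ms += ms; }
static bool link_is_up(void) { return fake_now_ms >= link_up_at_ms; }

/* Shape of the loop in the patch: the deadline is tested right after
 * the sleep, so a link that came up during the last sleep is missed. */
static int poll_naive(uint64_t timeout_ms, uint64_t sleep_ms)
{
	uint64_t deadline = fake_now_ms + timeout_ms;

	while (fake_now_ms < deadline) {
		if (link_is_up())
			return 0;
		fake_sleep_ms(sleep_ms);
	}
	return -ETIMEDOUT;
}

/* Suggested shape: always read the status before looking at the
 * deadline, and read it once more after the deadline has passed. */
static int poll_with_final_check(uint64_t timeout_ms, uint64_t sleep_ms)
{
	uint64_t deadline = fake_now_ms + timeout_ms;

	for (;;) {
		if (link_is_up())
			return 0;
		if (fake_now_ms >= deadline)
			break;
		fake_sleep_ms(sleep_ms);
	}
	/* Final read: in real code the task may have been delayed between
	 * the last status read and the deadline check. */
	return link_is_up() ? 0 : -ETIMEDOUT;
}
```

With a link that comes up at 490 ms, a 500 ms timeout and a 20 ms sleep, poll_naive() reports a timeout while poll_with_final_check() succeeds.
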
>>
>>>
>>>> + if (err) {
>>>> + dev_err(port->dev, "pcie link training gen1 timeout!\n");
>>>> + return err;
>>>> + }
>>>> +
>>>> + /*
>>>> + * Enable retrain for gen2. This should be configured only after
>>>> + * gen1 finished.
>>>> + */
>>>> + status = pcie_read(port,
>>>> + PCIR_RC_CONFIG_LCS + PCIE_RC_CONFIG_BASE);
>>>> + status |= PCIE_CORE_LCSR_RETAIN_LINK;
>>>> + pcie_write(port, status,
>>>> + PCIR_RC_CONFIG_LCS + PCIE_RC_CONFIG_BASE);
>>>
>>> I'm not really an expert on this, but how are you actually "retraining
>>> for gen2"? Is that just the behavior of the core, that it retries at the
>>> higher rate on the 2nd training attempt? I'm just confused, since you
>>> set PCIE_CLIENT_GEN_SEL_2 above, so you've already allowed either gen1
>>> or gen2 negotiation AFAICT, and so seemingly you might already have
>>> negotiated at gen2.
>>
>>
>> Not really. I allow the core to enable gen2, but it needs an extra
>> retrain setting after gen1 finishes. That's not so strange, as it
>> depends on the vendor's design. So I have to say it fits the
>> designer's expectation.
>
> OK.
>
>>>> + err = -ETIMEDOUT;
>>>> +
>>>> + while (time_before(jiffies, timeout)) {
>>>
>>> You never reset 'timeout' when starting this loop. So you only have a
>>> cumulative 500 ms to do both the gen1 and gen2 loops. Is that
>>> intentional? (I feel like this comment was made on an earlier revision
>>> of this driver, though I haven't read everything thoroughly.)
>>
>> Yes. I don't have any docs telling me how long I should wait for
>> gen1/2 training to finish. Maybe someday it will be raised to a larger
>> value if we find a device that actually needs a longer time. The only
>> thing I can say is that this value comes from my tests with many PCIe
>> devices so far.
>>
>>
>> Do you agree?
>
> I'm not suggesting increasing the timeout, exactly; I'm suggesting that
> you should set some minimum timeout for each training loop, instead of
> reusing the same deadline. i.e., something like this before the second
> loop:
>
> /* Reset the deadline for gen2 */
> timeout = jiffies + msecs_to_jiffies(500);
ok, I will update it.
>
> As it stands, if the first loop takes 490 ms, that leaves you only 10 ms
> for the second loop. And I think it'd be confusing if we ever see the
> first loop take a "long" time, and then we time out on the second --
> we'd be blaming the gen2 training for gen1's slowness.
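
A small userspace sketch of the point about deadlines, under stated assumptions: now_ms is a simulated clock, and wait_until() and train_link() are hypothetical helpers invented here, not functions from the driver. It compares one shared 500 ms deadline against a fresh 500 ms deadline for the gen2 phase.

```c
#include <stdint.h>

/* Simulated clock in milliseconds (hypothetical stand-in for jiffies). */
static uint64_t now_ms;

/* Poll in 20 ms steps until the event at done_at_ms has happened or the
 * deadline has passed. Returns 0 on success, -1 on timeout. */
static int wait_until(uint64_t done_at_ms, uint64_t deadline_ms)
{
	for (;;) {
		if (now_ms >= done_at_ms)
			return 0;
		if (now_ms >= deadline_ms)
			return -1;
		now_ms += 20;
	}
}

enum train_result { TRAIN_FAILED = -1, TRAIN_GEN2 = 0, TRAIN_GEN1_ONLY = 1 };

/* gen1_ms: time gen1 training needs; gen2_ms: time the gen2 retrain
 * needs once gen1 is done. fresh_deadline selects whether the second
 * phase gets its own 500 ms budget. */
static enum train_result train_link(uint64_t gen1_ms, uint64_t gen2_ms,
				    int fresh_deadline)
{
	uint64_t deadline, gen2_done;

	now_ms = 0;
	deadline = now_ms + 500;
	if (wait_until(gen1_ms, deadline))
		return TRAIN_FAILED;

	gen2_done = now_ms + gen2_ms;
	if (fresh_deadline)
		deadline = now_ms + 500;	/* reset the deadline for gen2 */
	if (wait_until(gen2_done, deadline))
		return TRAIN_GEN1_ONLY;		/* falling back to gen1 */
	return TRAIN_GEN2;
}
```

With gen1 taking 490 ms and gen2 taking 100 ms, the shared deadline ends at gen1 only, even though gen2 itself was quick; with a fresh deadline the same link reaches gen2.
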
>
>>>> + status = pcie_read(port, PCIE_CORE_CTRL_MGMT_BASE);
>>>> + if (((status >> PCIE_CORE_PL_CONF_SPEED_SHIFT) &
>>>> + PCIE_CORE_PL_CONF_SPEED_MASK) ==
>>>> + PCIE_CORE_PL_CONF_SPEED_50G) {
>>>> + dev_dbg(port->dev, "pcie link training gen2 pass!\n");
>>>> + err = 0;
>>>> + break;
>>>> + }
>>>> + msleep(20);
>>>> + }
>>>
>>> Similar problem with your timeout loop, as mentioned above. Although I'm
>>> confused about what you do in the error case here...
>>>
>>>> + if (err)
>>>> + dev_dbg(port->dev, "pcie link training gen2 timeout, force to gen1!\n");
>>>
>>> ... how are you forcing gen1? I don't see any code that indicates this.
>>> Should you at least be checking that there isn't some kind of training
>>> error, and that we settled back on gen1/2.5G?
>>
>> Yes, when gen2 training fails, my PCIe controller will fall back to gen1
>> automatically.
>
> OK. Well maybe the text should say something like "falling back" instead
> of "force"?
>
Sounds good, will fix. Thanks.
>> If we pass gen1 and then fail to train gen2, a dbg msg here is enough
>> to let us know that we should check the HW signal. Of course we
>> should make sure that this device supports gen2.
>
> OK.
>
> [...]
>
> Brian
>
>
>
--
Best Regards
Shawn Lin