[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220930212909.GA1923173@bhelgaas>
Date: Fri, 30 Sep 2022 16:29:09 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Krishna chaitanya chundru <quic_krichai@...cinc.com>
Cc: linux-pci@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-kernel@...r.kernel.org, mka@...omium.org,
quic_vbadigan@...cinc.com, quic_hemantk@...cinc.com,
quic_nitegupt@...cinc.com, quic_skananth@...cinc.com,
quic_ramkri@...cinc.com, manivannan.sadhasivam@...aro.org,
swboyd@...omium.org, dmitry.baryshkov@...aro.org,
Prasad Malisetty <quic_pmaliset@...cinc.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Saheed O. Bolarinwa" <refactormyself@...il.com>,
Vidya Sagar <vidyas@...dia.com>,
Krzysztof WilczyĆski <kw@...ux.com>,
Kai-Heng Feng <kai.heng.feng@...onical.com>
Subject: Re: [PATCH v7] PCI/ASPM: Update LTR threshold based upon reported
max latencies
On Fri, Sep 16, 2022 at 01:38:37PM +0530, Krishna chaitanya chundru wrote:
> In ASPM driver, LTR threshold scale and value are updated based on
> tcommon_mode and t_poweron values. In Kioxia NVMe L1.2 is failing due to
> LTR threshold scale and value are greater values than max snoop/non-snoop
> value.
>
> Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when
> reported snoop/no-snoop values is greater than or equal to
> LTR_L1.2_THRESHOLD value.
I find LTR configuration pretty much impenetrable, but this doesn't
seem right to me. If I understand correctly, LTR messages are a way
for endpoints to report their latency requirements, i.e., sort of a
dynamic version of "Endpoint L1 Acceptable Latency".
As you said, a comparison between the most recent LTR value and
LTR_L1.2_THESHOLD determines whether the link goes to L1.1 or L1.2.
So I assume LTR_L1.2_THESHOLD must be the minimum time required to
transition the link from L0 to L1.2 and back to L0, which includes
T_POWER_OFF, T_L1.2, T_POWER_ON, and T_COMMONMODE (sec 5.5.3.3.1,
5.5.5).
If the device can tolerate at least that much time, i.e., if the
LTR value >= LTR_L1.2_THESHOLD, the link should go to L1.2.
I'm not a hardware person, but I don't see how LTR_L1.2_THESHOLD can
*depend* on the LTR max latency values. The LTR max latencies depend
on the endpoint. I think LTR_L1.2_THESHOLD depends on the circuit
design of both ends of the link.
More comments below, but they're only pertinent if we can figure out
that this is the correct approach.
Bjorn
> Signed-off-by: Prasad Malisetty <quic_pmaliset@...cinc.com>
> Signed-off-by: Krishna chaitanya chundru <quic_krichai@...cinc.com>
> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
> ---
>
> I am taking this patch forward as prasad is no more working with our org.
> changes since v6:
> - Rebasing with pci/next.
It's best if you base patches on my "main" branch (not "next"), which
is typically -rc1, unless they depend on something that's already been
merged.
In the patch below, rewrap so everything still fits in 80 columns like
the rest of the file.
Update citations to current spec version (r6.0). It looks like the
section numbers are the same.
> changes since v5:
> - no changes, just reposting as standalone patch instead of reply to
> previous patch.
> Changes since v4:
> - Replaced conditional statements with min and max.
> changes since v3:
> - Changed the logic to include this condition "snoop/nosnoop
> latencies are not equal to zero and lower than LTR_L1.2_THRESHOLD"
> Changes since v2:
> - Replaced LTRME logic with max snoop/no-snoop latencies check.
> Changes since v1:
> - Added missing variable declaration in v1 patch
> ---
> drivers/pci/pcie/aspm.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 928bf64..2bb8470 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -486,13 +486,35 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link,
> {
> struct pci_dev *child = link->downstream, *parent = link->pdev;
> u32 val1, val2, scale1, scale2;
> + u32 max_val, max_scale, max_snp_scale, max_snp_val, max_nsnp_scale, max_nsnp_val;
> u32 t_common_mode, t_power_on, l1_2_threshold, scale, value;
> u32 ctl1 = 0, ctl2 = 0;
> u32 pctl1, pctl2, cctl1, cctl2;
> + u16 ltr;
> + u16 max_snoop_lat, max_nosnoop_lat;
>
> if (!(link->aspm_support & ASPM_STATE_L1_2_MASK))
> return;
>
> + ltr = pci_find_ext_capability(child, PCI_EXT_CAP_ID_LTR);
> + if (!ltr)
> + return;
> +
> + pci_read_config_word(child, ltr + PCI_LTR_MAX_SNOOP_LAT, &max_snoop_lat);
> + pci_read_config_word(child, ltr + PCI_LTR_MAX_NOSNOOP_LAT, &max_nosnoop_lat);
> +
> + max_snp_scale = (max_snoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT;
> + max_snp_val = max_snoop_lat & PCI_LTR_VALUE_MASK;
> +
> + max_nsnp_scale = (max_nosnoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT;
> + max_nsnp_val = max_nosnoop_lat & PCI_LTR_VALUE_MASK;
> +
> + /* choose the greater max scale value between snoop and no snoop value*/
Add space before */
Capitalize comments to match style of file.
> + max_scale = max(max_snp_scale, max_nsnp_scale);
> +
> + /* choose the greater max value between snoop and no snoop scales */
> + max_val = max(max_snp_val, max_nsnp_val);
> +
> /* Choose the greater of the two Port Common_Mode_Restore_Times */
> val1 = (parent_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8;
> val2 = (child_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8;
> @@ -525,6 +547,14 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link,
> */
> l1_2_threshold = 2 + 4 + t_common_mode + t_power_on;
> encode_l12_threshold(l1_2_threshold, &scale, &value);
> +
> + /*
> + * Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when reported
> + * snoop/no-snoop values are greater than or equal to LTR_L1.2_THRESHOLD value.
> + */
> + scale = min(scale, max_scale);
> + value = min(value, max_val);
I don't think this computes the right thing. If we have this:
scale = 001b (x 32ns)
value = 1024
max_scale = 010b (x 1024ns)
max_value = 1
Then the latencies are both 1024ns, so I would expect a min() of
1024ns. But computing min() separately for the scale and value will
give "scale = 001b" (x 32ns) and "value = 1", for a latency of 32ns.
I think you would need to compare the values in ns, i.e.,
"l1_2_threshold".
I assume the max() computations above have a similar issue, but I
didn't work it out.
But I'm not convinced that this is the right approach to begin with.
> ctl1 |= t_common_mode << 8 | scale << 29 | value << 16;
>
> /* Some broken devices only support dword access to L1 SS */
> --
> 2.7.4
>
Powered by blists - more mailing lists