linux-kernel - Re: [PATCH] PCI/ASPM: Make SUNIX serial card acceptable latency unlimited

Message-ID: <20220930151817.GA1973184@bhelgaas>
Date:   Fri, 30 Sep 2022 10:18:17 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Chris Chiu <chris.chiu@...onical.com>
Cc:     bhelgaas@...gle.com, mika.westerberg@...ux.intel.com,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI/ASPM: Make SUNIX serial card acceptable latency
 unlimited

On Fri, Sep 30, 2022 at 05:10:50PM +0800, Chris Chiu wrote:
> The SUNIX serial card advertises an acceptable L0s exit latency of
> < 2us and an acceptable L1 exit latency of < 32us, but its link
> capability shows both exit latencies as unlimited.
> 
> This fails the latency check and prevents ASPM L1 from being
> enabled. The L1 acceptable latency quirk fixes the issue.

Hi Chris, help me understand what's going on here.

The "Endpoint L1 Acceptable Latency" field in Device Capabilities is
described like this (PCIe r6.0, sec 7.5.3.3):

  This field indicates the acceptable latency that an Endpoint can
  withstand due to the transition from L1 state to the L0 state. It is
  essentially an indirect measure of the Endpoint’s internal
  buffering.

  Power management software uses the reported L1 Acceptable Latency
  number to compare against the L1 Exit Latencies reported (see below)
  by all components comprising the data path from this Endpoint to the
  Root Complex Root Port to determine whether ASPM L1 entry can be
  used with no loss of performance.

The "L1 Exit Latency" in Link Capabilities:

  This field indicates the L1 Exit Latency for the given PCI Express
  Link. The value reported indicates the length of time this Port
  requires to complete transition from ASPM L1 to L0.

Apparently the SUNIX device advertises in Dev Cap that it can tolerate
a maximum of 32us of L1 Exit Latency for the entire path from the
SUNIX device to the Root Port, and in Link Cap that the SUNIX device
itself may take more than 64us to exit L1.

If that's accurate, then we should not enable L1 for that device
because using L1 may cause buffer overflows, e.g., dropped characters.
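
To make the comparison concrete, here is a rough userspace sketch of
the check for a single link with no switches in the path.  It is not
the code in drivers/pci/pcie/aspm.c; the function names, the 65us
stand-in for the "> 64us" encoding, and the omission of per-switch
delay are all assumptions made for illustration.  The field encodings
follow the Device Capabilities and Link Capabilities descriptions in
the spec (3-bit fields, 111b meaning "no limit" and "> 64us"
respectively).

```c
#include <stdint.h>

/* Hypothetical helper names; illustration only, not the kernel's code. */
#define L1_NO_LIMIT UINT32_MAX

/*
 * Dev Cap "Endpoint L1 Acceptable Latency" (3-bit encoding):
 * 0..6 mean a maximum of 1, 2, 4, ..., 64 us; 7 means no limit.
 */
static uint32_t l1_acceptable_ns(uint32_t enc)
{
	return enc == 7 ? L1_NO_LIMIT : 1000u << enc;
}

/*
 * Link Cap "L1 Exit Latency" (3-bit encoding):
 * 0..6 are ranges whose upper bounds are 1, 2, 4, ..., 64 us;
 * 7 means more than 64 us.  65 us is an assumed stand-in for that
 * open-ended value.
 */
static uint32_t l1_exit_ns(uint32_t enc)
{
	return enc == 7 ? 65000u : 1000u << enc;
}

/* L1 is usable only if the exit latency fits in what the Endpoint accepts. */
static int l1_allowed(uint32_t acceptable_enc, uint32_t exit_enc)
{
	return l1_exit_ns(exit_enc) <= l1_acceptable_ns(acceptable_enc);
}
```

With the values described above, l1_allowed(5, 7) is 0: the device
accepts at most 32us but reports an exit latency above 64us, so L1
stays disabled.  Overriding the acceptable latency to "no limit", as
the quirk does, corresponds to l1_allowed(7, 7), which is 1.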

Per 03038d84ace7 ("PCI/ASPM: Make Intel DG2 L1 acceptable latency
unlimited"), the existing users of aspm_l1_acceptable_latency() are
graphics devices where I assume there would be little data coming from
the device and buffering would not be an issue.

It doesn't seem plausible to me that a serial device, where there is a
continuous stream of incoming data, could tolerate an *unlimited* exit
latency.

I could certainly believe that Link Cap advertises "> 64us" of L1 Exit
Latency when it really should advertise "< 32us" or something.  But I
don't know how we could be confident in the correct value without
knowledge of the device design.

Bjorn

> Signed-off-by: Chris Chiu <chris.chiu@...onical.com>
> ---
>  drivers/pci/quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4944798e75b5..e1663e43846e 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5955,4 +5955,5 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUNIX, PCI_DEVICE_ID_SUNIX_1999, aspm_l1_acceptable_latency);
>  #endif
> -- 
> 2.25.1
>