lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210820210309.GA3357515@bjorn-Precision-5520>
Date:   Fri, 20 Aug 2021 16:03:09 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     Kai-Heng Feng <kai.heng.feng@...onical.com>, nic_swsd@...ltek.com,
        bhelgaas@...gle.com, davem@...emloft.net, kuba@...nel.org,
        netdev@...r.kernel.org, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v3 1/3] r8169: Implement dynamic ASPM mechanism

On Thu, Aug 19, 2021 at 05:45:22PM +0200, Heiner Kallweit wrote:
> On 19.08.2021 13:42, Bjorn Helgaas wrote:
> > On Thu, Aug 19, 2021 at 01:45:40PM +0800, Kai-Heng Feng wrote:
> >> r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> >> Same issue can be observed with older vendor drivers.
> > 
> > On some platforms but not on others?  Maybe the PCIe topology is a
> > factor?  Do you have bug reports with data, e.g., "lspci -vv" output?
> > 
> >> The issue is however solved by the latest vendor driver. There's a new
> >> mechanism, which disables r8169's internal ASPM when the NIC traffic has
> >> more than 10 packets, and vice versa. 
> > 
> > Presumably there's a time interval related to the 10 packets?  For
> > example, do you want to disable ASPM if 10 packets are received (or
> > sent?) in a certain amount of time?
> > 
> >> The possible reason for this is
> >> likely because the buffer on the chip is too small for its ASPM exit
> >> latency.
> > 
> > Maybe this means the chip advertises incorrect exit latencies?  If so,
> > maybe a quirk could override that?
> > 
> >> Realtek confirmed that all their PCIe LAN NICs, r8106, r8168 and r8125
> >> use dynamic ASPM under Windows. So implement the same mechanism here to
> >> resolve the issue.
> > 
> > What exactly is "dynamic ASPM"?
> > 
> > I see Heiner's comment about this being intended only for a downstream
> > kernel.  But why?
> > 
> We've seen various more or less obvious symptoms caused by the broken
> ASPM support on Realtek network chips. Unfortunately Realtek releases
> neither datasheets nor errata information.
> Last time we attempted to re-enable ASPM numerous problem reports came
> in. These Realtek chips are used on basically every consumer mainboard.
> The proposed workaround has potential side effects: In case of a
> congestion in the chip it may take up to a second until ASPM gets
> disabled, what may affect performance, especially in case of alternating
> traffic patterns. Also we can't expect support from Realtek.
> Having said that my decision was that it's too risky to re-enable ASPM
> in mainline even with this workaround in place. Kai-Heng weights the
> power saving higher and wants to take the risk in his downstream kernel.
> If there are no problems downstream after few months, then this
> workaround may make it to mainline.

Since ASPM apparently works well on some platforms but not others, I'd
suspect some incorrect exit latencies.

Ideally we'd have some launchpad/bugzilla links, and a better
understanding of the problem, and maybe a quirk that makes this work
on all platforms without mucking up the driver with ASPM tweaks.

But I'm a little out of turn here because the only direct impact to
the PCI core is the pcie_aspm_supported() interface.  It *looks* like
these patches don't actually touch the PCIe architected ASPM controls
in Link Control; all I see is mucking with Realtek-specific registers.

I think this is more work than it should be and likely to be not as
reliable as it should be.  But I guess that's up to you guys.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ