Date:   Fri, 8 Oct 2021 08:58:21 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc:     Heiner Kallweit <hkallweit1@...il.com>,
        nic_swsd <nic_swsd@...ltek.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Anthony Wong <anthony.wong@...onical.com>,
        Linux Netdev List <netdev@...r.kernel.org>,
        Linux PCI <linux-pci@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH net-next v6 3/3] r8169: Implement dynamic ASPM mechanism

On Fri, Oct 08, 2021 at 02:18:55PM +0800, Kai-Heng Feng wrote:
> On Fri, Oct 8, 2021 at 3:11 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > On Fri, Oct 08, 2021 at 12:15:52AM +0800, Kai-Heng Feng wrote:
> > > r8169 NICs on some platforms have abysmal speed when ASPM is enabled.
> > > Same issue can be observed with older vendor drivers.
> > >
> > > The issue is however solved by the latest vendor driver. There's a new
> > > mechanism, which disables r8169's internal ASPM when the NIC traffic has
> > > more than 10 packets per second, and vice versa. The likely reason is
> > > that the buffer on the chip is too small for its ASPM exit latency.
> > > ...

> > I suppose that on the Intel system, if we enable ASPM, the link goes
> > to L1.2, and the NIC immediately receives 1000 packets in that second
> > before we can disable ASPM again, we probably drop a few packets?
> >
> > Whereas on the AMD system, we probably *never* drop any packets even
> > with L1.2 enabled all the time?
> 
> Yes and yes.

The fact that we drop some packets with dynamic ASPM on the Intel
system means we must be giving up some performance.

And I guess that on the AMD system, we should get full performance but
we must be using a little more power (probably unmeasurable) because
ASPM *could* be always enabled but dynamic ASPM disables it some of
the time.
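
To make the trade-off concrete, here is a rough sketch of the kind of
threshold logic the quoted description implies. It is not taken from the
patch or from the vendor driver: the struct, the function name, and the
one-second sampling interval are all assumptions for illustration, and
only the "10 packets per second" figure comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Threshold from the quoted description: more than 10 packets/second. */
#define ASPM_PKT_THRESHOLD 10

/* Hypothetical per-device state; not the layout used by r8169. */
struct dyn_aspm_state {
	uint64_t last_pkt_count;	/* total packets at the previous sample */
	bool aspm_enabled;		/* current (simulated) ASPM setting */
};

/*
 * Hypothetical helper, assumed to run once per second with the running
 * packet total.  It disables ASPM when the last interval carried more
 * than ASPM_PKT_THRESHOLD packets and enables it otherwise.  Returns
 * true if the setting changed.
 */
static bool dyn_aspm_sample(struct dyn_aspm_state *st, uint64_t total_pkts)
{
	uint64_t delta;
	bool want_aspm, changed;

	assert(st);

	delta = total_pkts - st->last_pkt_count;
	want_aspm = delta <= ASPM_PKT_THRESHOLD;
	changed = want_aspm != st->aspm_enabled;

	st->last_pkt_count = total_pkts;
	st->aspm_enabled = want_aspm;
	return changed;
}
```

Note that the decision is always one interval late: a burst that arrives
while ASPM is still enabled is handled with ASPM on until the next sample,
which is the window in which the Intel system would drop packets.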

> > And if we actually knew the root cause and could set the correct LTR
> > values or whatever is wrong on the Intel system, we probably wouldn't
> > need this dynamic scheme?
> 
> Because Realtek already implemented the dynamic ASPM workaround in
> their Windows and Linux drivers, they never bothered to find the root
> cause.
> So we'll never know what really happens here.

Looks like it.  Somebody with a PCIe analyzer could probably make
progress, but I agree, that doesn't seem likely.

Realtek no doubt has the equipment to do this, but apparently they
don't think it's worthwhile.  In their defense, the Linux ASPM code is
pretty impenetrable and there could be a problem there that causes or
contributes to this.

Bjorn
