Date:   Tue, 19 Jul 2022 21:11:44 +0200
From:   Dmytro Maluka <dmy@...ihalf.com>
To:     Victor Ding <victording@...gle.com>,
        Bjorn Helgaas <helgaas@...nel.org>
Cc:     Ulf Hansson <ulf.hansson@...aro.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Ben Chuang <ben.chuang@...esyslogic.com.tw>,
        linux-pci@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        linux-mmc@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
        Chris Packham <chris.packham@...iedtelesis.co.nz>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        "Saheed O. Bolarinwa" <refactormyself@...il.com>,
        Vidya Sagar <vidyas@...dia.com>,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Grzegorz Jaszczyk <jaz@...ihalf.com>,
        Tomasz Nowicki <tn@...ihalf.com>,
        Zide Chen <zide.chen@...el.com>
Subject: Re: [PATCH v2] PCI/ASPM: Disable ASPM when save/restore PCI state

On 7/18/22 18:21, Dmytro Maluka wrote:
> While we're at it, I'm also wondering why for the basic PCI config (the
> first 256 bytes) Linux on x86 always uses the legacy 0xCF8/0xCFC method
> instead of MMCFG, even if MMCFG is available. The legacy method is
> inherently non-atomic and does require the global lock, while the MMCFG
> method generally doesn't, so using MMCFG would significantly speed up
> PCI config accesses in high-contention scenarios like the parallel
> suspend/resume.
> 
> I've tried the below change which forces using MMCFG for the first 256
> bytes, and indeed, it makes suspend/resume of individual PCI devices
> with pm_async=1 almost as fast as with pm_async=0. In particular, it
> fixes the problem with slow GL9750 suspend/resume even without Victor's
> patch.
> 
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -40,20 +40,20 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
>  int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 *val)
>  {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> -               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
>         if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> +       if (domain == 0 && reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
>         return -EINVAL;
>  }
>  
>  int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
>                                                 int reg, int len, u32 val)
>  {
> -       if (domain == 0 && reg < 256 && raw_pci_ops)
> -               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
>         if (raw_pci_ext_ops)
>                 return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
> +       if (domain == 0 && reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
>         return -EINVAL;
>  }
>  
> 
> Sounds good if I submit a patch like this? (I'm not suggesting it
> instead of Victor's patch, rather as a separate improvement.)

OK, I found that a similar change was already suggested in the past by
Thomas [1] and was rejected by Linus [2].

Linus' arguments sound reasonable, and I understand that back then the
only known case of PCI config lock contention involved the Intel PMU
counter registers, which are in the extended config space anyway. But
now we know of another case of such contention, which concerns the
basic config space too: suspending or resuming many PCI devices in
parallel during system suspend/resume.

I've checked that on my box using MMCFG instead of Type 1 (i.e. using my
above patch) reduces the total suspend or resume time by 15-20 ms on
average. (I also had Victor's patch applied all the time, i.e. the ASPM
L1 exit latency issue was already resolved, so my test was about the PCI
lock contention in general.) So, not exactly a major improvement, yet
not exactly a negligible one. Maybe it's worth optimizing, maybe not.

Anyway, that's a bit of a digression. Let's focus primarily on Victor's
ASPM patch.

[1]
https://lore.kernel.org/all/tip-b5b0f00c760b6e9673ab79b88ede2f3c7a039f74@git.kernel.org/

[2]
https://lore.kernel.org/all/CA+55aFwi0tkdugfqNEz6M28RXM2jx6WpaDF4nfA=doUVdZgUNQ@mail.gmail.com/

Thanks,
Dmytro
