Message-ID: <20251216111513.7698-1-zhangz@hygon.cn>
Date: Tue, 16 Dec 2025 19:15:13 +0800
From: Yang Zhang <zhangz@...on.cn>
To: <tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>, <hpa@...or.com>, <bhelgaas@...gle.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<linux-pci@...r.kernel.org>, Yang Zhang <zhangz@...on.cn>
Subject: [PATCH] X86/PCI: Prioritize MMCFG access to hardware registers

As CPU performance demands increase, some internal CPU registers need to be
reconfigured dynamically at run time, for example to adjust memory
controller policies within specific time windows. Such reconfiguration
places high demands on the efficiency of the configuration instructions
themselves: they must retire and take effect as quickly as possible.

However, the current kernel code forces the IO Port method for PCI config
accesses with domain == 0 and offset less than 256. The IO Port method is
largely a legacy mechanism, and it performs worse than the MMCFG method.
We ran comparative tests on AMD and Hygon CPUs. Even ignoring the cost of
the indirect access (IO Port accesses go through 0xCF8 and 0xCFC), we
compared only the following two statements:

1) outl(0x400702, 0xCFC);

2) mmio_config_writel(data_addr, 0x400702);

Both statements access the same register. The results show that the MMCFG
method (400+ cycles per access) is more than twice as fast as the IO Port
method (1000+ cycles per access).
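
For readers less familiar with the two mechanisms, the sketch below shows
how the same (bus, devfn, reg) tuple is encoded for each of them. It is an
illustration only and not part of this patch: the helper names
conf1_address(), ecam_offset() and make_devfn() are hypothetical, the
conf1 layout follows the PCI_CONF1_ADDRESS macro in arch/x86/pci/direct.c
as we understand it, and the ECAM layout is the standard one (bus in bits
27:20, devfn in bits 19:12, register in bits 11:0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack device (5 bits) and function (3 bits). */
static inline uint8_t make_devfn(uint8_t dev, uint8_t fn)
{
	return (uint8_t)(((dev & 0x1f) << 3) | (fn & 0x07));
}

/*
 * IO Port (conf1) method: this value is written to port 0xCF8 first,
 * and only then is the data moved through port 0xCFC. Register bits
 * 11:8 land in address bits 27:24, as in the kernel's conf1 macro.
 */
static inline uint32_t conf1_address(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	assert(reg < 4096);
	return 0x80000000u |
	       ((uint32_t)(reg & 0xF00) << 16) |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)devfn << 8) |
	       (uint32_t)(reg & 0xFC);
}

/*
 * MMCFG (ECAM) method: byte offset from the segment's MMCFG base.
 * A single load or store at base + offset performs the access.
 */
static inline uint32_t ecam_offset(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	assert(reg < 4096);
	return ((uint32_t)bus << 20) | ((uint32_t)devfn << 12) | reg;
}
```

For example, bus 0, device 0x18, function 3, register 0x44 gives a conf1
address of 0x8000C344 and an ECAM offset of 0xC3044. The IO Port path
needs two port accesses per register access, while the MMCFG path needs
one memory access.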

Using PMC/PMU event statistics on the AMD/Hygon microarchitecture, we found
that IO Port accesses cause more stalls in the CPU's internal dispatch
module, mainly because the front end cannot decode the corresponding
uops in time. The main reason for the performance difference is therefore
that the in/out instructions used for IO Port access are microcoded, so
they decode less efficiently than the mov instructions used for MMCFG.

On CPUs that support both MMCFG and IO Port access, a hardware register
that supports only IO Port access could be accessed illegally with this
change. However, we believe that registers accessible through IO Ports
also have corresponding MMCFG addresses. Although we tested several
AMD/Hygon CPUs with this patch and found no problems, we cannot guarantee
that all CPUs are problem-free, especially older ones. To address this
risk, we add a new Kconfig option, PREFER_MMCONFIG, which lets users
choose whether to enable this behavior.
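
The effect of the option on backend selection can be modelled in user
space as follows. This is a simplified sketch of the ordering in
raw_pci_read()/raw_pci_write(), not kernel code: enum backend and
pick_backend() are hypothetical names, and the have_ioport/have_mmcfg
flags stand in for raw_pci_ops and raw_pci_ext_ops being non-NULL.

```c
#include <assert.h>

enum backend { BACKEND_NONE, BACKEND_IOPORT, BACKEND_MMCFG };

/*
 * Return the backend that would serve a config access.
 * prefer_mmconfig models CONFIG_PREFER_MMCONFIG being set.
 */
static inline enum backend pick_backend(int prefer_mmconfig,
					int have_ioport, int have_mmcfg,
					unsigned int domain, int reg)
{
	/* The IO Port method only covers domain 0 and the first 256 bytes. */
	int ioport_ok = have_ioport && domain == 0 && reg < 256;

	assert(reg >= 0);

	if (prefer_mmconfig) {
		/* New order: try MMCFG first, fall back to IO Port. */
		if (have_mmcfg)
			return BACKEND_MMCFG;
		if (ioport_ok)
			return BACKEND_IOPORT;
	} else {
		/* Existing order: IO Port first when it can serve the access. */
		if (ioport_ok)
			return BACKEND_IOPORT;
		if (have_mmcfg)
			return BACKEND_MMCFG;
	}
	return BACKEND_NONE;	/* corresponds to -EINVAL */
}
```

With both backends available, an access to domain 0, register 0x40 is
served by the IO Port method without the option and by MMCFG with it.
Accesses at offset 256 and above, or outside domain 0, go through MMCFG
in both cases, so only the legacy 256-byte range changes behavior.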

Signed-off-by: Yang Zhang <zhangz@...on.cn>
---
 arch/x86/Kconfig      | 15 +++++++++++++++
 arch/x86/pci/common.c | 14 ++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 80527299f..10dfd2b4e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2932,6 +2932,21 @@ config PCI_MMCONFIG
 
 	  Say Y otherwise.
 
+config PREFER_MMCONFIG
+        bool "Prefer to use mmconfig over IO Port"
+        depends on PCI_MMCONFIG
+        help
+          This setting will prioritize the use of mmcfg, which is superior to
+          io port from a performance perspective, mainly for the following reasons:
+          1) io port is an indirect access; 2) io port instructions are decoded
+          by microcode, which is more likely to cause CPU front-end bound compared
+          to mmcfg using mov instructions.
+
+          For CPUs that support both MMCFG and IO Port access methods, if a
+          hardware register only supports IO Port access, this configuration
+          may lead to illegal access. Therefore, users must ensure that the
+          configuration will not cause any exceptions before enabling it.
+
 config PCI_OLPC
 	def_bool y
 	depends on PCI && OLPC && (PCI_GOOLPC || PCI_GOANY)
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index ddb798603..8bde5d1df 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -40,20 +40,34 @@ const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
 						int reg, int len, u32 *val)
 {
+#ifdef CONFIG_PREFER_MMCONFIG
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	if (domain == 0 && reg < 256 && raw_pci_ops)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+#else
 	if (domain == 0 && reg < 256 && raw_pci_ops)
 		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
 	if (raw_pci_ext_ops)
 		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+#endif
 	return -EINVAL;
 }
 
 int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
 						int reg, int len, u32 val)
 {
+#ifdef CONFIG_PREFER_MMCONFIG
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	if (domain == 0 && reg < 256 && raw_pci_ops)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+#else
 	if (domain == 0 && reg < 256 && raw_pci_ops)
 		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
 	if (raw_pci_ext_ops)
 		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+#endif
 	return -EINVAL;
 }
 
-- 
2.34.1


