Message-ID: <87qzsucqzz.ffs@tglx>
Date: Tue, 16 Dec 2025 17:48:16 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Yang Zhang <zhangz@...on.cn>, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, bhelgaas@...gle.com
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
Yang Zhang <zhangz@...on.cn>
Subject: Re: [PATCH] X86/PCI: Prioritize MMCFG access to hardware registers
On Tue, Dec 16 2025 at 18:03, Yang Zhang wrote:
> However, the current kernel code forces the use of the IO Port method for
> PCI accesses with domain=0 and offset less than 256. The IO Port method is
> more like a legacy from historical reasons, and its performance is lower
That code is there for a reason, and if you had taken the time to go back
in the git history and to read the related discussions in the LKML
archive, then you could have provided a proper explanation instead of
the handwaving "like a legacy".
> than that of the MMCFG method. We conducted comparative tests on AMD and
> Hygon CPUs respectively. Even without considering the impact of indirect
> access (IO Ports use 0xCF8 and 0xCFC), we simply compared the performance
> of the following two code snippets:
>
> 1)outl(0x400702,0xCFC);
>
> 2)mmio_config_writel(data_addr,0x400702);
>
> Both snippets access the same register. The results show that the MMCFG
> method (400+ cycles per access) is more than twice as fast as the IO Port
> method (1000+ cycles per access).
That's a known fact and has been discussed many times on LKML. See the
archive for details.
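For readers who do not have the indirection at hand: the "indirect access" in the quoted text refers to the legacy mechanism where an address dword is written to port 0xCF8 and the data is then moved through the 0xCFC..0xCFF window. A minimal sketch of that address encoding follows. The helper names conf1_address() and conf1_data_port() are made up for illustration and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy configuration mechanism: address port and data port. */
#define CONF1_ADDR_PORT 0x0CF8u
#define CONF1_DATA_PORT 0x0CFCu

/*
 * Build the dword that is written to 0xCF8 for a given
 * bus/device/function/register. Bit 31 is the enable bit. The low two
 * register bits are not part of the address; they select the byte
 * within the 0xCFC..0xCFF data window.
 * (Hypothetical helper, limited to the 256 byte legacy range.)
 */
static uint32_t conf1_address(unsigned bus, unsigned dev, unsigned fn,
                              unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
    return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
           (reg & 0xFCu);
}

/* Data port for a register: 0xCFC plus the byte offset in the dword. */
static uint16_t conf1_data_port(unsigned reg)
{
    return (uint16_t)(CONF1_DATA_PORT + (reg & 3u));
}
```

Every access therefore costs two port operations (address, then data), which is the indirection the measurement above deliberately left out.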
> Through PMC/PMU event statistics within the AMD/Hygon microarchitecture,
> we found IO Port access causes more stalls within the CPU's internal
> dispatch module, and these stalls are mainly due to the front-end's
> inability to decode the corresponding uops in a timely manner.
Interesting analysis.
> Therefore the main reason for the performance difference between the
> two access methods is that the in/out instructions corresponding to
> the IO Port access belong to microcode, and therefore their decoding
> efficiency is lower than that of mmcfg.
It's been known forever that inb/outb are significantly slower, not only
due to the microcode magic, but also because IO port instructions are
serializing against IO port instructions. See SDM/APM, it's documented.
> For CPUs that support both MMCFG and IO Port access methods, if a hardware
> register only supports IO Port access, this configuration may lead to
> illegal access. However, we think registers that support I/O Port access
> have corresponding MMCFG addresses.
We think? Either you know or not. By specification the MMIO config space
covers the complete config space from 0 to 4095.
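To illustrate the "0 to 4095" statement: in the memory-mapped layout each function owns a 4096 byte window, so the legacy registers below 256 are simply the first part of that window. A small sketch of the offset calculation, assuming the standard ECAM layout; ecam_offset() is a hypothetical helper name, and the result is relative to whatever base address the firmware reports for the segment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of a config register inside a memory-mapped
 * (ECAM/MMCONFIG) region. Each function gets 4096 bytes, with
 * 8 functions per device and 32 devices per bus.
 * (Hypothetical helper for illustration.)
 */
static uint64_t ecam_offset(unsigned bus, unsigned dev, unsigned fn,
                            unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
    return ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
           ((uint64_t)fn << 12) | reg;
}
```

Whether a particular platform handles accesses to the lower 256 bytes correctly through this window is a separate question from what the layout allows.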
> Even though we tested several AMD/Hygon CPUs with this patch and found
> no problems, we still cannot rule out problems on other CPUs,
> especially older ones.
If you had read the mailing list archives and the git history then
you would know for sure that there are systems out there which have
issues with accessing the lower config space via MMIO.
> To address this risk, we have created a new macro, PREFER_MMCONFIG,
That's not a macro. That's a config switch, no?
> allowing users to choose whether or not to enable this feature.
Also please read and follow
https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog
> +config PREFER_MMCONFIG
> + bool "Prefer to use mmconfig over IO Port"
> + depends on PCI_MMCONFIG
> + help
> + This setting will prioritize the use of mmcfg, which is superior to
> + io port from a performance perspective, mainly for the following reasons:
> + 1) io port is an indirect access; 2) io port instructions are decoded
> + by microcode, which is more likely to cause CPU front-end bound compared
> + to mmcfg using mov instructions.
> +
> + For CPUs that support both MMCFG and IO Port access methods, if a
> + hardware register only supports IO Port access, this configuration
> + may lead to illegal access. Therefore, users must ensure that the
> + configuration will not cause any exceptions before enabling it.
Q: How is that supposed to work for distros?
A: Not at all.
The right thing to do here is:
  1) Have a control variable, which determines the MMIO preference
  2) Make this control default to false (backwards compatible)
  3) Provide a command line option to enable/disable MMIO preference
  4) Optionally allow the setup code to enable MMIO preference based
     on e.g. CPU family/model cut-offs or some other reasonable method
     which prevents a default-on for the reportedly affected systems
     (See LKML).
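The four steps above could be sketched roughly as follows. This is a user-space model only, not kernel code: the option strings "prefer_mmcfg" / "noprefer_mmcfg", the cut-off EXAMPLE_FAMILY_CUTOFF and all function names are invented for illustration, and a real implementation would have to pick its cut-off from the systems reported on LKML.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum cfg_method { CFG_NONE, CFG_IOPORT, CFG_MMCFG };

/* Arbitrary placeholder, NOT a vetted CPU family value. */
#define EXAMPLE_FAMILY_CUTOFF 0x42u

/* 1) + 2) Control variable, default false (backwards compatible). */
static bool prefer_mmcfg = false;
/* Remembers whether the command line decided explicitly. */
static bool prefer_mmcfg_forced = false;

/* 3) Parse one (hypothetical) command line token; false if unknown. */
static bool parse_prefer_mmcfg(const char *arg)
{
    assert(arg != NULL);
    if (strcmp(arg, "prefer_mmcfg") == 0)
        prefer_mmcfg = true;
    else if (strcmp(arg, "noprefer_mmcfg") == 0)
        prefer_mmcfg = false;
    else
        return false;
    prefer_mmcfg_forced = true;
    return true;
}

/* 4) Optional auto-enable; never overrides an explicit user choice. */
static void setup_prefer_mmcfg(unsigned cpu_family)
{
    if (!prefer_mmcfg_forced && cpu_family >= EXAMPLE_FAMILY_CUTOFF)
        prefer_mmcfg = true;
}

/* Select the access method for one config space access. */
static enum cfg_method pick_method(unsigned domain, unsigned reg,
                                   bool have_ioport, bool have_mmcfg)
{
    bool legacy_range = (domain == 0 && reg < 256);

    if (legacy_range && have_ioport) {
        /* Current behaviour unless the preference is switched on. */
        if (prefer_mmcfg && have_mmcfg)
            return CFG_MMCFG;
        return CFG_IOPORT;
    }
    return have_mmcfg ? CFG_MMCFG : CFG_NONE;
}
```

With the default left untouched, pick_method() keeps routing domain 0 accesses below 256 through the IO ports, so a distro kernel behaves exactly as today unless the option or the setup code flips the control.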
Thanks,
tglx