Message-ID: <b25e6891-89cc-4ead-88b3-c1c548615daa@linux.intel.com>
Date: Thu, 3 Jul 2025 10:03:27 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Ioanna Alifieraki <ioanna-maria.alifieraki@...onical.com>
Cc: kevin.tian@...el.com, jroedel@...e.de, robin.murphy@....com,
will@...nel.org, joro@...tes.org, dwmw2@...radead.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
regressions@...ts.linux.dev, stable@...r.kernel.org
Subject: Re: [REGRESSION][BISECTED] Performance Regression in IOMMU/VT-d Since
Kernel 6.10
On 7/3/25 00:45, Ioanna Alifieraki wrote:
> On Wed, Jul 2, 2025 at 12:00 PM Baolu Lu <baolu.lu@...ux.intel.com> wrote:
>> On 7/2/2025 1:14 PM, Baolu Lu wrote:
>>> On 7/2/25 01:11, Ioanna Alifieraki wrote:
>>>> #regzbot introduced: 129dab6e1286
>>>>
>>>> Hello everyone,
>>>>
>>>> We've identified a performance regression that starts with Linux
>>>> kernel 6.10 and persists through 6.16 (tested at commit e540341508ce).
>>>> Bisection pointed to commit:
>>>> 129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in
>>>> iotlb_sync_map").
>>>>
>>>> The issue occurs when running fio against two NVMe devices located
>>>> under the same PCIe bridge (dual-port NVMe configuration). Performance
>>>> drops compared to configurations where the devices are on different
>>>> bridges.
>>>>
>>>> Observed Performance:
>>>> - Before the commit: ~6150 MiB/s, regardless of NVMe device placement.
>>>> - After the commit:
>>>> -- Same PCIe bridge: ~4985 MiB/s
>>>> -- Different PCIe bridges: ~6150 MiB/s
>>>>
>>>>
>>>> Currently we can only reproduce the issue on a Z3 metal instance on
>>>> GCP. I suspect the issue is reproducible on any machine with a
>>>> dual-port NVMe.
>>>> At [1] there's a more detailed description of the issue and details
>>>> of the reproducer.
>>> This test was run on bare-metal hardware rather than in a
>>> virtualization guest, right? If that's the case,
>>> cache_tag_flush_range_np() is almost a no-op.
>>>
>>> Can you please show me the capability register of the IOMMU by:
>>>
>>> # cat /sys/bus/pci/devices/[pci_dev_name]/iommu/intel-iommu/cap
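A minimal userspace sketch for decoding the value printed by that sysfs
file, assuming the bit layout of the VT-d capability register (CM at bit 7,
RWBF at bit 4) from the Intel VT-d specification; it is illustrative only
and not part of the patch attached to this mail:

/*
 * Sketch, not from the thread: decode the Caching Mode (CM, bit 7) and
 * Required Write-Buffer Flushing (RWBF, bit 4) fields of the VT-d
 * capability register exposed through the sysfs file quoted above.
 */
#include <inttypes.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	uint64_t cap;
	FILE *f;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <intel-iommu cap file>\n", argv[0]);
		return 1;
	}

	f = fopen(argv[1], "r");
	if (!f || fscanf(f, "%" SCNx64, &cap) != 1) {
		fprintf(stderr, "cannot read capability value\n");
		return 1;
	}
	fclose(f);

	printf("cap  = %016" PRIx64 "\n", cap);
	printf("CM   = %u (bit 7)\n", (unsigned)((cap >> 7) & 1));
	printf("RWBF = %u (bit 4)\n", (unsigned)((cap >> 4) & 1));
	return 0;
}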
>> Also, can you please try whether the below changes make any difference?
>> I've also attached a patch file to this email so you can apply the
>> change more easily.
> Thanks for the patch Baolu. I've tested it and can confirm we get ~6150 MiB/s
> for NVMe pairs both under the same bridge and under different bridges.
> The output of
> cat /sys/bus/pci/devices/[pci_dev_name]/iommu/intel-iommu/cap
> is 19ed008c40780c66
> for all NVMes.
> I got confirmation there's no virtualization happening on this instance
> at all.
> FWIW, I had run perf when initially investigating the issue and it was
> showing quite some time spent in cache_tag_flush_range_np().
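For illustration, and assuming the usual VT-d semantics that CM == 0 means
non-present translations are not cached and RWBF == 0 means no write-buffer
flush is required, a flush at map time ("non-present to present" transition)
has nothing to do when both bits are clear. A minimal sketch of that check,
applied to the capability value reported above; this is not the formal fix
mentioned below:

/*
 * Illustrative sketch only (not the patch attached earlier in the thread):
 * when CM == 0 the IOMMU does not cache non-present translations, and when
 * RWBF == 0 no write-buffer flush is needed either, so a map-time flush can
 * be skipped up front.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool map_time_flush_needed(uint64_t cap)
{
	bool caching_mode = (cap >> 7) & 1;	/* CM, VT-d cap bit 7 */
	bool rwbf = (cap >> 4) & 1;		/* RWBF, VT-d cap bit 4 */

	return caching_mode || rwbf;
}

int main(void)
{
	uint64_t cap = 0x19ed008c40780c66ULL;	/* value reported above */

	printf("map-time flush needed: %s\n",
	       map_time_flush_needed(cap) ? "yes" : "no");
	return 0;
}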
Okay, I will post a formal fix patch for this. Thank you!
Thanks,
baolu