Message-ID: <20231115202506.GB19552@noisy.programming.kicks-ass.net>
Date: Wed, 15 Nov 2023 21:25:06 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc: LKML <linux-kernel@...r.kernel.org>, X86 Kernel <x86@...nel.org>,
iommu@...ts.linux.dev, Thomas Gleixner <tglx@...utronix.de>,
Lu Baolu <baolu.lu@...ux.intel.com>, kvm@...r.kernel.org,
Dave Hansen <dave.hansen@...el.com>,
Joerg Roedel <joro@...tes.org>,
"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>,
Ingo Molnar <mingo@...hat.com>,
Raj Ashok <ashok.raj@...el.com>,
"Tian, Kevin" <kevin.tian@...el.com>, maz@...nel.org,
seanjc@...gle.com, Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH RFC 09/13] x86/irq: Install posted MSI notification handler
On Wed, Nov 15, 2023 at 12:04:01PM -0800, Jacob Pan wrote:
> we are interleaving cacheline read and xchg. So made it to
Hmm, I wasn't expecting that to be a problem, but sure.
> 	for (i = 0; i < 4; i++) {
> 		pir_copy[i] = pid->pir_l[i];
> 	}
>
> 	for (i = 0; i < 4; i++) {
> 		if (pir_copy[i]) {
> 			pir_copy[i] = arch_xchg(&pid->pir_l[i], 0);
> 			handled = true;
> 		}
> 	}
>
> With the DSA MEMFILL test using just one queue and one MSI, we are saving
> 3 xchg per loop. Here is the performance comparison in IRQ rate:
>
> Original RFC:            9.29 m/sec
> Optimized in your email: 8.82 m/sec
> Tweaked above:           9.54 m/sec
>
> I need to test with more MSI vectors spreading out to all 4 u64. I suspect
> the benefit will decrease since we need to do both read and xchg for
> non-zero entries.
Ah, but performance was not the reason I suggested this. Code
compactness and clarity were.
Possibly using fewer xchg is just a bonus :-)
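
For readers who want to experiment with the two-pass scheme quoted above outside the kernel, here is a minimal userspace sketch. It is not the patch code: it assumes C11 atomics, with atomic_exchange() standing in for arch_xchg(), and the names pi_desc_sketch, harvest_pir and nr_xchg are hypothetical, introduced only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the posted-interrupt descriptor. Only the
 * four 64-bit PIR words matter for this sketch.
 */
struct pi_desc_sketch {
	_Atomic uint64_t pir_l[4];
};

/*
 * Harvest pending bits in two passes, as in the quoted loop.
 * out[] receives the bits taken from each word, *nr_xchg counts how
 * many atomic exchanges were issued. Returns true if any word was
 * non-zero.
 */
static bool harvest_pir(struct pi_desc_sketch *pid, uint64_t out[4],
			int *nr_xchg)
{
	bool handled = false;
	int i;

	*nr_xchg = 0;

	/* Pass 1: plain loads of all four words, no exchanges yet. */
	for (i = 0; i < 4; i++)
		out[i] = atomic_load_explicit(&pid->pir_l[i],
					      memory_order_relaxed);

	/* Pass 2: exchange only the words that looked non-zero. */
	for (i = 0; i < 4; i++) {
		if (out[i]) {
			/* Re-read atomically so bits set since pass 1 are not lost. */
			out[i] = atomic_exchange(&pid->pir_l[i], 0);
			(*nr_xchg)++;
			handled = true;
		}
	}

	return handled;
}
```

With a single bit pending in one word, this issues one exchange instead of four, which matches the "saving 3 xchg per loop" case in the quoted mail. When all four words are non-zero, every word is both loaded and exchanged, so the saving disappears.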