[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cc3df866-9144-42f0-a24c-fbdcedd48315@amd.com>
Date: Tue, 10 Jun 2025 12:02:08 +0530
From: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, bp@...en8.de, tglx@...utronix.de,
mingo@...hat.com, dave.hansen@...ux.intel.com, Thomas.Lendacky@....com,
nikunj@....com, Santosh.Shukla@....com, Vasant.Hegde@....com,
Suravee.Suthikulpanit@....com, David.Kaplan@....com, x86@...nel.org,
hpa@...or.com, peterz@...radead.org, pbonzini@...hat.com,
kvm@...r.kernel.org, kirill.shutemov@...ux.intel.com, huibo.wang@....com,
naveen.rao@....com, francescolavra.fl@...il.com
Subject: Re: [PATCH v5 01/20] KVM: x86: Move find_highest_vector() to a common
header
On 4/29/2025 8:12 PM, Sean Christopherson wrote:
> Please slot the below in. And if there is any more code in this series that is
> duplicating existing functionality, try to figure out a clean way to share code
> instead of open coding yet another version.
>
> --
> From: Sean Christopherson <seanjc@...gle.com>
> Date: Tue, 29 Apr 2025 07:30:47 -0700
> Subject: [PATCH] x86/apic: KVM: Deduplicate APIC vector => register+bit math
>
> Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
> open coded equivalent logic in anticipation of the kernel gaining more
> usage of vector => reg+bit lookups.
>
> Use lapic_vector_set_in_irr()'s math as using divides for both the bit
> number and register offset makes it easier to connect the dots, and for at
> least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
> better code with gcc-14 (shaves a whole 3 bytes from the code stream):
>
> ((v) >> 5) << 4:
> c1 ef 05 shr $0x5,%edi
> c1 e7 04 shl $0x4,%edi
> 81 c7 00 02 00 00 add $0x200,%edi
>
> (v) / 32 * 0x10:
> c1 ef 05 shr $0x5,%edi
> 83 c7 20 add $0x20,%edi
> c1 e7 04 shl $0x4,%edi
>
> Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
> in KVM, and because the shorter names yield more readable code overall in
> KVM.
>
> No functional change intended (clang-19 and gcc-14 generate bit-for-bit
> identical code for all of kvm.ko).
>
With this change, I am observing difference in generated assembly for VEC_POS
and REG_POS, as KVM code passes vector param with type "int" to these macros.
Type casting "v" param of APIC_VECTOR_TO_BIT_NUMBER and APIC_VECTOR_TO_REG_OFFSET
to "unsigned int" in the macro definition restores the original assembly. Can
you have a look at this once? Below is the updated patch for this. Can you please
share your feedback on this?
--
x86/apic: KVM: Deduplicate APIC vector => register+bit math
Consolidate KVM's {REG,VEC}_POS() macros and lapic_vector_set_in_irr()'s
open coded equivalent logic in anticipation of the kernel gaining more
usage of vector => reg+bit lookups.
Use lapic_vector_set_in_irr()'s math as using divides for both the bit
number and register offset makes it easier to connect the dots, and for at
least one user, fixup_irqs(), "/ 32 * 0x10" generates ever so slightly
better code with gcc-14 (shaves a whole 3 bytes from the code stream):
((v) >> 5) << 4:
c1 ef 05 shr $0x5,%edi
c1 e7 04 shl $0x4,%edi
81 c7 00 02 00 00 add $0x200,%edi
(v) / 32 * 0x10:
c1 ef 05 shr $0x5,%edi
83 c7 20 add $0x20,%edi
c1 e7 04 shl $0x4,%edi
Keep KVM's tersely named macros as "wrappers" to avoid unnecessary churn
in KVM, and because the shorter names yield more readable code overall in
KVM.
The new macros type cast the vector parameter to "unsigned int". This is
required from better code generation for cases where an "int" is passed
to these macros in kvm code.
int v;
((v) >> 5) << 4:
c1 f8 05 sar $0x5,%eax
c1 e0 04 shl $0x4,%eax
((v) / 32 * 0x10):
85 ff test %edi,%edi
8d 47 1f lea 0x1f(%rdi),%eax
0f 49 c7 cmovns %edi,%eax
c1 f8 05 sar $0x5,%eax
c1 e0 04 shl $0x4,%eax
((unsigned int)(v) / 32 * 0x10):
c1 f8 05 sar $0x5,%eax
c1 e0 04 shl $0x4,%eax
(v) & (32 - 1):
89 f8 mov %edi,%eax
83 e0 1f and $0x1f,%eax
(v) % 32
89 fa mov %edi,%edx
c1 fa 1f sar $0x1f,%edx
c1 ea 1b shr $0x1b,%edx
8d 04 17 lea (%rdi,%rdx,1),%eax
83 e0 1f and $0x1f,%eax
29 d0 sub %edx,%eax
(unsigned int)(v) % 32:
89 f8 mov %edi,%eax
83 e0 1f and $0x1f,%eax
Overall kvm.ko size is impacted if "unsigned int" is not used.
Bin Orig New (w/o unsigned int) New (w/ unsigned int)
lapic.o 28580 28772 28580
kvm.o 670810 671002 670810
kvm.ko 708079 708271 708079
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@...gle.com>
[Neeraj: Type cast vec macro param to "unsigned int", provide data
in commit log on "unsigned int" requirement]
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
---
arch/x86/include/asm/apic.h | 7 +++++--
arch/x86/kvm/lapic.h | 4 ++--
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 23d86c9750b9..c84d4e86fe4e 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -488,11 +488,14 @@ static inline void apic_setup_apic_calls(void) { }
extern void apic_ack_irq(struct irq_data *data);
+#define APIC_VECTOR_TO_BIT_NUMBER(v) ((unsigned int)(v) % 32)
+#define APIC_VECTOR_TO_REG_OFFSET(v) ((unsigned int)(v) / 32 * 0x10)
+
static inline bool lapic_vector_set_in_irr(unsigned int vector)
{
- u32 irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
+ u32 irr = apic_read(APIC_IRR + APIC_VECTOR_TO_REG_OFFSET(vector));
- return !!(irr & (1U << (vector % 32)));
+ return !!(irr & (1U << APIC_VECTOR_TO_BIT_NUMBER(vector)));
}
static inline bool is_vector_pending(unsigned int vector)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 1638a3da383a..56369d331bfc 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -145,8 +145,8 @@ void kvm_lapic_exit(void);
u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
-#define VEC_POS(v) ((v) & (32 - 1))
-#define REG_POS(v) (((v) >> 5) << 4)
+#define VEC_POS(v) APIC_VECTOR_TO_BIT_NUMBER(v)
+#define REG_POS(v) APIC_VECTOR_TO_REG_OFFSET(v)
static inline void kvm_lapic_clear_vector(int vec, void *bitmap)
{
--
Powered by blists - more mailing lists