linux-kernel - Re: [RFC v2 05/17] x86/apic: Add update

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87jz8i31dv.ffs@tglx>
Date: Fri, 21 Mar 2025 15:27:08 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Neeraj Upadhyay <Neeraj.Upadhyay@....com>, linux-kernel@...r.kernel.org
Cc: bp@...en8.de, mingo@...hat.com, dave.hansen@...ux.intel.com,
 Thomas.Lendacky@....com, nikunj@....com, Santosh.Shukla@....com,
 Vasant.Hegde@....com, Suravee.Suthikulpanit@....com, David.Kaplan@....com,
 x86@...nel.org, hpa@...or.com, peterz@...radead.org, seanjc@...gle.com,
 pbonzini@...hat.com, kvm@...r.kernel.org, kirill.shutemov@...ux.intel.com,
 huibo.wang@....com, naveen.rao@....com
Subject: Re: [RFC v2 05/17] x86/apic: Add update_vector callback for Secure
 AVIC

On Wed, Feb 26 2025 at 14:35, Neeraj Upadhyay wrote:
> Add update_vector callback to set/clear ALLOWED_IRR field in
> the APIC backing page. The ALLOWED_IRR field indicates the
> interrupt vectors which the guest allows the hypervisor to
> send (typically for emulated devices). Interrupt vectors used
> exclusively by the guest itself (like IPI vectors) should not
> be allowed to be injected into the guest for security reasons.
> The update_vector callback is invoked from APIC vector domain
> whenever a vector is allocated, freed or moved.

Your changelog tells a lot about the WHAT. Please read and follow the
documentation, which describes how a change log should be structured.

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog

> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index 72fa4bb78f0a..e0c9505e05f8 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -174,6 +174,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
>  		apicd->prev_cpu = apicd->cpu;
>  		WARN_ON_ONCE(apicd->cpu == newcpu);
>  	} else {
> +		if (apic->update_vector)
> +			apic->update_vector(apicd->cpu, apicd->vector, false);
>  		irq_matrix_free(vector_matrix, apicd->cpu, apicd->vector,
>  				managed);
>  	}
> @@ -183,6 +185,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
>  	apicd->cpu = newcpu;
>  	BUG_ON(!IS_ERR_OR_NULL(per_cpu(vector_irq, newcpu)[newvec]));
>  	per_cpu(vector_irq, newcpu)[newvec] = desc;
> +	if (apic->update_vector)
> +		apic->update_vector(apicd->cpu, apicd->vector, true);

A trivial

static inline void apic_update_vector(....)
{
        if (apic->update_vector)
           ....
}

would be too easy to read and add not enough line count, right?

>  static void vector_assign_managed_shutdown(struct irq_data *irqd)
> @@ -528,11 +532,15 @@ static bool vector_configure_legacy(unsigned int virq, struct irq_data *irqd,
>  	if (irqd_is_activated(irqd)) {
>  		trace_vector_setup(virq, true, 0);
>  		apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
> +		if (apic->update_vector)
> +			apic->update_vector(apicd->cpu, apicd->vector, true);
>  	} else {
>  		/* Release the vector */
>  		apicd->can_reserve = true;
>  		irqd_set_can_reserve(irqd);
>  		clear_irq_vector(irqd);
> +		if (apic->update_vector)
> +			apic->update_vector(apicd->cpu, apicd->vector, false);
>  		realloc = true;

This is as incomplete as it gets. None of the other code paths which
invoke clear_irq_vector() nor those which invoke free_moved_vector() are
mopping up the leftovers in the backing page.

And no, you don't sprinkle this nonsense all over the call sites. There
is only a very limited number of functions which are involed in setting
up and tearing down a vector. Doing this at the call sites is a
guarantee for missing out as you demonstrated.

> +#define VEC_POS(v)	((v) & (32 - 1))
> +#define REG_POS(v)	(((v) >> 5) << 4)

This is unreadable, undocumented and incomprehensible garbage.

>  static DEFINE_PER_CPU(void *, apic_backing_page);
>  
>  struct apic_id_node {
> @@ -192,6 +195,22 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
>  	__send_IPI_mask(mask, vector, APIC_DEST_ALLBUT);
>  }
>  
> +static void x2apic_savic_update_vector(unsigned int cpu, unsigned int vector, bool set)
> +{
> +	void *backing_page;
> +	unsigned long *reg;
> +	int reg_off;
> +
> +	backing_page = per_cpu(apic_backing_page, cpu);
> +	reg_off = SAVIC_ALLOWED_IRR_OFFSET + REG_POS(vector);
> +	reg = (unsigned long *)((char *)backing_page + reg_off);
> +
> +	if (set)
> +		test_and_set_bit(VEC_POS(vector), reg);
> +	else
> +		test_and_clear_bit(VEC_POS(vector), reg);
> +}

What's the test_and_ for if you ignore the return value anyway?

Als I have no idea what SAVIC_ALLOWED_IRR_OFFSET means. Whether it's
something from the datashit or a made up thing does not matter. It's
patently non-informative.

Again:

struct apic_page {
	union {
		u32	regs[NR_APIC_REGS];
		u8	bytes[PAGE_SIZE];
	};
};                

       struct apic_page *ap = this_cpu_ptr(apic_page);
       unsigned long *sirr;

       /*
        * apic_page.regs[SAVIC_ALLOWED_IRR_OFFSET...] is an array of
        * consecutive 32-bit registers, which represents a vector bitmap.
        */
        sirr = (unsigned long *) &ap->regs[SAVIC_ALLOWED_IRR_OFFSET];
        if (set)
        	set_bit(sirr, vector);
        else
        	clear_bit(sirr, vector);

See how code suddenly becomes self explaining, obvious and
comprehensible?

Thanks,

        tglx