linux-kernel - Re: [PATCH] Allocate DMAR fault interrupts locally

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240301115000.4ba2e411@jacob-builder>
Date: Fri, 1 Mar 2024 11:50:00 -0800
From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Dimitri Sivanich <sivanich@....com>, David Woodhouse
 <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel
 <joro@...tes.org>, Will Deacon <will@...nel.org>, Robin Murphy
 <robin.murphy@....com>, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org, Steve Wahl <steve.wahl@....com>, Russ
 Anderson <russ.anderson@....com>, jacob.jun.pan@...ux.intel.com
Subject: Re: [PATCH] Allocate DMAR fault interrupts locally

Hi Thomas,

On Thu, 29 Feb 2024 23:18:37 +0100, Thomas Gleixner <tglx@...utronix.de>
wrote:

> Dimitri!
> 
> On Thu, Feb 29 2024 at 14:07, Dimitri Sivanich wrote:
> 
> The subject lacks a subsystem prefix. You're doing this for how many
> decades now?
> 
> > The Intel IOMMU code currently tries to allocate all DMAR fault
> > interrupt 
> > +#ifdef CONFIG_X86_LOCAL_APIC  
> 
> I seriously doubt that this code can ever be compiled w/o X86_LOCAL_APIC:
> 
> obj-$(CONFIG_DMAR_TABLE) += dmar.o
> 
> config DMAR_TABLE
>         bool
> 
> config INTEL_IOMMU
>         depends on PCI_MSI && ACPI && X86
>         select DMAR_TABLE
> 
> config IRQ_REMAP
>         depends on X86_64 && X86_IO_APIC && PCI_MSI && ACPI
>         select DMAR_TABLE
> 
> config X86_LOCAL_APIC
>         def_bool y
>         depends on X86_64 || SMP || X86_32_NON_STANDARD || X86_UP_APIC ||
> PCI_MSI
> 
> What are you trying to achieve here other than #ifdef voodoo?
> 
> > +static void __init irq_remap_enable_fault_handling_thr(struct
> > work_struct *work) +{
> > +	irq_remap_enable_fault_handling();  
> 
> because if INTEL_IOMMU=y and IRQ_REMAP=n then X86_LOCAL_APIC=y and this
> muck gets invoked for nothing. 'git grep irq_remap_enable_fault_handling
> include/' might give you a hint.
> 
> > +}
> > +
> > +static int __init assign_dmar_vectors(void)
> > +{
> > +	struct work_struct irq_remap_work;
> > +	int nid;
> > +
> > +	INIT_WORK(&irq_remap_work,
> > irq_remap_enable_fault_handling_thr);
> > +	cpus_read_lock();
> > +	for_each_online_node(nid) {
> > +		/* Boot cpu dmar vectors are assigned before the rest
> > */
> > +		if (nid == cpu_to_node(get_boot_cpu_id()))
> > +			continue;
> > +		schedule_work_on(cpumask_first(cpumask_of_node(nid)),
> > +				 &irq_remap_work);
> > +		flush_work(&irq_remap_work);
> > +	}
> > +	cpus_read_unlock();
> > +	return 0;
> > +}
> > +
> > +arch_initcall(assign_dmar_vectors);  
> 
> Stray newline before arch_initcall(), but that's not the problem.
> 
> The real problems are:
> 
>  1) This approach only works when _ALL_ APs have been brought up during
>     boot. With 'maxcpus=N' on the command line this will fail to enable
>     fault handling when the APs which have not been brought up initially
>     are onlined later on.
> 
>     This might be working in practice because intel_iommu_init() will
>     enable the interrupts later on via init_dmars() unconditionally, but
>     that's far from correct because IRQ_REMAP does not depend on
>     INTEL_IOMMU.
The dmar fault interrupt is VT-d's own interrupt, not subject to IRQ_REMAP.
So this set up has nothing to do with IR, right?

Maybe we should not call it irq_remap_work, call it dmar_fault_irq_work
instead?

>  2) It leaves a gap where the reporting is not working between bringing
>     up the APs during boot and this initcall. Mostly theoretical, but
>     that does not make it more correct either.
> 
> What you really want is a cpu hotplug state in the CPUHP_BP_PREPARE_DYN
> space which enables the interrupt for the node _before_ the first AP of
> the node is brought up. That will solve the problem nicely w/o any of
> the above issues.
> 
> Thanks,
> 
>         tglx
> 


Thanks,

Jacob