linux-kernel - Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec


Message-ID: <4B94CEFC.40405@redhat.com>
Date:	Mon, 08 Mar 2010 12:18:36 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Kerstin Jonsson <kerstin.jonsson@...csson.com>
CC:	Thomas Renninger <trenn@...e.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"jbohac@...ell.com" <jbohac@...ell.com>,
	Yinghai Lu <yinghai@...nel.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec

On 02/26/2010 09:47 PM, Kerstin Jonsson wrote:
>>
>>      
>>> From: Kerstin Jonsson <kerstin.jonsson@...csson.com>
>>>
>>> When the SMP kernel decides to crash_kexec() the local APICs may have
>>> pending interrupts in their vector tables.
>>> The setup routine for the local APIC has a deficient mechanism for
>>> clearing these interrupts: it only safely handles interrupts that have
>>> already been dispatched to the local core for servicing (the ISR
>>> register); it doesn't consider lower-priority queued interrupts stored
>>> in the IRR register.
>>>
>>> If you have more than one pending interrupt within the same 32-bit word
>>> in the LAPIC vector table registers, you may find yourself entering the
>>> IO APIC setup with pending interrupts left in the LAPIC. This is a
>>> situation for which the IO APIC setup is not prepared. Depending on
>>> which interrupt vectors are stuck in the APIC tables, your
>>> system may show various degrees of malfunction.
>>> That was the reason why check_timer() failed in our system: the
>>> timer interrupts were blocked by pending interrupts from the old kernel
>>> when routed through the IO APIC.
>>>
>>> Additional comment from Jiri Bohac:
>>> ==============
>>> If this should go into stable release,
>>> I'd add some kind of limit on the number of iterations, just to be safe from
>>> hard to debug lock-ups:
>>>
>>> +if (loops++ > MAX_LOOPS) {
>>> +        printk("LAPIC pending clean-up");
>>> +        break;
>>> +}
>>>    while (queued);
>>>
>>> with MAX_LOOPS something like 1E9 this would leave plenty of time for the
>>> pending IRQs to be cleared and would still cause at most a second of delay
>>> if the loop were to lock up for whatever reason.
>>> ==============
>>>
>>> From trenn@...e.de:
>>> Merged Jiri's suggestion into the patch.
>>> Also made max_loops depend on cpu_khz. Not sure how long an apic_read
>>> takes; as it is on the CPU it may only be one cycle, and we now wait 1 sec
>>> in the WARN_ON(..) case?
>>>
>>>
>>>
>>>        
>> An apic_read() can take a couple of microseconds when running
>> virtualized, so this loop may run for hours.  On the other hand,
>> virtualized hardware is unlikely to misbehave.
>>
>> Still I recommend using a clocksource (tsc would do) and not a loop count.
>>
>> --
>> error compiling committee.c: too many arguments to function
>>
>>
>>
>>      
> Is it possible/conceivable to distinguish between real and virtual targets?
> I.e. to somehow detect that the target is a virtual machine and adapt accordingly.
> There may be other cases as well in which one would benefit from taking the
> target type into consideration, e.g. when estimating a reasonable number of cycles
> for a specific operation.

It's possible (cpuid hypervisor bit), but I don't think it's a good 
idea.  Splitting up code paths doubles the chance of bugs.  Much better 
to find something that works both ways.
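
One thing that works both ways is to bound the wait by elapsed time rather
than by iteration count, so the limit does not depend on how long a single
register read takes. The following is only a userspace sketch of that idea,
not the patch itself: clock_gettime(CLOCK_MONOTONIC) stands in for whatever
time source the kernel would use, and poll_with_deadline(), pending_fn,
countdown_pending() and always_pending() are hypothetical names made up for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: returns true while something is still pending. */
typedef bool (*pending_fn)(void *ctx);

/* Monotonic time in nanoseconds (userspace stand-in for a kernel clocksource). */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll until pending() reports false or timeout_ns has elapsed.
 * Returns true if the condition cleared, false if the deadline passed.
 * The bound is wall-clock time, so a slow pending() call (for example a
 * trapped register read in a guest) cannot stretch the wait.
 */
bool poll_with_deadline(pending_fn pending, void *ctx, uint64_t timeout_ns)
{
	const uint64_t start = now_ns();

	while (pending(ctx)) {
		if (now_ns() - start >= timeout_ns)
			return false;
	}
	return true;
}

/* Example condition: stays pending for a fixed number of polls. */
bool countdown_pending(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return true;
	}
	return false;
}

/* Example condition: never clears, like a stuck register. */
bool always_pending(void *ctx)
{
	(void)ctx;
	return true;
}
```

With a condition that clears after a few polls, poll_with_deadline() returns
true well before the deadline; with always_pending() it gives up after
roughly timeout_ns, whether each poll costs one cycle or several
microseconds.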

-- 
error compiling committee.c: too many arguments to function
