Date:	Tue, 23 Feb 2010 14:03:29 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Thomas Renninger <trenn@...e.de>
CC:	linux-kernel@...r.kernel.org,
	Kerstin Jonsson <kerstin.jonsson@...csson.com>,
	jbohac@...ell.com, Yinghai Lu <yinghai@...nel.org>,
	akpm@...ux-foundation.org, mingo@...e.hu
Subject: Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec

On 02/23/2010 01:51 PM, Thomas Renninger wrote:
> From: Kerstin Jonsson<kerstin.jonsson@...csson.com>
>
> When the SMP kernel decides to crash_kexec() the local APICs may have
> pending interrupts in their vector tables.
> The setup routine for the local APIC has a deficient mechanism for
> clearing these interrupts: it only handles interrupts that have already
> been dispatched to the local core for servicing (the ISR register)
> safely; it doesn't consider lower-priority queued interrupts stored
> in the IRR register.
>
> If you have more than one pending interrupt within the same 32 bit word
> in the LAPIC vector table registers you may find yourself entering the
> IO APIC setup with pending interrupts left in the LAPIC. This is a
> situation for which the IO APIC setup is not prepared. Depending on
> which interrupt vectors are stuck in the APIC tables, your
> system may show various degrees of malfunction.
> That was the reason why check_timer() failed in our system: the
> timer interrupts were blocked by pending interrupts from the old kernel
> when routed through the IO APIC.
>
> Additional comment from Jiri Bohac:
> ==============
> If this should go into stable release,
> I'd add some kind of limit on the number of iterations, just to be safe from
> hard to debug lock-ups:
>
> +if (loops++>  MAX_LOOPS) {
> +        printk("LAPIC pending clean-up");
> +        break;
> +}
>   while (queued);
>
> with MAX_LOOPS something like 1E9 this would leave plenty of time for the
> pending IRQs to be cleared and would still cause at most a second of delay
> if the loop were to lock up for whatever reason.
> ==============
>
>  From trenn@...e.de:
> Merged Jiri's suggestion into the patch.
> Also made the max_loops depend on cpu_khz. Not sure how long an apic_read
> takes, as it is on the CPU it may only be one cycle and we now wait 1 sec
> in WARN_ON(..) case?
>
>    

An apic_read() can take a couple of microseconds when running 
virtualized, so this loop may run for hours.  On the other hand, 
virtualized hardware is unlikely to misbehave.

Still, I recommend bounding the wait with a clocksource (the TSC would do)
rather than a loop count.
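
To illustrate the difference, here is a minimal userspace sketch of a
time-bounded polling loop. It is not the patch itself: now_ns(),
drain_with_deadline() and the two example callbacks are hypothetical names,
and clock_gettime(CLOCK_MONOTONIC) merely stands in for whatever clocksource
the kernel code would read. The point is that the bound is elapsed time, so
it holds no matter how long each poll takes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds (assumption: stand-in for a kernel clocksource). */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical poll callback: returns true while something is still queued. */
typedef bool (*pending_fn)(void *ctx);

/*
 * Poll until nothing is pending or timeout_ns has elapsed.
 * Returns true if the queue drained, false if the deadline passed first.
 * The cost of a single poll does not affect the upper bound on the wait.
 */
bool drain_with_deadline(pending_fn pending, void *ctx, uint64_t timeout_ns)
{
	uint64_t start;

	assert(pending != NULL);
	start = now_ns();

	while (pending(ctx)) {
		if (now_ns() - start > timeout_ns)
			return false;
	}
	return true;
}

/* Example callback: reports pending work until the counter reaches zero. */
bool pending_countdown(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return true;
	}
	return false;
}

/* Example callback: never drains, as in the lock-up case. */
bool pending_forever(void *ctx)
{
	(void)ctx;
	return true;
}
```

With pending_countdown the call returns true as soon as the counter hits
zero; with pending_forever it returns false shortly after the timeout, whether
a single poll costs one cycle or a couple of microseconds.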

-- 
error compiling committee.c: too many arguments to function
