Message-ID: <m1pr2z8pe9.fsf@fess.ebiederm.org>
Date:	Fri, 19 Mar 2010 23:42:54 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org,
	Kerstin Jonsson <kerstin.jonsson@...csson.com>,
	jbohac@...ell.com, "Yinghai Lu" <yinghai@...nel.org>,
	mingo@...e.hu, "Avi Kivity" <avi@...hat.com>,
	Thomas Renninger <trenn@...e.de>
Subject: Re: [PATCH] x86 apic: Ack all pending irqs when crashed/on kexec - V5

ebiederm@...ssion.com (Eric W. Biederman) writes:

> Andrew, thanks for finding this.  I have a test case for this that
> reproduces about every other time, and I will plug this patch in and
> see if it helps.  I'm not wild about how the max_loops variable is
> reused both as a TSC time budget and as a loop countdown, but the
> basic principle feels solid.
>
> I have been seeing this and for some reason I thought I was dying
> in calibrate_delay_loop().  But this is much later and much easier
> to deal with.  Since we make it to smp_init() there isn't any
> good excuse for us to fail to come up.
>
> I'm curious how much testing you have been able to do on this piece
> of code.

This code definitely makes things better in my test case.
I had the patience to wait for 12 iterations. At my usual failure
rate I was expecting about 6 failures, and I saw none.

I have reservations about the timeout, but the rest of the patch
is definitely doing the right thing, and something is a lot better
than nothing.

Tested-by: "Eric W. Biederman" <ebiederm@...ssion.com>


> Thomas Renninger <trenn@...e.de> writes:
>
>> From: Kerstin Jonsson <kerstin.jonsson@...csson.com>
>>
>> When the SMP kernel decides to crash_kexec() the local APICs may have
>> pending interrupts in their vector tables.
>> The setup routine for the local APIC has a deficient mechanism for
>> clearing these interrupts: it only safely handles interrupts that have
>> already been dispatched to the local core for servicing (the ISR
>> register); it doesn't consider lower-priority queued interrupts stored
>> in the IRR register.
>>
>> If you have more than one pending interrupt within the same 32-bit word
>> in the LAPIC vector table registers, you may find yourself entering the
>> IO APIC setup with pending interrupts left in the LAPIC. This is a
>> situation for which the IO APIC setup is not prepared. Depending on
>> which interrupt vectors are stuck in the APIC tables, your
>> system may show various degrees of malfunction.
>> That was the reason why check_timer() failed in our system: the
>> timer interrupts were blocked by pending interrupts from the old kernel
>> when routed through the IO APIC.
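
To make the "same 32 bit word" condition concrete: the ISR and IRR are each 256 bits wide and are exposed as eight 32-bit registers spaced 0x10 apart, so vector v sits in word v / 32 at bit v % 32. The sketch below is only an illustration. The helper names are made up for this note and are not kernel functions; the register offsets are the ones from the x86 local APIC register layout.

```c
#include <stdint.h>

/* Local APIC register layout: each 256-bit table is split into
 * eight 32-bit words, one every 0x10 bytes. */
#define APIC_ISR     0x100u  /* in-service register, first word */
#define APIC_IRR     0x200u  /* interrupt request register, first word */
#define APIC_ISR_NR  8u      /* number of 32-bit words per table */

/* Hypothetical helpers (not kernel API): locate a vector in the tables. */
static inline unsigned int vector_word(uint8_t vector)
{
	return vector / 32u;            /* which of the 8 words */
}

static inline unsigned int vector_bit(uint8_t vector)
{
	return vector % 32u;            /* bit position inside that word */
}

static inline unsigned int isr_offset(uint8_t vector)
{
	return APIC_ISR + vector_word(vector) * 0x10u;
}

static inline unsigned int irr_offset(uint8_t vector)
{
	return APIC_IRR + vector_word(vector) * 0x10u;
}

/* True when two vectors are reported through the same 32-bit word,
 * which is the situation the patch description talks about. */
static inline int same_word(uint8_t a, uint8_t b)
{
	return vector_word(a) == vector_word(b);
}
```

For example, the timer vector 0x31 mentioned in the patch comment lands in word 1, bit 17 (ISR offset 0x110), and shares that word with every vector from 0x20 to 0x3f.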
>>
>> Additional comment from Jiri Bohac:
>> ==============
>> If this should go into stable release,
>> I'd add some kind of limit on the number of iterations, just to be safe from
>> hard to debug lock-ups:
>>
>> +if (loops++  > MAX_LOOPS) {
>> +        printk("LAPIC pending clean-up");
>> +        break;
>> +}
>>  while (queued);
>>
>> with MAX_LOOPS something like 1E9 this would leave plenty of time for the
>> pending IRQs to be cleared and would still cause at most a second of delay
>> if the loop were to lock up for whatever reason.
>> ==============
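
The iteration cap suggested above can be sketched in plain userspace C as follows. This is a minimal model of the pattern, not the kernel code: step_fn, drain_bounded() and the two sample callbacks are hypothetical names introduced here, and the "work" is just a counter standing in for one pass over the APIC registers.

```c
#include <stdbool.h>

/* Hypothetical callback: performs one clean-up pass and returns true
 * while work is still queued. */
typedef bool (*step_fn)(void *ctx);

/* Run passes until nothing is queued or max_loops passes were made.
 * Returns true if the queue drained, false if the cap was hit.
 * The number of passes actually made is stored in *iterations. */
static bool drain_bounded(step_fn step, void *ctx, long max_loops,
			  long *iterations)
{
	long loops = 0;
	bool queued;

	do {
		queued = step(ctx);
		loops++;
		if (queued && loops >= max_loops)
			break;          /* give up instead of locking up */
	} while (queued);

	*iterations = loops;
	return !queued;
}

/* Sample callback: ctx points to an int holding the remaining items. */
static bool countdown_step(void *ctx)
{
	int *left = ctx;

	if (*left > 0)
		(*left)--;
	return *left > 0;
}

/* Sample callback: models a source that never clears. */
static bool stuck_step(void *ctx)
{
	(void)ctx;
	return true;
}
```

With countdown_step and five queued items the loop finishes after five passes; with stuck_step it stops at the cap and reports failure, which is the case the printk in the suggestion is meant to flag.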
>>
>> From trenn@...e.de:
>> V2: Use the TSC, if available, to bail out after 1 sec, because apic_read
>>     calls may be virtualized and take rather long (suggested by: Avi Kivity <avi@...hat.com>).
>>     If no TSC is available, bail out quickly after cpu_khz iterations; if we broke out too
>>     early and still have irqs pending (which should never happen?) we still
>>     get a WARN_ON...
>>
>> V3: - Fixed indentation -> checkpatch clean
>>     - max_loops must be signed
>>
>> V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call
>>
>> V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case
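
A worked version of the timeout arithmetic from V2 and V3, as I read it: cpu_khz is TSC cycles per millisecond, so a budget of cpu_khz << 10 cycles is 1024 ms, i.e. roughly the "1 sec" mentioned above. The remaining budget has to be signed so that it can go negative once the elapsed cycles exceed it. The function names below are hypothetical, and the cast to 64 bits before the shift is a choice made for this sketch rather than something taken from the patch.

```c
#include <stdint.h>

/* Remaining TSC budget in cycles; negative once the budget is used up.
 * cpu_khz is cycles per millisecond, so (cpu_khz << 10) is 1024 ms. */
static inline long long tsc_budget_left(unsigned int cpu_khz,
					uint64_t start, uint64_t now)
{
	long long budget = (long long)cpu_khz << 10;

	return budget - (long long)(now - start);
}

/* Same computation with an unsigned result, for comparison: once the
 * budget is exceeded the value wraps around to a huge positive number,
 * so a "> 0" test would keep the loop running. */
static inline unsigned long long tsc_budget_left_unsigned(unsigned int cpu_khz,
							  uint64_t start,
							  uint64_t now)
{
	return ((unsigned long long)cpu_khz << 10) - (now - start);
}

/* Budget expressed in milliseconds; cpu_khz must be non-zero. */
static inline long long tsc_budget_ms(unsigned int cpu_khz)
{
	return ((long long)cpu_khz << 10) / cpu_khz;
}
```

On a 2 GHz machine (cpu_khz = 2000000) the budget is 2048000000 cycles. After 3000000000 elapsed cycles the signed result is -952000000 and the loop condition fails, while the unsigned variant still compares greater than zero.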
>>
>> CC: jbohac@...ell.com
>> CC: "Yinghai Lu" <yinghai@...nel.org>
>> CC: akpm@...ux-foundation.org
>> CC: mingo@...e.hu
>> CC: "Kerstin Jonsson" <kerstin.jonsson@...csson.com>
>> CC: "Avi Kivity" <avi@...hat.com>
>> Signed-off-by: Thomas Renninger <trenn@...e.de>
>> ---
>>  arch/x86/kernel/apic/apic.c |   41 +++++++++++++++++++++++++++++++++--------
>>  1 files changed, 33 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index 00187f1..cfcc87f 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -51,6 +51,7 @@
>>  #include <asm/smp.h>
>>  #include <asm/mce.h>
>>  #include <asm/kvm_para.h>
>> +#include <asm/tsc.h>
>>  
>>  unsigned int num_processors;
>>  
>> @@ -1151,8 +1152,13 @@ static void __cpuinit lapic_setup_esr(void)
>>   */
>>  void __cpuinit setup_local_APIC(void)
>>  {
>> -	unsigned int value;
>> -	int i, j;
>> +	unsigned int value, queued;
>> +	int i, j, acked = 0;
>> +	unsigned long long tsc = 0, ntsc;
>> +	long long max_loops = cpu_khz;
>> +
>> +	if (cpu_has_tsc)
>> +		rdtscll(tsc);
>>  
>>  	if (disable_apic) {
>>  		arch_disable_smp_support();
>> @@ -1204,13 +1210,32 @@ void __cpuinit setup_local_APIC(void)
>>  	 * the interrupt. Hence a vector might get locked. It was noticed
>>  	 * for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
>>  	 */
>> -	for (i = APIC_ISR_NR - 1; i >= 0; i--) {
>> -		value = apic_read(APIC_ISR + i*0x10);
>> -		for (j = 31; j >= 0; j--) {
>> -			if (value & (1<<j))
>> -				ack_APIC_irq();
>> +	do {
>> +		queued = 0;
>> +		for (i = APIC_ISR_NR - 1; i >= 0; i--)
>> +			queued |= apic_read(APIC_IRR + i*0x10);
>> +
>> +		for (i = APIC_ISR_NR - 1; i >= 0; i--) {
>> +			value = apic_read(APIC_ISR + i*0x10);
>> +			for (j = 31; j >= 0; j--) {
>> +				if (value & (1<<j)) {
>> +					ack_APIC_irq();
>> +					acked++;
>> +				}
>> +			}
>>  		}
>> -	}
>> +		if (acked > 256) {
>> +			printk(KERN_ERR "LAPIC pending interrupts after %d EOI\n",
>> +			       acked);
>> +			break;
>> +		}
>> +		if (cpu_has_tsc) {
>> +			rdtscll(ntsc);
>> +			max_loops = (cpu_khz << 10) - (ntsc - tsc);
>> +		} else
>> +			max_loops--;
>> +	} while (queued && max_loops > 0);
>> +	WARN_ON(max_loops <= 0);
>>  
>>  	/*
>>  	 * Now that we are all set up, enable the APIC
>> -- 
>> 1.6.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/