lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20071222020700.GA7966@elte.hu>
Date:	Sat, 22 Dec 2007 03:07:00 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Glauber de Oliveira Costa <gcosta@...hat.com>
Cc:	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	glommer@...il.com, tglx@...utronix.de, ehabkost@...hat.com,
	jeremy@...p.org, avi@...ranet.com, anthony@...emonkey.ws,
	virtualization@...ts.linux-foundation.org, rusty@...tcorp.com.au,
	ak@...e.de, chrisw@...s-sol.org, rostedt@...dmis.org,
	hpa@...or.com, zach@...are.com, roland@...hat.com
Subject: Re: [PATCH 18/19] move _set_gate and its users to a common location


* Glauber de Oliveira Costa <gcosta@...hat.com> wrote:

> This patch moves _set_gate and its users to desc.h. We can now use 
> common code for x86_64 and i386.

a few days ago i started seeing weird crashes on 64-bit x86 in the 
random-kernel-bootup tests. Nothing truly reproducable to be bisectable, 
but quality of x86.git went down drastically. Double faults, triple 
faults, crashes all around the place, with every few dozen kernel 
bootups.

Today i found a config and a workload that triggered the crashes a bit 
more reliably. I still needed a 6 hours bisection marathon to pinpoint 
the patch - the one in this thread. A review of it with H. Peter Anvin 
pinpointed the breakage:

> +static inline void set_system_gate(unsigned int n, void *addr)
> +{
> +	BUG_ON((unsigned)n > 0xFF);
> +	_set_gate(n, GATE_TRAP, addr, 0x3, 0, __KERNEL_CS);
> +}

> -static inline void set_system_gate(int nr, void *func)
> -{
> -	BUG_ON((unsigned)nr > 0xFF);
> -	_set_gate(nr, GATE_INTERRUPT, (unsigned long) func, 3, 0);
> -}

you changed the type of system gates on 64-bit from GATE_INTERRUPT to 
GATE_TRAP. The effect of this is that these gates enter with interrupts 
enabled - instead of interrupts disabled. This, amongst others, affects 
the following key gate:

        set_system_gate(IA32_SYSCALL_VECTOR, ia32_syscall);

which relies on disabled interrupts to fix up its stack. If an interrupt 
comes in the wrong moment then we get a kernel stack corruption that is 
not survivable.

the reason for this bug was that you tried to do too many changes in a 
single patch. You did a cleanup, you did unification and you moved code 
around. It was totally non-obvious what you did and the resulting patch 
was not reviewable at all - even after the bisection poinpointed the 
patch, it took us almost 30 minutes to figure out where the bug was. If 
this unstructured, careless mess of patches continues then we are not 
going to be able to accept any more 64-bit paravirt patches into 
x86.git.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ