Message-ID: <20080814011340.GA30038@Krystal>
Date:	Wed, 13 Aug 2008 21:13:40 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	"H. Peter Anvin" <hpa@...or.com>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Steven Rostedt <srostedt@...hat.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Miller <davem@...emloft.net>,
	Roland McGrath <roland@...hat.com>,
	Ulrich Drepper <drepper@...hat.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Gregory Haskins <ghaskins@...ell.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
	Clark Williams <williams@...hat.com>,
	Christoph Lameter <cl@...ux-foundation.org>
Subject: Re: [RFC PATCH] x86 alternatives : fix LOCK_PREFIX race with
	preemptible kernel and CPU hotplug

* H. Peter Anvin (hpa@...or.com) wrote:
> Mathieu Desnoyers wrote:
>> If a kernel thread is preempted in single-cpu mode right after the NOP (nop
>> about to be turned into a lock prefix), then we CPU hotplug a CPU, and then the
>> thread is scheduled back again, a SMP-unsafe atomic operation will be used on
>> shared SMP variables, leading to corruption. No corruption would happen in the
>> reverse case : going from SMP to UP is ok because we split a bit instruction
>> into tiny pieces, which does not present this condition.
>> Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
>> override prefix should fix this issue. Since the default of the atomic
>> instructions is to use the DS segment anyway, it should not affect the
>> behavior.
>
> I believe this should be okay.  In 32-bit mode some of the security and 
> hypervisor frameworks want to set segment limits, but I don't believe they 
> ever would set DS and SS inconsistently, or that we'd handle a #GP versus 
> an #SS differently (segment violations on the stack segment are #SS, not 
> #GP.)  To be 100% sure we'd have to pick apart the modr/m byte to figure 
> out what the base register is but I think that's total overkill.
>

I guess some testing of this patch under a virtualized Linux would not
hurt. Does anyone have a setup ready? The test case is simple: run a kernel
on a multi-CPU virtual guest.

# run from /sys; CPUs are numbered 0..NR_CPUS-1, so stop at NR_CPUS-1
export NR_CPUS=...
for a in $(seq 1 $((NR_CPUS - 1))); do echo 0 > ./devices/system/cpu/cpu$a/online; done
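
To make the race this test is meant to exercise concrete, here is a toy model of one patch site. It is only a sketch, not kernel code: the decoder knows a single instruction (`xadd %eax,(%ecx)`, encoded as 0F C1 01), and the names `instruction_starts` and `unlocked_resume_offsets` are made up for illustration. It lists the offsets at which a preempted thread could have saved its return address in the UP variant, then checks which of those would resume past the LOCK prefix once the site is patched for SMP.

```python
# Toy model of a single LOCK_PREFIX patch site; illustrative only, not kernel code.
NOP, DS, LOCK = 0x90, 0x3E, 0xF0
XADD = bytes([0x0F, 0xC1, 0x01])  # xadd %eax,(%ecx)
PREFIXES = (DS, LOCK)


def instruction_starts(code):
    """Offsets where an instruction begins, for this tiny instruction set."""
    starts, i = [], 0
    while i < len(code):
        starts.append(i)
        if code[i] == NOP:
            i += 1  # a NOP is a complete one-byte instruction
            continue
        while i < len(code) and code[i] in PREFIXES:
            i += 1  # a prefix belongs to the instruction that follows it
        assert code[i:i + len(XADD)] == XADD, "toy decoder only knows xadd"
        i += len(XADD)
    return starts


def unlocked_resume_offsets(up_byte):
    """Saved return addresses in the UP site that skip LOCK after SMP patching."""
    up_site = bytes([up_byte]) + XADD
    smp_site = bytes([LOCK]) + XADD
    return [ip for ip in instruction_starts(up_site) if smp_site[ip] != LOCK]


print(unlocked_resume_offsets(NOP))  # [1]: a thread can resume after the patched byte
print(unlocked_resume_offsets(DS))   # []: the prefix and the xadd are one instruction
```

With the 0x90 byte, offset 1 is a valid resume point, so a thread preempted there runs the xadd without LOCK after the site has been patched. With 0x3E there is no instruction boundary between the patched byte and the xadd.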

> I have a vague notion that DS: prefixes came with a penalty on older CPUs, 
> so we may want to do this only when CPU hotplug is enabled, to avoid 
> penalizing older embedded systems.
>
> 	-hpa

Reading the "Intel Architecture Optimizations Manual" for older Intels :

http://developer.intel.com/design/pentium/manuals/242816.htm
Chapter 3.7 Prefixed Opcodes

The penalty for instructions carrying prefixes other than 0x0f,
0x66 or 0x67 seems to be 1 added clock cycle to detect the prefix when
it cannot be paired.

Since we are choosing between the existing 0x90 nop followed by the
atomic instruction and this prefix applied to the atomic instruction,
we have to consider the cost of this nop as well. From the same manual,
the NOP is decoded into 1 micro-op.

Unless these architectures (386SX/DX, 486, Pentium Pro, Pentium MMX,
Pentium II) can execute more than 1 micro-op per cycle, I doubt the DS
prefix would cause any degradation compared to the 0x90 nop. And this
would free the lower stages of the pipeline from executing this NOP
micro-op.

I guess some quick performance tests with the modules I provide on my
website (URL in the patch header) could confirm or refute this.
Actually, I just dusted off an old Pentium II; here are the results.
The DS prefix shows no overhead or degradation compared to the 1-byte nop.

NR_TESTS                                    10000000
test empty cycles :                        200833494
test test 1-byte nop xadd cycles :         340000130
test test DS override prefix xadd cycles : 340000126 *
test test LOCK xadd cycles :               530000078

processor : 0
vendor_id : GenuineIntel
cpu family  : 6
model   : 5
model name  : Pentium II (Deschutes)
stepping  : 2
cpu MHz   : 350.867
cache size  : 512 KB
fdiv_bug  : no
hlt_bug   : no
f00f_bug  : no
coma_bug  : no
fpu   : yes
fpu_exception : yes
cpuid level : 2
wp    : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips  : 690.17
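
As a rough way to read the cycle counts above, the sketch below subtracts the empty-loop total from each variant and divides by NR_TESTS. Treating the "empty" row as pure loop overhead is an assumption made here, not something the test module is known to guarantee, and the dictionary keys are invented labels for the four rows.

```python
# Net cost per iteration, assuming the "empty" row is loop overhead only.
NR_TESTS = 10_000_000
CYCLES = {
    "empty": 200833494,
    "nop_xadd": 340000130,   # 1-byte nop + xadd
    "ds_xadd": 340000126,    # DS override prefix + xadd
    "lock_xadd": 530000078,  # LOCK xadd
}


def net_cycles_per_iteration(name):
    """Cycles per iteration after subtracting the empty-loop total."""
    return (CYCLES[name] - CYCLES["empty"]) / NR_TESTS


for name in ("nop_xadd", "ds_xadd", "lock_xadd"):
    print(f"{name:10s} {net_cycles_per_iteration(name):.2f}")
# nop_xadd   13.92
# ds_xadd    13.92
# lock_xadd  32.92
```

Under that assumption the nop and DS-prefix variants differ by 4 cycles in total over 10 million iterations, while the LOCK variant costs about 19 cycles more per iteration on this machine.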

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68