lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 12 Aug 2007 14:55:01 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Arjan van de Ven <arjan@...radead.org>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 0/8] Immediate Values

* Arjan van de Ven (arjan@...radead.org) wrote:
> 
> On Sun, 2007-08-12 at 11:07 -0400, Mathieu Desnoyers wrote:
> > Hi Andrew,
> > 
> > Here is the latest version of immediate values. It applies on 2.6.23-rc2-mm2 in
> > this order:
> 
> 
> I have a concern; you seem to be patching potentially "live" code....
> 
> there are basically two options
> 1) you run the risk of triple faulting (patching an instruction while
> some other core/cpu may be decoding it may cause a triple fault)
> 2) you do an IPI to all other cpus and prevent them from executing any
> code except a small loop during the patching... this is expensive.
> 
> To be honest, neither sound very attractive to me ;(
> 

Yup, the concern is appropriate. That's why I dealt with it in the
"Immediate Values - i386 Optimization" patch. (I guess your concern
is specific to i386, x86_64 and ia64).

I have currently only implemented the i386 optimization, but x86_64 and
ia64 should be similar.

The triple fault in question is discussed in Intel's errata under the
title "Centrino Duo Processor Technology Specification Update, AH33.
Unsynchronized Cross-Modifying Code Operations Can Cause Unexpected
Instruction Execution Results." (if you refer to something else, please
tell me).

I discuss thoroughly the algorithm I use in the patch comments, which is
none of the two options you point out.

Quoting the patch:

"Overall design

The algorithm proposed by Intel applies not so well in kernel context: it
would imply disabling interrupts and looping on every CPUs while modifying
the code and would not support instrumentation of code called from interrupt
sources that cannot be disabled.

Therefore, we use a different algorithm to respect Intel's erratum (see the
quoted discussion above). We make sure that no CPU sees an out-of-date copy
of a pre-fetched instruction by 1 - using a breakpoint, which skips the
instruction that is going to be modified, 2 - issuing an IPI to every CPU to
execute a sync_core(), to make sure that even when the breakpoint is removed,
no cpu could possibly still have the out-of-date copy of the instruction,
modify the now unused 2nd byte of the instruction, and then put back the
original 1st byte of the instruction.

It has exactly the same intent as the algorithm proposed by Intel, but
it has less side-effects, scales better and supports NMI, SMI and MCE."

Moreover, just to be cautious, I arrange the alignment of instructions
to modify so their updates will be atomic. Therefore, a 5 bytes movl
will always be aligned on 4 bytes boundaries - 1, so the immediate value
that follows the 1 byte opcode will be itself aligned on 4 bytes
boundaries.

Mathieu


-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ