linux-kernel - Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API


Date:	Fri, 03 Apr 2009 16:43:53 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Masami Hiramatsu <mhiramat@...hat.com>
CC:	Jim Keniston <jkenisto@...ibm.com>, Ingo Molnar <mingo@...e.hu>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Andi Kleen <andi@...stfloor.org>, kvm@...r.kernel.org,
	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	systemtap-ml <systemtap@...rces.redhat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Vegard Nossum <vegard.nossum@...il.com>,
	Avi Kivity <avi@...hat.com>, Roland McGrath <roland@...hat.com>
Subject: Re: [PATCH -tip 3/6 V4.1] x86: instruction decorder API

Masami Hiramatsu wrote:
> Add x86 instruction decoder to arch-specific libraries. This decoder
> can decode all x86 instructions into prefix, opcode, modrm, sib,
> displacement and immediates. This can also show the length of
> instructions.
> 
> changes from v4:
>  - make bitmap tables static.

Hi Masami,

On the surface the overall structure looks fine, but I have a couple of 
concerns:

1. Is this meant to be able to decode userspace code, or just kernel
code?  If it is supposed to decode userspace code, is there a reason
you're not dealing with 16-bit or V86 mode code at all?  If it is
kernel-only, why are you including the 32-bit decoder in a 64-bit kernel
(as well as instructions which we're pretty much guaranteed never to use
in the kernel, such as ENTER)?

2. you're already not dealing with all existing three-byte opcode 
spaces, nor with DREX or VEX encodings for upcoming processors.  This 
doesn't matter so much for the kernel, but it does matter if this is 
supposed to be used for user-space code.

3. is there any need to deal with instruction set differences among 
processors?  (Again, this depends on the usage model.)

4. you have a bunch of magic opcode constants all over the place.  This 
means that as new instructions come in -- and they're going to be coming 
in -- this is going to be hard to update.  It would be cleaner if we 
could have an intermediate format that preprocesses down to all the 
relevant tables and perhaps even some of the code rather than 
open-coding everything in C.

This matters... for example you have:

+		} else if (opcode == 0xea /* jmp far seg:offs */) {
+			__get_immptr(insn);

... but nothing similar for opcode 0x9a.  This is extremely hard to spot 
with this kind of structure.

The more data-driven we can make it (without bloating the code too much) 
the better off we are, I believe.
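
To illustrate the data-driven idea, here is a minimal sketch in which
per-opcode attributes live in one table instead of being open-coded in
if/else chains.  All names here (insn_attr, onebyte_attr,
opcode_has_attr, immptr_size) are hypothetical and not taken from the
patch; only a few one-byte opcodes are filled in, and mode-dependent
validity is ignored (0x9a and 0xea are invalid in 64-bit mode).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute flags; illustrative only, not from the patch. */
enum insn_attr {
	ATTR_NONE   = 0,
	ATTR_MODRM  = 1 << 0,	/* opcode is followed by a ModRM byte */
	ATTR_IMM8   = 1 << 1,	/* 8-bit immediate follows */
	ATTR_IMMPTR = 1 << 2,	/* far pointer immediate (seg:offs) follows */
};

/*
 * One-byte opcode attribute table.  Entries not listed default to
 * ATTR_NONE.  Both far-pointer opcodes sit next to each other, so a
 * missing one is easy to notice.
 */
static const uint8_t onebyte_attr[256] = {
	[0x89] = ATTR_MODRM,	/* mov r/m, r */
	[0x9a] = ATTR_IMMPTR,	/* call far seg:offs */
	[0xea] = ATTR_IMMPTR,	/* jmp far seg:offs */
	[0xeb] = ATTR_IMM8,	/* jmp short rel8 */
};

/* Return nonzero if the one-byte opcode carries the given attribute. */
static inline int opcode_has_attr(uint8_t opcode, unsigned int attr)
{
	return (onebyte_attr[opcode] & attr) != 0;
}

/*
 * Size in bytes of a far pointer immediate: an offset of the current
 * operand size (2 or 4 bytes) plus a 16-bit segment selector.
 */
static inline size_t immptr_size(unsigned int opnd_bytes)
{
	return opnd_bytes + 2;
}
```

With a table like this, the decoder would test
opcode_has_attr(opcode, ATTR_IMMPTR) once rather than comparing against
0xea directly, and adding 0x9a becomes a one-line table change.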

	-hpa