Message-ID: <1e6bfa0c-6733-4de2-80ae-5bc08ccbf58b@intel.com>
Date: Mon, 1 Dec 2025 20:15:19 +0200
From: Adrian Hunter <adrian.hunter@...el.com>
To: Borislav Petkov <bp@...en8.de>, Masami Hiramatsu <mhiramat@...nel.org>
CC: <linux-kernel@...r.kernel.org>, Arnaldo Carvalho de Melo
<acme@...nel.org>, Jiri Olsa <jolsa@...hat.com>, Dan Williams
<dan.j.williams@...el.com>, Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin"
<hpa@...or.com>, Thomas Gleixner <tglx@...utronix.de>, Andy Lutomirski
<luto@...capital.net>, X86 ML <x86@...nel.org>
Subject: Re: [PATCH V2 2/4] x86/insn: Add AVX-512 support to the instruction
decoder

On 01/12/2025 13:25, Borislav Petkov wrote:
> On Sun, Nov 30, 2025 at 05:05:28PM +0100, Borislav Petkov wrote:
>> Resurrecting a very old thread...
>>
>> On Wed, Jul 20, 2016 at 11:30:35AM +0300, Adrian Hunter wrote:
>>> Add support for Intel's AVX-512 instructions to the instruction decoder.
>>>
>>> AVX-512 instructions are documented in Intel Architecture Instruction Set
>>> Extensions Programming Reference (February 2016).
>>>
>>> AVX-512 instructions are identified by an EVEX prefix which, for the purpose
>>> of instruction decoding, can be treated as though it were a 4-byte VEX
>>> prefix.
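
As an aside, here is a minimal sketch of that point, with made-up helper
names (vex_payload_bytes()/skip_vex_prefix() are not kernel API) and assuming
64-bit mode so that 0x62/0xC4/0xC5 cannot also be BOUND/LDS/LES: the EVEX
escape byte 0x62 is simply followed by three payload bytes, one more than a
3-byte VEX, so a decoder can step over it like a 4-byte VEX prefix.

/* Hypothetical helpers, not the kernel's insn_get_prefixes(); 64-bit mode assumed. */
static int vex_payload_bytes(unsigned char escape)
{
	switch (escape) {
	case 0xc5:		/* 2-byte VEX: escape + 1 payload byte */
		return 1;
	case 0xc4:		/* 3-byte VEX: escape + 2 payload bytes */
		return 2;
	case 0x62:		/* EVEX: escape + 3 payload bytes, i.e. a "4-byte VEX" */
		return 3;
	default:
		return 0;	/* not a VEX/EVEX escape byte */
	}
}

/* Offset of the opcode byte past an optional (E)VEX prefix. */
static unsigned int skip_vex_prefix(const unsigned char *bytes)
{
	unsigned int n = vex_payload_bytes(bytes[0]);

	return n ? 1 + n : 0;
}
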
>>>
>>> Existing instructions which can now accept an EVEX prefix need not be
>>> further annotated in the op code map (x86-opcode-map.txt). In the case of
>>> new instructions, the op code map is updated accordingly.
>>>
>>> Also add associated Mask Instructions that are used to manipulate mask
>>> registers used in AVX-512 instructions.
>>>
>>> 'perf tools' instruction decoder is updated in a subsequent patch. And a
>>> representative set of instructions is added to the perf tools new
>>> instructions test in a subsequent patch.
>>>
>>> Signed-off-by: Adrian Hunter <adrian.hunter@...el.com>
>>> ---
>>>  arch/x86/include/asm/inat.h          |  17 ++-
>>>  arch/x86/include/asm/insn.h          |  12 +-
>>>  arch/x86/lib/insn.c                  |  18 ++-
>>>  arch/x86/lib/x86-opcode-map.txt      | 263 +++++++++++++++++++++++------------
>>>  arch/x86/tools/gen-insn-attr-x86.awk |  11 +-
>>>  5 files changed, 220 insertions(+), 101 deletions(-)
>>>
>>> +78: VMREAD Ey,Gy | vcvttps2udq/pd2udq Vx,Wpd (evo) | vcvttsd2usi Gv,Wx (F2),(ev) | vcvttss2usi Gv,Wx (F3),(ev) | vcvttps2uqq/pd2uqq Vx,Wx (66),(ev)
>>> +79: VMWRITE Gy,Ey | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(ev)
>>
>> This is all fine and dandy but those (ev*) flags cause the escape table to
>> have INAT_EVEXONLY as a flag:
>>
>> const insn_attr_t inat_escape_table_1_1[INAT_OPCODE_TABLE_SIZE] = {
>> ...
>>
>> 	[0x79] = INAT_MODRM | INAT_VEXOK | INAT_EVEXONLY,
>>
>> };
>>
>> except that that opcode is not EVEX-only. Intel's VMREAD and VMWRITE are *not*
>> EVEX insns, and AMD has EXTRQ and INSERTQ there, with prefixes 66 and F2
>> respectively; both are SSE4a and neither is EVEX.
>>
>> The VMREAD and VMWRITE decoding happens to work out by pure chance: they
>> take no prefix, so the prefix-id check in inat_get_escape_attribute() never
>> selects that escape table.
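
For reference, a rough sketch of that selection logic (not the exact inat.c
code, and with the table layout simplified): the 66/F3/F2 variant tables are
only consulted when a last prefix was actually seen and the no-prefix entry
has a variant, so prefix-less opcodes stay on the no-prefix table.

/*
 * Rough sketch only; insn_attr_t/insn_byte_t and inat_has_variant() come
 * from arch/x86/include/asm/inat.h and insn.h.  tables[0] is the no-prefix
 * table, tables[1..3] the 66/F3/F2 variants, lpfx_id indexes them (0 = none).
 */
static insn_attr_t lookup_escape_attr(const insn_attr_t * const tables[4],
				      insn_byte_t opcode, int lpfx_id)
{
	const insn_attr_t *table = tables[0];

	if (!table)
		return 0;
	/* Only prefixed opcodes can land on a 66/F3/F2 variant table. */
	if (lpfx_id && inat_has_variant(table[opcode]) && tables[lpfx_id])
		table = tables[lpfx_id];

	return table[opcode];
}

So a prefix-less 0f 78/0f 79, i.e. VMREAD/VMWRITE, only ever sees tables[0]
and keeps decoding, while anything that does carry a 66 prefix lands on the
table with INAT_EVEXONLY.
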
>>
>> So the first thing that comes to mind is excluding mixed-type opcodes like
>> 0x79 from that inat_must_vex() enforcement...?
>>
>> Masami, any other ideas?
>
> This hack seems to do the trick. We should probably take a look at all the
> insn tables and, if there are more opcodes like that, turn the mixed bool
> below into a proper flag:
>
>
> diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
> index 1d1c57c74d1f..e3216da11a7c 100644
> --- a/tools/arch/x86/lib/insn.c
> +++ b/tools/arch/x86/lib/insn.c
> @@ -276,6 +276,7 @@ int insn_get_prefixes(struct insn *insn)
>  int insn_get_opcode(struct insn *insn)
>  {
>  	struct insn_field *opcode = &insn->opcode;
> +	bool mixed = false;
>  	int pfx_id, ret;
>  	insn_byte_t op;
>
> @@ -348,13 +359,25 @@ int insn_get_opcode(struct insn *insn)
>  	while (inat_is_escape(insn->attr)) {
>  		/* Get escaped opcode */
>  		op = get_next(insn_byte_t, insn);
> +
>  		opcode->bytes[opcode->nbytes++] = op;
>  		pfx_id = insn_last_prefix_id(insn);
> +
> +		printf("%s: escaped op: 0x%x, pfx_id (insn table: none, 66, f3, f2): 0x%x, attr: 0x%x\n",
> +		       __func__, op, pfx_id, insn->attr);
> +
>  		insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr);
> +
> +		printf("got attr: 0x%x\n", insn->attr);
>  	}
>
> -	if (inat_must_vex(insn->attr)) {
> +	mixed = (opcode->bytes[0] == 0xf) && (opcode->bytes[1] == 0x79);
> +
> +	printf("%s: must_vex, mixed: %d\n", __func__, mixed);
> +
> +	if (inat_must_vex(insn->attr) && !mixed) {
>  		/* This instruction is bad */
> +		printf("%s: must_vex bad\n", __func__);
>  		insn->attr = 0;
>  		return -EINVAL;
>  	}
> diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt
> index 0139b864ceef..d059c8e63bfe 100644
> --- a/tools/arch/x86/lib/x86-opcode-map.txt
> +++ b/tools/arch/x86/lib/x86-opcode-map.txt
> @@ -474,7 +474,7 @@ AVXcode: 1
>  # Note: Remove (v), because vzeroall and vzeroupper becomes emms without VEX.
>  77: emms | vzeroupper | vzeroall
>  78: VMREAD Ey,Gy | vcvttps2udq/pd2udq Vx,Wpd (evo) | vcvttsd2usi Gv,Wx (F2),(ev) | vcvttss2usi Gv,Wx (F3),(ev) | vcvttps2uqq/pd2uqq Vx,Wx (66),(ev)
> -79: VMWRITE Gy,Ey | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(ev) | EXTRQ
> +79: VMWRITE Gy,Ey | EXTRQ Vo,Uo (66) | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(ev)

EXTRQ Vo,Uo (66) has a mandatory 66 prefix, like vcvtps2uqq/pd2uqq Vx,Wx
(66),(ev), so they end up in the same attribute table, but (ev) results in
INAT_EVEXONLY, which is unwanted.

Changing that from (ev) to (evo) is probably all that is needed, e.g.:

+79: VMWRITE Gy,Ey | EXTRQ Vo,Uo (66) | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(evo)
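
To illustrate the intended effect (hand-written, not actual
gen-insn-attr-x86.awk output, and the exact remaining flags may differ):
with (ev) the shared 66-prefix entry for 0x79 carries INAT_EVEXONLY, so
inat_must_vex() rejects a legacy-encoded EXTRQ; with (evo) the EVEX-only
restriction stays with the EVEX form of the opcode and the flag disappears
from that entry:

/* With (ev), as today: */
const insn_attr_t inat_escape_table_1_1[INAT_OPCODE_TABLE_SIZE] = {
	...
	[0x79] = INAT_MODRM | INAT_VEXOK | INAT_EVEXONLY,	/* EXTRQ rejected */
	...
};

/* With (evo): */
const insn_attr_t inat_escape_table_1_1[INAT_OPCODE_TABLE_SIZE] = {
	...
	[0x79] = INAT_MODRM | INAT_VEXOK,	/* EXTRQ Vo,Uo (66) decodes again */
	...
};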