[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3367da83-16b7-4c6a-bd08-d14ec4067025@oracle.com>
Date: Fri, 14 Nov 2025 10:56:48 +0100
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: alexandre.chartre@...cle.com, linux-kernel@...r.kernel.org,
mingo@...nel.org, peterz@...radead.org
Subject: Re: [PATCH v4 00/28] objtool: Function validation tracing
On 11/14/25 02:48, Josh Poimboeuf wrote:
> On Thu, Nov 13, 2025 at 05:48:49PM +0100, Alexandre Chartre wrote:
>> Changes:
>> ========
>>
>> V4:
>> ---
>> This version fixes a build issue when disassembly is not available. Compared
>> with V3, this is addresses by changes in patch 14 (objtool: Improve tracing
>> of alternative instructions). Other patches are similar to V3.
>
> For the next revision, please base on tip/master, as there are some
> major objtool changes pending for the next merge window.
Ok, I will rebase the next revision on tip/master.
> Most of my comments below are bikeshedding, they are not required for
> the next revision and can be addressed in followup patch sets if you'd
> rather do it that way.
If changes are simple, I will try to address them immediately otherwise
defer to next patches.
>> - Each alternative of a group alternative is displayed with its feature
>> name and flags: <flags><feature-name>
>>
>> <flags> is made of the following characters:
>>
>> '!' : ALT_FLAG_NOT
>> '+' : ALT_FLAG_DIRECT_CALL
>> '?' : unknown flag (i.e. any other flags)
>
> Other than '!', the meaning of the flags isn't intuitive. Maybe it
> should just show the source code names:
>
> ALT_NOT(X86_FEATURE_FOO)
>
> ALT_DIRECT_CALL(X86_FEATURE_BAR)
>
> ALT_UNKNOWN_FLAG(X86_FEATURE_BAZ)
>
I think '?' is meaningful too, but I wasn't sure about '+'.
I am using single characters to keep the alternative name short. It can already
be fairly long because of the feature name (like "X86_FEATURE_SPEC_STORE_BYPASS_DISABLE")
Also I am assuming that flags can be combined (although that's not currently
the case) so that would be more difficult with full ALT_* names and the
result would be much longer.
>> - If an alternative is a jump table then "JUMP" is used as the feature
>> name.
>
> Hm, it's a bit confusing to label a jump label as an "alternative" as
> those are two distinct things (though I'm aware that objtool conflates
> the two).
>
>> - If an alternative is an exception table then "EXCEPTION" is used as the
>> feature name.
>
> Ditto.
>
Yes, the wording is not good, I use it just because objtool handles jump
labels and exception tables as alternative. I will reword to something
better.
>> Disassembly can show default alternative jumping to .altinstr_aux
>> -----------------------------------------------------------------
>> Disassembly can show a default alternative jumping to .altinstr_aux. This
>> happens when the _static_cpu_has() function is used. Its default code
>> jumps to .altinstr_aux where a test sequence is executed (test; jnz; jmp).
>>
>> At runtime, this sequence is not used because the _static_cpu_has()
>> an alternative with the X86_FEATURE_ALWAYS feature.
>>
>>
>> debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> | NOP5 (X86_FEATURE_HYBRID_CPU) | jmpq 0x61a <perf_get_x86_pmu_capability+0x37> (X86_FEATURE_ALWAYS) # <alternative.debc>
>
> I'm finding this one-line format considerably more difficult to parse
> than the slightly longer two-line form:
>
> debc: perf_get_x86_pmu_capability+0xc <alternative.debc> | X86_FEATURE_HYBRID_CPU | X86_FEATURE_ALWAYS
> debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> | NOP5 | jmpq 0x61a <perf_get_x86_pmu_capability+0x37>
Another option could be:
debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> (<alternative.debc>) |
NOP5 (X86_FEATURE_HYBRID_CPU) |
jmpq 0x61a <perf_get_x86_pmu_capability+0x37> (X86_FEATURE_ALWAYS)
I think I will use this option when displaying alternative one after the other,
and your suggestion when displaying side-by-side, and add an option to select
the display.
>
> Also, I wonder if we can make NOP5 lowercase (nop5), since it really is
> just an instruction, not something special like a feature.
This indicates that this is a pseudo instruction, NOP5 is actually nopl 0x00(%eax,%eax,1).
Even NOP1 can be a simple nop but also xchg %rax,%rax.
>> Disassembly can show alternative jumping to the next instruction
>> ----------------------------------------------------------------
>>
>> The disassembly can show jump tables with an alternative which jumps
>> to the next instruction.
>>
>> For example:
>>
>> def9: perf_get_x86_pmu_capability+0x49 NOP2 | jmp defb <perf_get_x86_pmu_capability+0x4b> (JUMP) # <alternative.def9>
>
> I'm also struggling to read this one.
>
> Maybe this needs a two-line form as well:
>
> def9: perf_get_x86_pmu_capability+0x49 <static_branch.def9> |
> def9: perf_get_x86_pmu_capability+0x49 NOP2 | jmp defb <perf_get_x86_pmu_capability+0x4b>
I will do something similar as suggested above.
>> Example 2 (--disas option): Single Instruction Alternatives
>> -----------------------------------------------------------
>
> I would like to convert this to a dedicated "disas" subcommand which can
> be run like "objtool disas <func>" or so. But again that can probably
> be done in a followup.
Ok, I will look at it.
>> Example 3 (--disas option): Alternatives with multiple instructions
>> -------------------------------------------------------------------
>> Alternatives with multiple instructions are displayed side-by-side, with
>> an header describing the alternative. The code in the first column is the
>> default code of the alternative.
>>
>>
>> $ ./tools/objtool/objtool --disas=__switch_to_asm --link vmlinux.o
>> __switch_to_asm:
>> 82c0: __switch_to_asm+0x0 push %rbp
>> 82c1: __switch_to_asm+0x1 push %rbx
>> 82c2: __switch_to_asm+0x2 push %r12
>> 82c4: __switch_to_asm+0x4 push %r13
>> 82c6: __switch_to_asm+0x6 push %r14
>> 82c8: __switch_to_asm+0x8 push %r15
>> 82ca: __switch_to_asm+0xa mov %rsp,0x1670(%rdi)
>> 82d1: __switch_to_asm+0x11 mov 0x1670(%rsi),%rsp
>> 82d8: __switch_to_asm+0x18 mov 0xad8(%rsi),%rbx
>> 82df: __switch_to_asm+0x1f mov %rbx,%gs:0x0(%rip) # 0x82e7 <__stack_chk_guard>
>> 82e7: __switch_to_asm+0x27 | <alternative.82e7> | !X86_FEATURE_ALWAYS | X86_FEATURE_RSB_CTXSW
>
> Are the alternatives swapped? I believe this comes from the following
> code, so the !X86_FEATURE_ALWAYS column should be last?
>
> .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> __stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> __stringify(nop;nop;__FILL_ONE_RETURN), \ftr2
>
> .Lskip_rsb_\@:
> .endm
I will check but I process/print alternative in the order provided by
objtool (in struct alternative)
>> 82e7: __switch_to_asm+0x27 | jmp 0x8312 <__switch_to_asm+0x52> | NOP1 | mov $0x10,%r12
>> 82e8: __switch_to_asm+0x28 | | NOP1 |
>> 82e9: __switch_to_asm+0x29 | NOP1 | callq 0x82ef <__switch_to_asm+0x2f> |
>> 82ea: __switch_to_asm+0x2a | NOP1 | |
>> 82eb: __switch_to_asm+0x2b | NOP1 | |
>> 82ec: __switch_to_asm+0x2c | NOP1 | |
>> 82ed: __switch_to_asm+0x2d | NOP1 | |
>> 82ee: __switch_to_asm+0x2e | NOP1 | int3 | callq 0x82f4 <__switch_to_asm+0x34>
>> 82ef: __switch_to_asm+0x2f | NOP1 | add $0x8,%rsp |
>> 82f0: __switch_to_asm+0x30 | NOP1 | |
>> 82f1: __switch_to_asm+0x31 | NOP1 | |
>> 82f2: __switch_to_asm+0x32 | NOP1 | |
>> 82f3: __switch_to_asm+0x33 | NOP1 | lfence | int3
>> 82f4: __switch_to_asm+0x34 | NOP1 | | callq 0x82fa <__switch_to_asm+0x3a>
>> 82f5: __switch_to_asm+0x35 | NOP1 | |
>> 82f6: __switch_to_asm+0x36 | NOP1 | |
>> 82f7: __switch_to_asm+0x37 | NOP1 | |
>> 82f8: __switch_to_asm+0x38 | NOP1 | |
>> 82f9: __switch_to_asm+0x39 | NOP1 | | int3
>> 82fa: __switch_to_asm+0x3a | NOP1 | | add $0x10,%rsp
>> 82fb: __switch_to_asm+0x3b | NOP1 | |
>> 82fc: __switch_to_asm+0x3c | NOP1 | |
>> 82fd: __switch_to_asm+0x3d | NOP1 | |
>> 82fe: __switch_to_asm+0x3e | NOP1 | | dec %r12
>> 82ff: __switch_to_asm+0x3f | NOP1 | |
>> 8300: __switch_to_asm+0x40 | NOP1 | |
>> 8301: __switch_to_asm+0x41 | NOP1 | | jne 0x82ee <__switch_to_asm+0x2e>
>> 8302: __switch_to_asm+0x42 | NOP1 | |
>> 8303: __switch_to_asm+0x43 | NOP1 | | lfence
>> 8304: __switch_to_asm+0x44 | NOP1 | |
>> 8305: __switch_to_asm+0x45 | NOP1 | |
>> 8306: __switch_to_asm+0x46 | NOP1 | | movq $0xffffffffffffffff,%gs:0x0(%rip) # 0x20b <__x86_call_depth>
>> 8307: __switch_to_asm+0x47 | NOP1 | |
>> 8308: __switch_to_asm+0x48 | NOP1 | |
>> 8309: __switch_to_asm+0x49 | NOP1 | |
>> 830a: __switch_to_asm+0x4a | NOP1 | |
>> 830b: __switch_to_asm+0x4b | NOP1 | |
>> 830c: __switch_to_asm+0x4c | NOP1 | |
>> 830d: __switch_to_asm+0x4d | NOP1 | |
>> 830e: __switch_to_asm+0x4e | NOP1 | |
>> 830f: __switch_to_asm+0x4f | NOP1 | |
>> 8310: __switch_to_asm+0x50 | NOP1 | |
>> 8311: __switch_to_asm+0x51 | NOP1 | |
>
> I like this a lot, but I think it could be vertically compressed quite a
> bit, and superfluous NOPs removed:
>
> 82e7: __switch_to_asm+0x27 | <alternative.82e7> | !X86_FEATURE_ALWAYS | X86_FEATURE_RSB_CTXSW
> 82e7: __switch_to_asm+0x27 | jmp 0x8312 <__switch_to_asm+0x52> | nop1 | mov $0x10,%r12
> 82e8: __switch_to_asm+0x28 | | nop1 |
> 82e9: __switch_to_asm+0x29 | | callq 0x82ef <__switch_to_asm+0x2f> |
> 82ee: __switch_to_asm+0x2e | | int3 | callq 0x82f4 <__switch_to_asm+0x34>
> 82ef: __switch_to_asm+0x2f | | add $0x8,%rsp |
> 82f3: __switch_to_asm+0x33 | | lfence | int3
> 82f4: __switch_to_asm+0x34 | | | callq 0x82fa <__switch_to_asm+0x3a>
> 82f9: __switch_to_asm+0x39 | | | int3
> 82fa: __switch_to_asm+0x3a | | | add $0x10,%rsp
> 82fe: __switch_to_asm+0x3e | | | dec %r12
> 8301: __switch_to_asm+0x41 | | | jne 0x82ee <__switch_to_asm+0x2e>
> 8303: __switch_to_asm+0x43 | | | lfence
> 8306: __switch_to_asm+0x46 | | | movq $0xffffffffffffffff,%gs:0x0(%rip) # 0x20b <__x86_call_depth>
>
> That reads much nicer to me.
>
Yeah, better. I can easily do that by getting rid of trailing NOPs.
Thanks,
alex.
Powered by blists - more mailing lists