lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3367da83-16b7-4c6a-bd08-d14ec4067025@oracle.com>
Date: Fri, 14 Nov 2025 10:56:48 +0100
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: alexandre.chartre@...cle.com, linux-kernel@...r.kernel.org,
        mingo@...nel.org, peterz@...radead.org
Subject: Re: [PATCH v4 00/28] objtool: Function validation tracing


On 11/14/25 02:48, Josh Poimboeuf wrote:
> On Thu, Nov 13, 2025 at 05:48:49PM +0100, Alexandre Chartre wrote:
>> Changes:
>> ========
>>
>> V4:
>> ---
>> This version fixes a build issue when disassembly is not available. Compared
>> with V3, this is addresses by changes in patch 14 (objtool: Improve tracing
>> of alternative instructions). Other patches are similar to V3.
> 
> For the next revision, please base on tip/master, as there are some
> major objtool changes pending for the next merge window.

Ok, I will rebase the next revision on tip/master.


> Most of my comments below are bikeshedding, they are not required for
> the next revision and can be addressed in followup patch sets if you'd
> rather do it that way.

If changes are simple, I will try to address them immediately otherwise
defer to next patches.


>> - Each alternative of a group alternative is displayed with its feature
>>    name and flags: <flags><feature-name>
>>
>>    <flags> is made of the following characters:
>>
>>      '!' : ALT_FLAG_NOT
>>      '+' : ALT_FLAG_DIRECT_CALL
>>      '?' : unknown flag (i.e. any other flags)
> 
> Other than '!', the meaning of the flags isn't intuitive.  Maybe it
> should just show the source code names:
> 
>    ALT_NOT(X86_FEATURE_FOO)
> 
>    ALT_DIRECT_CALL(X86_FEATURE_BAR)
> 
>    ALT_UNKNOWN_FLAG(X86_FEATURE_BAZ)
> 

I think '?' is meaningful too, but I wasn't sure about '+'.

I am using single characters to keep the alternative name short. It can already
be fairly long because of the feature name (like "X86_FEATURE_SPEC_STORE_BYPASS_DISABLE")

Also I am assuming that flags can be combined (although that's not currently
the case) so that would be more difficult with full ALT_* names and the
result would be much longer.


>> - If an alternative is a jump table then "JUMP" is used as the feature
>>    name.
> 
> Hm, it's a bit confusing to label a jump label as an "alternative" as
> those are two distinct things (though I'm aware that objtool conflates
> the two).
> 
>> - If an alternative is an exception table then "EXCEPTION" is used as the
>>    feature name.
> 
> Ditto.
> 

Yes, the wording is not good, I use it just because objtool handles jump
labels and exception tables as alternative. I will reword to something
better.


>> Disassembly can show default alternative jumping to .altinstr_aux
>> -----------------------------------------------------------------
>> Disassembly can show a default alternative jumping to .altinstr_aux. This
>> happens when the _static_cpu_has() function is used. Its default code
>> jumps to .altinstr_aux where a test sequence is executed (test; jnz; jmp).
>>
>> At runtime, this sequence is not used because the _static_cpu_has()
>> an alternative with the X86_FEATURE_ALWAYS feature.
>>
>>
>>    debc:  perf_get_x86_pmu_capability+0xc      jmpq   0xdec1 <.altinstr_aux+0xfc> | NOP5  (X86_FEATURE_HYBRID_CPU) | jmpq   0x61a <perf_get_x86_pmu_capability+0x37>  (X86_FEATURE_ALWAYS)   # <alternative.debc>
> 
> I'm finding this one-line format considerably more difficult to parse
> than the slightly longer two-line form:
> 
>      debc:  perf_get_x86_pmu_capability+0xc      <alternative.debc>		   | X86_FEATURE_HYBRID_CPU | X86_FEATURE_ALWAYS
>      debc:  perf_get_x86_pmu_capability+0xc      jmpq   0xdec1 <.altinstr_aux+0xfc> | NOP5		    | jmpq   0x61a <perf_get_x86_pmu_capability+0x37>


Another option could be:

     debc:  perf_get_x86_pmu_capability+0xc      jmpq   0xdec1 <.altinstr_aux+0xfc> (<alternative.debc>) |
                                                 NOP5  (X86_FEATURE_HYBRID_CPU) |
                                                 jmpq   0x61a <perf_get_x86_pmu_capability+0x37>  (X86_FEATURE_ALWAYS)

I think I will use this option when displaying alternative one after the other,
and your suggestion when displaying side-by-side, and add an option to select
the display.

> 
> Also, I wonder if we can make NOP5 lowercase (nop5), since it really is
> just an instruction, not something special like a feature.

This indicates that this is a pseudo instruction, NOP5 is actually nopl 0x00(%eax,%eax,1).
Even NOP1 can be a simple nop but also xchg %rax,%rax.


>> Disassembly can show alternative jumping to the next instruction
>> ----------------------------------------------------------------
>>
>> The disassembly can show jump tables with an alternative which jumps
>> to the next instruction.
>>
>> For example:
>>
>> def9:  perf_get_x86_pmu_capability+0x49    NOP2 | jmp    defb <perf_get_x86_pmu_capability+0x4b>  (JUMP)   # <alternative.def9>
> 
> I'm also struggling to read this one.
> 
> Maybe this needs a two-line form as well:
> 
>    def9:  perf_get_x86_pmu_capability+0x49    <static_branch.def9> |
>    def9:  perf_get_x86_pmu_capability+0x49    NOP2		  | jmp    defb <perf_get_x86_pmu_capability+0x4b>
  
I will do something similar as suggested above.


>> Example 2 (--disas option): Single Instruction Alternatives
>> -----------------------------------------------------------
> 
> I would like to convert this to a dedicated "disas" subcommand which can
> be run like "objtool disas <func>" or so.  But again that can probably
> be done in a followup.

Ok, I will look at it.


>> Example 3 (--disas option): Alternatives with multiple instructions
>> -------------------------------------------------------------------
>> Alternatives with multiple instructions are displayed side-by-side, with
>> an header describing the alternative. The code in the first column is the
>> default code of the alternative.
>>
>>
>> $ ./tools/objtool/objtool --disas=__switch_to_asm --link vmlinux.o
>> __switch_to_asm:
>>    82c0:  __switch_to_asm+0x0      push   %rbp
>>    82c1:  __switch_to_asm+0x1	  push   %rbx
>>    82c2:  __switch_to_asm+0x2	  push   %r12
>>    82c4:  __switch_to_asm+0x4	  push   %r13
>>    82c6:  __switch_to_asm+0x6	  push   %r14
>>    82c8:  __switch_to_asm+0x8	  push   %r15
>>    82ca:  __switch_to_asm+0xa	  mov    %rsp,0x1670(%rdi)
>>    82d1:  __switch_to_asm+0x11	  mov    0x1670(%rsi),%rsp
>>    82d8:  __switch_to_asm+0x18	  mov    0xad8(%rsi),%rbx
>>    82df:  __switch_to_asm+0x1f	  mov    %rbx,%gs:0x0(%rip)        # 0x82e7 <__stack_chk_guard>
>>    82e7:  __switch_to_asm+0x27	  | <alternative.82e7>                   | !X86_FEATURE_ALWAYS                  | X86_FEATURE_RSB_CTXSW
> 
> Are the alternatives swapped?  I believe this comes from the following
> code, so the !X86_FEATURE_ALWAYS column should be last?
> 
> .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
> 	ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
> 		__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
> 		__stringify(nop;nop;__FILL_ONE_RETURN), \ftr2
> 
> .Lskip_rsb_\@:
> .endm

I will check but I process/print alternative in the order provided by
objtool (in struct alternative)


>>    82e7:  __switch_to_asm+0x27	  | jmp    0x8312 <__switch_to_asm+0x52> | NOP1                                 | mov    $0x10,%r12
>>    82e8:  __switch_to_asm+0x28	  |                                      | NOP1                                 |
>>    82e9:  __switch_to_asm+0x29	  | NOP1                                 | callq  0x82ef <__switch_to_asm+0x2f> |
>>    82ea:  __switch_to_asm+0x2a	  | NOP1                                 |                                      |
>>    82eb:  __switch_to_asm+0x2b	  | NOP1                                 |                                      |
>>    82ec:  __switch_to_asm+0x2c	  | NOP1                                 |                                      |
>>    82ed:  __switch_to_asm+0x2d	  | NOP1                                 |                                      |
>>    82ee:  __switch_to_asm+0x2e	  | NOP1                                 | int3                                 | callq  0x82f4 <__switch_to_asm+0x34>
>>    82ef:  __switch_to_asm+0x2f	  | NOP1                                 | add    $0x8,%rsp                     |
>>    82f0:  __switch_to_asm+0x30	  | NOP1                                 |                                      |
>>    82f1:  __switch_to_asm+0x31	  | NOP1                                 |                                      |
>>    82f2:  __switch_to_asm+0x32	  | NOP1                                 |                                      |
>>    82f3:  __switch_to_asm+0x33	  | NOP1                                 | lfence                               | int3
>>    82f4:  __switch_to_asm+0x34	  | NOP1                                 |                                      | callq  0x82fa <__switch_to_asm+0x3a>
>>    82f5:  __switch_to_asm+0x35	  | NOP1                                 |                                      |
>>    82f6:  __switch_to_asm+0x36	  | NOP1                                 |                                      |
>>    82f7:  __switch_to_asm+0x37	  | NOP1                                 |                                      |
>>    82f8:  __switch_to_asm+0x38	  | NOP1                                 |                                      |
>>    82f9:  __switch_to_asm+0x39	  | NOP1                                 |                                      | int3
>>    82fa:  __switch_to_asm+0x3a	  | NOP1                                 |                                      | add    $0x10,%rsp
>>    82fb:  __switch_to_asm+0x3b	  | NOP1                                 |                                      |
>>    82fc:  __switch_to_asm+0x3c	  | NOP1                                 |                                      |
>>    82fd:  __switch_to_asm+0x3d	  | NOP1                                 |                                      |
>>    82fe:  __switch_to_asm+0x3e	  | NOP1                                 |                                      | dec    %r12
>>    82ff:  __switch_to_asm+0x3f	  | NOP1                                 |                                      |
>>    8300:  __switch_to_asm+0x40	  | NOP1                                 |                                      |
>>    8301:  __switch_to_asm+0x41	  | NOP1                                 |                                      | jne    0x82ee <__switch_to_asm+0x2e>
>>    8302:  __switch_to_asm+0x42	  | NOP1                                 |                                      |
>>    8303:  __switch_to_asm+0x43	  | NOP1                                 |                                      | lfence
>>    8304:  __switch_to_asm+0x44	  | NOP1                                 |                                      |
>>    8305:  __switch_to_asm+0x45	  | NOP1                                 |                                      |
>>    8306:  __switch_to_asm+0x46	  | NOP1                                 |                                      | movq   $0xffffffffffffffff,%gs:0x0(%rip)        # 0x20b <__x86_call_depth>
>>    8307:  __switch_to_asm+0x47	  | NOP1                                 |                                      |
>>    8308:  __switch_to_asm+0x48	  | NOP1                                 |                                      |
>>    8309:  __switch_to_asm+0x49	  | NOP1                                 |                                      |
>>    830a:  __switch_to_asm+0x4a	  | NOP1                                 |                                      |
>>    830b:  __switch_to_asm+0x4b	  | NOP1                                 |                                      |
>>    830c:  __switch_to_asm+0x4c	  | NOP1                                 |                                      |
>>    830d:  __switch_to_asm+0x4d	  | NOP1                                 |                                      |
>>    830e:  __switch_to_asm+0x4e	  | NOP1                                 |                                      |
>>    830f:  __switch_to_asm+0x4f	  | NOP1                                 |                                      |
>>    8310:  __switch_to_asm+0x50	  | NOP1                                 |                                      |
>>    8311:  __switch_to_asm+0x51	  | NOP1                                 |                                      |
> 
> I like this a lot, but I think it could be vertically compressed quite a
> bit, and superfluous NOPs removed:
> 
>      82e7:  __switch_to_asm+0x27	  | <alternative.82e7>                   | !X86_FEATURE_ALWAYS                  | X86_FEATURE_RSB_CTXSW
>      82e7:  __switch_to_asm+0x27	  | jmp    0x8312 <__switch_to_asm+0x52> | nop1                                 | mov    $0x10,%r12
>      82e8:  __switch_to_asm+0x28	  |                                      | nop1                                 |
>      82e9:  __switch_to_asm+0x29	  |                                      | callq  0x82ef <__switch_to_asm+0x2f> |
>      82ee:  __switch_to_asm+0x2e	  |                                      | int3                                 | callq  0x82f4 <__switch_to_asm+0x34>
>      82ef:  __switch_to_asm+0x2f	  |                                      | add    $0x8,%rsp                     |
>      82f3:  __switch_to_asm+0x33	  |                                      | lfence                               | int3
>      82f4:  __switch_to_asm+0x34	  |                                      |                                      | callq  0x82fa <__switch_to_asm+0x3a>
>      82f9:  __switch_to_asm+0x39	  |                                      |                                      | int3
>      82fa:  __switch_to_asm+0x3a	  |                                      |                                      | add    $0x10,%rsp
>      82fe:  __switch_to_asm+0x3e	  |                                      |                                      | dec    %r12
>      8301:  __switch_to_asm+0x41	  |                                      |                                      | jne    0x82ee <__switch_to_asm+0x2e>
>      8303:  __switch_to_asm+0x43	  |                                      |                                      | lfence
>      8306:  __switch_to_asm+0x46	  |                                      |                                      | movq   $0xffffffffffffffff,%gs:0x0(%rip)        # 0x20b <__x86_call_depth>
> 
> That reads much nicer to me.
> 

Yeah, better. I can easily do that by getting rid of trailing NOPs.

Thanks,

alex.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ