Message-ID: <7017ead1-677f-4318-b12a-4c0269358065@oracle.com>
Date: Wed, 24 Sep 2025 11:37:37 +0200
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: alexandre.chartre@...cle.com, linux-kernel@...r.kernel.org,
mingo@...nel.org, jpoimboe@...nel.org
Subject: Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
On 9/24/25 09:36, Peter Zijlstra wrote:
>
> Sorry, but I hadn't seen these patches until Josh replied to them just
> now :/
>
> In general I really like the direction this is taking; I agree with Josh
> on the +0 thing, in general foo and foo+0 are not the same. Where foo
> can be the entire function, foo+0 is always the first instruction.
Ok, I will use this rule.
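For the trace output that means always printing an explicit offset, so the first
instruction shows up as "os_xsave+0x0" rather than a bare "os_xsave" (which could
just as well mean the whole function). A minimal sketch of such a formatting
helper (illustrative only, not the actual patch code; the name is made up):

  #include <stdio.h>

  /*
   * Illustrative only: always emit "sym+0xoff", even for offset 0, so a
   * reference to an instruction is never ambiguous with the function symbol.
   */
  static void print_insn_location(const char *sym, unsigned long offset)
  {
          printf("%s+0x%lx", sym, offset);
  }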
> Furthermore I have some nits about the output, but I don't think we
> should hold up merging these patches for that; it could just be me
> needing to rewire my brain or something ;-)
>
>> Example 1 (--trace option): Trace the validation of the os_save() function
>> --------------------------------------------------------------------------
>>
>> $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr --hacks=skylake --ibt --orc --retpoline --rethunk --sls --static-call --uaccess --prefix=16 --link --trace os_xsave -v vmlinux.o
>> os_xsave: validation begin
>> 65c20: os_xsave+0x0 push %r12 - state: cfa=rsp+16 r12=(cfa-16) stack_size=16
>> 65c22: os_xsave+0x2 mov 0x0(%rip),%eax # alternatives_patched
>> 65c28: os_xsave+0x8 push %rbp - state: cfa=rsp+24 rbp=(cfa-24) stack_size=24
>> 65c29: os_xsave+0x9 mov %rdi,%rbp
>> 65c2c: os_xsave+0xc push %rbx - state: cfa=rsp+32 rbx=(cfa-32) stack_size=32
>> 65c2d: os_xsave+0xd mov 0x8(%rdi),%rbx
>> 65c31: os_xsave+0x11 mov %rbx,%r12
>> 65c34: os_xsave+0x14 shr $0x20,%r12
>> 65c38: os_xsave+0x18 test %eax,%eax
>> 65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> - jump taken
>> 65c6a: os_xsave+0x4a | ud2
>> 65c6c: os_xsave+0x4c | jmp 65c3c <os_xsave+0x1c> - unconditional jump
>> 65c3c: os_xsave+0x1c | xor %edx,%edx
>> 65c3e: os_xsave+0x1e | mov %rbx,%rsi
>> 65c41: os_xsave+0x21 | mov %rbp,%rdi
>> 65c44: os_xsave+0x24 | callq xfd_validate_state - call
>> 65c49: os_xsave+0x29 | mov %ebx,%eax
>> 65c4b: os_xsave+0x2b | mov %r12d,%edx
>> 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 1/4 begin
>> 65c55: os_xsave+0x35 | | test %ebx,%ebx
>> 65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> - jump taken
>> 65c6e: os_xsave+0x4e | | | ud2
>> 65c70: os_xsave+0x50 | | | pop %rbx - state: cfa=rsp+24 rbx=<undef> stack_size=24
>> 65c71: os_xsave+0x51 | | | pop %rbp - state: cfa=rsp+16 rbp=<undef> stack_size=16
>> 65c72: os_xsave+0x52 | | | pop %r12 - state: cfa=rsp+8 r12=<undef> stack_size=8
>> 65c74: os_xsave+0x54 | | | xor %eax,%eax
>> 65c76: os_xsave+0x56 | | | xor %edx,%edx
>> 65c78: os_xsave+0x58 | | | xor %esi,%esi
>> 65c7a: os_xsave+0x5a | | | xor %edi,%edi
>> 65c7c: os_xsave+0x5c | | | jmpq __x86_return_thunk - return
>> 65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> - jump not taken
>> 65c59: os_xsave+0x39 | | pop %rbx - state: cfa=rsp+24 rbx=<undef> stack_size=24
>> 65c5a: os_xsave+0x3a | | pop %rbp - state: cfa=rsp+16 rbp=<undef> stack_size=16
>> 65c5b: os_xsave+0x3b | | pop %r12 - state: cfa=rsp+8 r12=<undef> stack_size=8
>> 65c5d: os_xsave+0x3d | | xor %eax,%eax
>> 65c5f: os_xsave+0x3f | | xor %edx,%edx
>> 65c61: os_xsave+0x41 | | xor %esi,%esi
>> 65c63: os_xsave+0x43 | | xor %edi,%edi
>> 65c65: os_xsave+0x45 | | jmpq __x86_return_thunk - return
>
>> | <alternative.65c4e> alt 1/4 end
>> 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
>> 1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp)
>> 65c53: os_xsave+0x33 | | xor %ebx,%ebx
>> 65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
>> | <alternative.65c4e> alt 2/4 end
>> 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
>> 1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp)
>> 65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>> | <alternative.65c4e> alt 3/4 end
>> 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
>> 1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp)
>> 65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>> | <alternative.65c4e> alt 4/4 end
>> 65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
>> 65c4e: os_xsave+0x2e | xsave64 0x40(%rbp)
>> 65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
>
> I find it *very* hard to read these alternatives. If at all possible, I
> think something like:
>
> 65c4e: os_xsave+0x2e | xsave64 | xsaveopt64 | xsavec64 | xsaves64
> 65c53: os_xsave+0x33 | xor %ebx,%ebx
>
> Would be *much* easier to follow
>
Ok, I can try. It will certainly work well when the alternative has a very small
number of instructions (in particular single-instruction alternatives). I am not
sure it will display nicely with more instructions, especially when the
instructions are not aligned on the same addresses.
But it's worth trying, and maybe adding options for different formatting.
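For instance, something along these lines could merge the columns (a rough sketch
with made-up names, not objtool code; it assumes the instructions of the default
and of every alternative have already been grouped by original offset):

  #include <stdio.h>

  /*
   * Rough sketch: print the default instruction and the corresponding
   * instruction from each alternative on a single line, separated by " | ".
   * This only lines up nicely when the alternatives have matching offsets.
   */
  static void print_merged_alt_line(const char *dflt, const char **alts, int nr_alts)
  {
          int i;

          printf("%s", dflt);
          for (i = 0; i < nr_alts; i++)
                  printf(" | %s", alts[i]);
          printf("\n");
  }

For the xsave alternative above, that would produce the single
"xsave64 | xsaveopt64 | xsavec64 | xsaves64" line Peter shows.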
>> 65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> - jump not taken
>> 65c3c: os_xsave+0x1c xor %edx,%edx - already visited
>> os_xsave: validation end
>>
>>
>> Example 2 (--disas option): Disassemble perf_get_x86_pmu_capability()
>> ---------------------------------------------------------------------
>>
>> $ ./tools/objtool/objtool --disas=perf_get_x86_pmu_capability --link vmlinux.o
>> perf_get_x86_pmu_capability:
>> d000: perf_get_x86_pmu_capability endbr64
>> d004: perf_get_x86_pmu_capability+0x4 callq __fentry__
>> d009: perf_get_x86_pmu_capability+0x9 mov %rdi,%rdx
>> <alternative.d00c> default - begin
>> d00c: perf_get_x86_pmu_capability+0xc | jmpq .altinstr_aux+0x90
>
> (you probably need to relocate the target -- we never jump into altinstr)
>
Right. I probably don't resolve relocs in alternatives; I will check that.
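The idea would be to look at the relocation attached to the replacement
instruction and print the relocated symbol instead of the raw .altinstr_* offset.
A stand-in sketch with simplified types (not objtool's real structs or helpers):

  #include <stdio.h>

  /* Simplified stand-ins for the ELF symbol/relocation data. */
  struct sym { const char *name; };
  struct rel { struct sym *sym; unsigned long addend; };

  /*
   * If the jump has a relocation against its real target, print that
   * symbol; otherwise fall back to the raw replacement-section offset.
   */
  static void print_jump_target(const struct rel *reloc, unsigned long raw_off)
  {
          if (reloc && reloc->sym)
                  printf("jmpq %s+0x%lx\n", reloc->sym->name, reloc->addend);
          else
                  printf("jmpq .altinstr_replacement+0x%lx\n", raw_off);
  }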
>> <alternative.d00c> default - end
>> <alternative.d00c> 1/2 - begin
>> | <fake nop> (5 bytes)
>> <alternative.d00c> 1/2 end
>> <alternative.d00c> 2/2 - begin
>> 5e5: .altinstr_replacement+0x5e5 | jmpq perf_get_x86_pmu_capability+0x3f
>> <alternative.d00c> 2/2 end
>
> Idem; the above is *really* hard to decipher.
>
> d00c: perf_get_x86_pmu_capability+0xc | jmpq .altinstr_aux+0x90 | nop5 | jmpq perf_get_x86_pmu_capability+0x3f
>
>> d011: perf_get_x86_pmu_capability+0x11 ud2
>> d013: perf_get_x86_pmu_capability+0x13 movq $0x0,(%rdx)
>> d01a: perf_get_x86_pmu_capability+0x1a movq $0x0,0x8(%rdx)
>> d022: perf_get_x86_pmu_capability+0x22 movq $0x0,0x10(%rdx)
>> d02a: perf_get_x86_pmu_capability+0x2a movq $0x0,0x18(%rdx)
>> d032: perf_get_x86_pmu_capability+0x32 xor %eax,%eax
>> d034: perf_get_x86_pmu_capability+0x34 xor %edx,%edx
>> d036: perf_get_x86_pmu_capability+0x36 xor %ecx,%ecx
>> d038: perf_get_x86_pmu_capability+0x38 xor %edi,%edi
>> d03a: perf_get_x86_pmu_capability+0x3a jmpq __x86_return_thunk
>> d03f: perf_get_x86_pmu_capability+0x3f cmpq $0x0,0x0(%rip) # x86_pmu+0x10
>> d047: perf_get_x86_pmu_capability+0x47 je d013 <perf_get_x86_pmu_capability+0x13>
>> d049: perf_get_x86_pmu_capability+0x49 mov 0x0(%rip),%eax # x86_pmu+0x8
>> d04f: perf_get_x86_pmu_capability+0x4f mov %eax,(%rdi)
>> <jump alternative.d051> default
>> d051: perf_get_x86_pmu_capability+0x51 | xchg %ax,%ax
>> <jump alternative.d051> else
>> d051: perf_get_x86_pmu_capability+0x51 | jmp d053 <perf_get_x86_pmu_capability+0x53>
>> <jump alternative.d051> end
>
> this is a jump_label; if we would retain the whole 'key' reloc, and
> not only the key_addend, you could make it something like:
>
> d051: perf_get_x86_pmu_capability+0x51 [ jmp.d8 d053 <perf_get_x86_pmu_capability+0x53> ] * perf_is_hybrid
>
> (also, this here reads like it is either nop2 or jmp.d8 +0, which is
> 'weird')
>
I will try to improve this.
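One possible direction, sketched with simplified stand-in types (key_addend is the
only real name here, taken from your mail; the rest is made up): keep the whole
key reloc so the trace can name the static key, e.g. "* perf_is_hybrid", on the
jump_label line.

  #include <stdio.h>

  /* Simplified stand-ins; not objtool's struct special_alt. */
  struct sym { const char *name; };
  struct reloc { struct sym *sym; long addend; };

  struct jump_label_alt {
          struct reloc *key_reloc;   /* whole reloc instead of only key_addend */
  };

  /* Annotate the jump_label line with the static key it tests. */
  static void print_jump_label_key(const struct jump_label_alt *alt)
  {
          if (alt->key_reloc && alt->key_reloc->sym)
                  printf(" * %s", alt->key_reloc->sym->name);
  }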
Thanks for the feedback.
alex.
>
>> d053: perf_get_x86_pmu_capability+0x53 mov 0x0(%rip),%rdi # x86_pmu+0xd8
>> <alternative.d05a> default - begin
>> d05a: perf_get_x86_pmu_capability+0x5a | callq __sw_hweight64
>> <alternative.d05a> default - end
>> <alternative.d05a> 1/1 - begin
>> 5ea: .altinstr_replacement+0x5ea | popcnt %rdi,%rax
>> <alternative.d05a> 1/1 end
>
> d05a: perf_get_x86_pmu_capability+0x5a | callq __sw_hweight64 | popcnt %rdi,%rax
>
>
>> d05f: perf_get_x86_pmu_capability+0x5f mov %eax,0x4(%rdx)
>
>> <jump alternative.d062> default
>> d062: perf_get_x86_pmu_capability+0x62 | xchg %ax,%ax
>> <jump alternative.d062> else
>> d062: perf_get_x86_pmu_capability+0x62 | jmp d064 <perf_get_x86_pmu_capability+0x64>
>> <jump alternative.d062> end
>
> Same jump_label again, with the same problem, the target seems 'wrong'
>
>> d064: perf_get_x86_pmu_capability+0x64 mov 0x0(%rip),%rdi # x86_pmu+0xe0
>> <alternative.d06b> default - begin
>> d06b: perf_get_x86_pmu_capability+0x6b | callq __sw_hweight64
>> <alternative.d06b> default - end
>> <alternative.d06b> 1/1 - begin
>> 5ef: .altinstr_replacement+0x5ef | popcnt %rdi,%rax
>> <alternative.d06b> 1/1 end
>> d070: perf_get_x86_pmu_capability+0x70 mov %eax,0x8(%rdx)
>> d073: perf_get_x86_pmu_capability+0x73 mov 0x0(%rip),%eax # x86_pmu+0xf8
>> d079: perf_get_x86_pmu_capability+0x79 mov %eax,0xc(%rdx)
>> d07c: perf_get_x86_pmu_capability+0x7c mov %eax,0x10(%rdx)
>> d07f: perf_get_x86_pmu_capability+0x7f mov 0x0(%rip),%rax # x86_pmu+0x108
>> d086: perf_get_x86_pmu_capability+0x86 mov %eax,0x14(%rdx)
>> d089: perf_get_x86_pmu_capability+0x89 mov 0x0(%rip),%eax # x86_pmu+0x110
>> d08f: perf_get_x86_pmu_capability+0x8f mov %eax,0x18(%rdx)
>> d092: perf_get_x86_pmu_capability+0x92 movzbl 0x0(%rip),%ecx # x86_pmu+0x1d1
>> d099: perf_get_x86_pmu_capability+0x99 movzbl 0x1c(%rdx),%eax
>> d09d: perf_get_x86_pmu_capability+0x9d shr %cl
>> d09f: perf_get_x86_pmu_capability+0x9f and $0x1,%ecx
>> d0a2: perf_get_x86_pmu_capability+0xa2 and $0xfffffffe,%eax
>> d0a5: perf_get_x86_pmu_capability+0xa5 or %ecx,%eax
>> d0a7: perf_get_x86_pmu_capability+0xa7 mov %al,0x1c(%rdx)
>> d0aa: perf_get_x86_pmu_capability+0xaa xor %eax,%eax
>> d0ac: perf_get_x86_pmu_capability+0xac xor %edx,%edx
>> d0ae: perf_get_x86_pmu_capability+0xae xor %ecx,%ecx
>> d0b0: perf_get_x86_pmu_capability+0xb0 xor %edi,%edi
>> d0b2: perf_get_x86_pmu_capability+0xb2 jmpq __x86_return_thunk