linux-kernel - Re: [PATCH v4.1 06/10] x86/entry/vdso32: remove open-coded DWARF in sigreturn.S

Message-ID: <B4C09939-9216-457E-8C93-052521FFE96F@zytor.com>
Date: Mon, 02 Feb 2026 19:57:42 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Jens Remus <jremus@...ux.ibm.com>
CC: "Jason A. Donenfeld" <Jason@...c4.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "Theodore Ts'o" <tytso@....edu>,
        Thomas Weißschuh <thomas.weissschuh@...utronix.de>,
        Xin Li <xin@...or.com>, Andrew Cooper <andrew.cooper3@...rix.com>,
        Andy Lutomirski <luto@...nel.org>, Ard Biesheuvel <ardb@...nel.org>,
        Borislav Petkov <bp@...en8.de>, Brian Gerst <brgerst@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>, James Morse <james.morse@....com>,
        Jarkko Sakkinen <jarkko@...nel.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>, Kees Cook <kees@...nel.org>,
        Nam Cao <namcao@...utronix.de>, Oleg Nesterov <oleg@...hat.com>,
        Perry Yuan <perry.yuan@....com>, Thomas Gleixner <tglx@...utronix.de>,
        Thomas Huth <thuth@...hat.com>, Uros Bizjak <ubizjak@...il.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-sgx@...r.kernel.org, x86@...nel.org,
        Indu Bhagat <indu.bhagat@...cle.com>,
        Claudiu Zissulescu-Ianculescu <claudiu.zissulescu-ianculescu@...cle.com>,
        Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>
Subject: Re: [PATCH v4.1 06/10] x86/entry/vdso32: remove open-coded DWARF in sigreturn.S

On February 2, 2026 9:02:48 AM PST, Jens Remus <jremus@...ux.ibm.com> wrote:
>Hello Peter!
>
>On 1/6/2026 10:18 PM, H. Peter Anvin wrote:
>> The vdso32 sigreturn.S contains open-coded DWARF bytecode, which
>> includes a hack for gdb to not try to step back to a previous call
>> instruction when backtracing from a signal handler.
>> 
>> Neither of those is necessary anymore: the backtracing issue is
>> handled by ".cfi_entry simple" and ".cfi_signal_frame", both of which
>> have been supported for a very long time now, which allows the
>> remaining frame to be built using regular .cfi annotations.
>
>Hopefully Glibc developers will do something similar for x86-64
>__restore_rt() in Glibc sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c.
>
>> 
>> Add a few more register offsets to the signal frame just for good
>> measure.
>> 
>> Replace the nop on fallthrough of the system call (which should never,
>> ever happen) with a ud2a trap.
>> diff --git a/arch/x86/entry/vdso/vdso32/sigreturn.S b/arch/x86/entry/vdso/vdso32/sigreturn.S
>
>>  	.text
>>  	.globl __kernel_sigreturn
>>  	.type __kernel_sigreturn,@function
>> -	nop /* this guy is needed for .LSTARTFDEDLSI1 below (watch for HACK) */
>>  	ALIGN
>>  __kernel_sigreturn:
>> -.LSTART_sigreturn:
>> -	popl %eax		/* XXX does this mean it needs unwind info? */
>> +	STARTPROC_SIGNAL_FRAME IA32_SIGFRAME_sigcontext
>> +	popl %eax
>> +	CFI_ADJUST_CFA_OFFSET -4
>>  	movl $__NR_sigreturn, %eax
>>  	int $0x80
>> -.LEND_sigreturn:
>
>...
>
>>  	.globl __kernel_rt_sigreturn
>>  	.type __kernel_rt_sigreturn,@function
>>  	ALIGN
>>  __kernel_rt_sigreturn:
>> -.LSTART_rt_sigreturn:
>> +	STARTPROC_SIGNAL_FRAME IA32_RT_SIGFRAME_sigcontext
>>  	movl $__NR_rt_sigreturn, %eax
>>  	int $0x80
>> -.LEND_rt_sigreturn:
>
>...
>
>> -	.section .eh_frame,"a",@progbits
>> -.LSTARTFRAMEDLSI1:
>> -	.long .LENDCIEDLSI1-.LSTARTCIEDLSI1
>> -.LSTARTCIEDLSI1:
>> -	.long 0			/* CIE ID */
>> -	.byte 1			/* Version number */
>> -	.string "zRS"		/* NUL-terminated augmentation string */
>
>Note that the "S" in "zRS" is the signal frame indication.
>
>> -	.uleb128 1		/* Code alignment factor */
>> -	.sleb128 -4		/* Data alignment factor */
>> -	.byte 8			/* Return address register column */
>> -	.uleb128 1		/* Augmentation value length */
>> -	.byte 0x1b		/* DW_EH_PE_pcrel|DW_EH_PE_sdata4. */
>> -	.byte 0			/* DW_CFA_nop */
>> -	.align 4
>> -.LENDCIEDLSI1:
>> -	.long .LENDFDEDLSI1-.LSTARTFDEDLSI1 /* Length FDE */
>> -.LSTARTFDEDLSI1:
>> -	.long .LSTARTFDEDLSI1-.LSTARTFRAMEDLSI1 /* CIE pointer */
>> -	/* HACK: The dwarf2 unwind routines will subtract 1 from the
>> -	   return address to get an address in the middle of the
>> -	   presumed call instruction.  Since we didn't get here via
>> -	   a call, we need to include the nop before the real start
>> -	   to make up for it.  */
>> -	.long .LSTART_sigreturn-1-.	/* PC-relative start address */
>
>Your version no longer has this nop, nor does the FDE start one
>byte earlier.  Is that no longer required for unwinders?
>See excerpt from dumped DWARF and disassembly for __kernel_sigreturn()
>below.
>
>> -	.long .LEND_sigreturn-.LSTART_sigreturn+1
>> -	.uleb128 0			/* Augmentation */
>...
>
>> -	.align 4
>> -.LENDFDEDLSI1:
>> -
>> -	.long .LENDFDEDLSI2-.LSTARTFDEDLSI2 /* Length FDE */
>> -.LSTARTFDEDLSI2:
>> -	.long .LSTARTFDEDLSI2-.LSTARTFRAMEDLSI1 /* CIE pointer */
>> -	/* HACK: See above wrt unwind library assumptions.  */
>> -	.long .LSTART_rt_sigreturn-1-.	/* PC-relative start address */
>
>Ditto.
>
>> -	.long .LEND_rt_sigreturn-.LSTART_rt_sigreturn+1
>> -	.uleb128 0			/* Augmentation */
>
>Excerpt from dump of DWARF and disassembly with your patch:
>
>$ objdump -d -Wf arch/x86/entry/vdso/vdso32/vdso32.so.dbg
>...
>000001cc 0000003c 00000000 CIE   <-- CIE for __kernel_sigreturn
>  Version:               1
>  Augmentation:          "zRS"
>  Code alignment factor: 1
>  Data alignment factor: -4
>  Return address column: 8
>  Augmentation data:     1b
>  DW_CFA_def_cfa: r4 (esp) ofs 4
>  DW_CFA_offset_extended_sf: r8 (eip) at cfa+56
>  DW_CFA_offset_extended_sf: r0 (eax) at cfa+44
>  DW_CFA_offset_extended_sf: r3 (ebx) at cfa+32
>  DW_CFA_offset_extended_sf: r1 (ecx) at cfa+40
>  DW_CFA_offset_extended_sf: r2 (edx) at cfa+36
>  DW_CFA_offset_extended_sf: r4 (esp) at cfa+28
>  DW_CFA_offset_extended_sf: r5 (ebp) at cfa+24
>  DW_CFA_offset_extended_sf: r6 (esi) at cfa+20
>  DW_CFA_offset_extended_sf: r7 (edi) at cfa+16
>  DW_CFA_offset_extended_sf: r40 (es) at cfa+8
>  DW_CFA_offset_extended_sf: r41 (cs) at cfa+60
>  DW_CFA_offset_extended_sf: r42 (ss) at cfa+72
>  DW_CFA_offset_extended_sf: r43 (ds) at cfa+12
>  DW_CFA_offset_extended_sf: r9 (eflags) at cfa+64
>  DW_CFA_nop
>
>0000020c 00000010 00000044 FDE cie=000001cc pc=00001a40..00001a4a   <-- FDE for __kernel_sigreturn
>  DW_CFA_advance_loc: 1 to 00001a41
>  DW_CFA_def_cfa_offset: 0
>
>[ The FDE covers the range [1a40..1a4a[.  Previously it would have
>started one byte earlier (at the nop), so that the range would have
>been [1a3f..1a4a[.  This is/was required for unwinders that always
>subtract one from the unwound return address, so that it points into
>the instruction that invoked the function (e.g. call) instead of behind
>it, in case it was invoked by a non-returning function.  Such an
>unwinder would now look up IP=1a3f as belonging to int80_landing_pad (and
>use the DWARF rules applicable to its last instruction) instead of
>__kernel_sigreturn (and its rules).  Likewise for __kernel_rt_sigreturn. ]
>
>...
>00001a3c <int80_landing_pad>:
>    1a3c:       5d                      pop    %ebp
>    1a3d:       5a                      pop    %edx
>    1a3e:       59                      pop    %ecx
>    1a3f:       c3                      ret
>
>00001a40 <__kernel_sigreturn>:
>    1a40:       58                      pop    %eax
>    1a41:       b8 77 00 00 00          mov    $0x77,%eax
>    1a46:       cd 80                   int    $0x80
>
>00001a48 <vdso32_sigreturn_landing_pad>:
>    1a48:       0f 0b                   ud2
>    1a4a:       8d b6 00 00 00 00       lea    0x0(%esi),%esi
>
>
>Excerpt without your patch:
>
>$ objdump -d -Wf arch/x86/entry/vdso/vdso32/vdso32.so.dbg
>...
>000001cc 00000010 00000000 CIE  <-- CIE for __kernel_sigreturn and __kernel_rt_sigreturn
>  Version:               1
>  Augmentation:          "zRS"
>  Code alignment factor: 1
>  Data alignment factor: -4
>  Return address column: 8
>  Augmentation data:     1b
>  DW_CFA_nop
>  DW_CFA_nop
>
>000001e0 00000068 00000018 FDE cie=000001cc pc=00001a6f..00001a78  <-- FDE for __kernel_sigreturn
>  DW_CFA_def_cfa_expression (DW_OP_breg4 (esp): 32; DW_OP_deref)
>  DW_CFA_expression: r0 (eax) (DW_OP_breg4 (esp): 48)
>  DW_CFA_expression: r1 (ecx) (DW_OP_breg4 (esp): 44)
>  DW_CFA_expression: r2 (edx) (DW_OP_breg4 (esp): 40)
>  DW_CFA_expression: r3 (ebx) (DW_OP_breg4 (esp): 36)
>  DW_CFA_expression: r5 (ebp) (DW_OP_breg4 (esp): 28)
>  DW_CFA_expression: r6 (esi) (DW_OP_breg4 (esp): 24)
>  DW_CFA_expression: r7 (edi) (DW_OP_breg4 (esp): 20)
>  DW_CFA_expression: r8 (eip) (DW_OP_breg4 (esp): 60)
>  DW_CFA_advance_loc: 2 to 00001a71
>  DW_CFA_def_cfa_expression (DW_OP_breg4 (esp): 28; DW_OP_deref)
>  DW_CFA_expression: r0 (eax) (DW_OP_breg4 (esp): 44)
>  DW_CFA_expression: r1 (ecx) (DW_OP_breg4 (esp): 40)
>  DW_CFA_expression: r2 (edx) (DW_OP_breg4 (esp): 36)
>  DW_CFA_expression: r3 (ebx) (DW_OP_breg4 (esp): 32)
>  DW_CFA_expression: r5 (ebp) (DW_OP_breg4 (esp): 24)
>  DW_CFA_expression: r6 (esi) (DW_OP_breg4 (esp): 20)
>  DW_CFA_expression: r7 (edi) (DW_OP_breg4 (esp): 16)
>  DW_CFA_expression: r8 (eip) (DW_OP_breg4 (esp): 56)
>
>
>[ See how the FDE for __kernel_sigreturn covers the range [1a6f..1a78[.
>An unwinder that always subtracts one from the return address would
>look up IP=1a6f as belonging to __kernel_sigreturn (and use the DWARF
>rules applicable to the nop preceding its symbol).  Likewise for
>__kernel_rt_sigreturn.  Or is that no longer true?  ]
>
>...
>00001a5c <int80_landing_pad>:
>    1a5c:       5d                      pop    %ebp
>    1a5d:       5a                      pop    %edx
>    1a5e:       59                      pop    %ecx
>    1a5f:       c3                      ret
>    1a60:       90                      nop
>    1a61:       8d b4 26 00 00 00 00    lea    0x0(%esi,%eiz,1),%esi
>    1a68:       2e 8d b4 26 00 00 00    lea    %cs:0x0(%esi,%eiz,1),%esi
>    1a6f:       00
>
>00001a70 <__kernel_sigreturn>:
>    1a70:       58                      pop    %eax
>    1a71:       b8 77 00 00 00          mov    $0x77,%eax
>    1a76:       cd 80                   int    $0x80
>
>Thanks and regards,
>Jens

That hack dates from before the signal frame extension. It is no longer necessary.
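
The lookup question raised above can be made concrete with a small toy model. The sketch below is not taken from any real unwinder: the two lookup policies, the FDE class, and the table names are all hypothetical and only illustrate the difference between always subtracting one from the return address and honoring a signal-frame marker (the "S" in "zRS"). The address ranges come from the objdump excerpts quoted above. The int80_landing_pad range is assumed from the disassembly, and its lack of an "S" marker is also an assumption, since its FDE is not shown.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FDE:
    name: str
    start: int          # first covered address (inclusive)
    end: int            # one past the last covered address
    signal_frame: bool  # True when the CIE augmentation string contains "S"


def find_fde(table: Sequence[FDE], pc: int) -> Optional[FDE]:
    """Return the FDE whose half-open range [start, end) contains pc."""
    for fde in table:
        if fde.start <= pc < fde.end:
            return fde
    return None


def lookup_always_minus_one(table: Sequence[FDE], ra: int) -> Optional[FDE]:
    """Hypothetical policy: treat ra as following a call, so look up ra - 1."""
    return find_fde(table, ra - 1)


def lookup_signal_aware(table: Sequence[FDE], ra: int) -> Optional[FDE]:
    """Hypothetical policy: use the exact address if it lies in a signal frame."""
    exact = find_fde(table, ra)
    if exact is not None and exact.signal_frame:
        return exact
    return find_fde(table, ra - 1)


# With the patch: the FDE starts exactly at __kernel_sigreturn (0x1a40).
PATCHED = [
    FDE("int80_landing_pad", 0x1A3C, 0x1A40, False),  # assumed range and flag
    FDE("__kernel_sigreturn", 0x1A40, 0x1A4A, True),
]

# Without the patch: the FDE starts one byte early (0x1a6f), symbol at 0x1a70.
UNPATCHED = [
    FDE("__kernel_sigreturn", 0x1A6F, 0x1A78, True),
]

if __name__ == "__main__":
    cases = (("patched", PATCHED, 0x1A40), ("unpatched", UNPATCHED, 0x1A70))
    for label, table, ra in cases:
        for policy in (lookup_always_minus_one, lookup_signal_aware):
            hit = policy(table, ra)
            print(label, policy.__name__, hit.name if hit else None)
```

In this model the one-byte-early FDE only matters to the always-subtract policy. Whether a given real unwinder follows one policy or the other has to be checked against that unwinder's own source.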