lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 25 Jan 2022 23:53:32 -0800
From:   Dan Li <ashimida@...ux.alibaba.com>
To:     gcc-patches@....gnu.org, richard.earnshaw@....com,
        marcus.shawcroft@....com, kyrylo.tkachov@....com, hp@....gnu.org,
        ndesaulniers@...gle.com, nsz@....gnu.org, pageexec@...il.com,
        qinzhao@....gnu.org, richard.sandiford@....com,
        linux-hardening@...r.kernel.org
Cc:     pcc@...gle.com, samitolvanen@...gle.com, ardb@...nel.org,
        keescook@...omium.org
Subject: Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler
 support for Shadow Call Stack

Hi, all,

Sorry for bothering.

I'm trying to commit aarch64 scs code to the gcc and there is an issue
that I'm not sure about, could someone give me some suggestions?
(To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) )

When clang enables scs, the following instructions are usually generated:

str     x30, [x18], 8
ldp     x29, x30, [sp], 16
......
ldp     x29, x30, [sp], 16		## x30 pop
ldr     x30, [x18, -8]!   		## x30 pop again
ret

The x30 register is popped twice here, Richard suggested that we can
omit the first x30 pop here.

AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm
missing something with the kernel, could someone give some suggestions?

The previous discussion can be found here [1].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html

Thanks a lot!
Dan

On 1/25/22 22:51, Dan Li wrote:
> 
> 
> On 1/25/22 02:19, Richard Sandiford wrote:
>> Dan Li <ashimida@...ux.alibaba.com> writes:
>>>>> +
>>>>>       if (flag_stack_usage_info)
>>>>>         current_function_static_stack_size = constant_lower_bound (frame_size);
>>>>> @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall)
>>>>>           RTX_FRAME_RELATED_P (insn) = 1;
>>>>>         }
>>>>> +  /* Pop return address from shadow call stack.  */
>>>>> +  if (aarch64_shadow_call_stack_enabled ())
>>>>> +    emit_insn (gen_scs_pop ());
>>>>> +
>>>>
>>>> This looks correct, but following on from the above, I guess we continue
>>>> to restore x30 from the frame in the traditional way first, and then
>>>> overwrite it with the SCS value.  I think the output code would be
>>>> slightly neater if we suppressed the first restore of x30.
>>>>
>>> Yes, the current epilogue is like:
>>>       .......
>>>       ldp     x29, x30, [sp], #16
>>> +   ldr     x30, [x18, #-8]!
>>>
>>> I think may be we can keep the two x30 pops here, for two reasons:
>>> 1) The implementation of scs in clang is to pop x30 twice, it might be
>>> better to be consistent with clang here[1].
>>
>> This doesn't seem a very compelling reason on its own, unless it's
>> explicitly meant to be observable behaviour.  (But then, observed how?)
>>
> 
> Well, probably sticking to pop x30 twice is not a good idea.
> AFAICT, there doesn't seem to be an explicit requirement that
> this behavior must be followed.
> 
> BTW:
> Do we still need to emit the .cfi_restore 30 directive here if we
> want to save a pop x30? (Sorry I'm not familiar enough with DWARF.)
> 
> Since the aarch64 linux kernel always enables -fno-omit-frame-pointer,
> the generated assembly code may be as follows:
> 
> str     x30, [x18], 8
> ldp     x29, x30, [sp], 16
> ......
> ldr     x29, [sp], 16
>                          ## Do we still need a .cfi_restore 30 here
> .cfi_restore 29
> .cfi_def_cfa_offset 0
> ldr     x30, [x18, -8]!
>                          ## Or may be a non-CFA based directive here
> ret
> 
>>> 2) If we keep the first restore of x30, it may provide some flexibility
>>> for other scenarios. For example, we can directly patch the scs_push/pop
>>> insns in the binary to turn it into a binary without scs;
>>
>> Is that a supported (and ideally documented) use case?  If it is,
>> then it becomes important for correctness that the compiler emits
>> exactly the opcodes in the original patch.  If it isn't, then the
>> compiler can emit other instructions that have the same effect.
>>
> 
> Oh, no, this is just a little trick that can be used for compatibility.
> (Maybe some scenarios such as our company sometimes need to be
> compatible with some non-open source binaries from different
> manufacturers, two pops could make life easier :). )
> 
> If this is not a consideration for our community, please ignore
> this request.
> 
>> I'd like a definitive ruling on this from the kernel folks before
>> the patch goes in.
>>
> 
> Ok, I'll cc some kernel folks to make sure I didn't miss something.
> 
>> If binary patching is supposed to be possible then scs_push and
>> scs_pop *do* need to be separate define_insns.  But they also need
>> to have some magic unspec that differentiates them from normal
>> pushes and pops, e.g.:
>>
>>    (set ...mem...
>>         (unspec:DI [...reg...] UNSPEC_SCS_PUSH))
>>
>> so that there is no chance that the pattern would be treated as
>> a normal move and optimised accordingly.
>>
> 
> Yeah, this template looks more appropriate if it is to be treated
> as a special directive.
> 
> Thanks for your suggestions,
> Dan

Powered by blists - more mailing lists