linux-hardening - Re: [PATCH -next] powerpc: add support for syscall stack randomization

Message-ID: <87pmki7uwf.fsf@mpe.ellerman.id.au>
Date:   Thu, 12 May 2022 23:17:04 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     xiujianfeng <xiujianfeng@...wei.com>,
        Nicholas Piggin <npiggin@...il.com>, benh@...nel.crashing.org,
        christophe.leroy@...roup.eu, mark.rutland@....com,
        paulus@...ba.org, tglx@...utronix.de
Cc:     linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH -next] powerpc: add support for syscall stack randomization

xiujianfeng <xiujianfeng@...wei.com> writes:
> On 2022/5/10 17:23, Nicholas Piggin wrote:
>> Excerpts from Xiu Jianfeng's message of May 5, 2022 9:19 pm:
>>> Add support for adding a random offset to the stack while handling
>>> syscalls. This patch uses mftb() instead of get_random_int() for better
>>> performance.
>>
...
>>
>>> @@ -405,6 +407,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
>>>
>>>   	/* Restore user access locks last */
>>>   	kuap_user_restore(regs);
>>> +	choose_random_kstack_offset(mftb() & 0xFF);
>>>
>>>   	return ret;
>>>   }
>> So this seems to be what x86 and s390 do, but why are we choosing a
>> new offset for every interrupt when it's only used on a syscall?
>> I would rather you do what arm64 does and just choose the offset
>> at the end of system_call_exception.
> Thanks for your suggestion, will do in v2.
>>
>> I wonder why the choose is separated from the add? I guess it's to
>> avoid a data dependency for stack access on an expensive random
>> function, so that makes sense (a comment would be nice in the
>> generic code).
>>
>> I don't actually know if mftb() is cheaper here than a RNG. It
>> may not be conditioned all that well either. I would be tempted

> #if defined(__powerpc64__) && (defined(CONFIG_PPC_CELL) || defined(CONFIG_E500))
> #define mftb()          ({unsigned long rval;                           \
>                          asm volatile(                                   \
>                                  "90:    mfspr %0, %2;\n"                \
>                                  ASM_FTR_IFSET(                          \
>                                          "97:    cmpwi %0,0;\n"          \
>                                          "       beq- 90b;\n", "", %1)   \
>                          : "=r" (rval) \
>                          : "i" (CPU_FTR_CELL_TB_BUG), "i" (SPRN_TBRL) : "cr0"); \
>                          rval;})
> #elif defined(CONFIG_PPC_8xx)
> #define mftb()          ({unsigned long rval;   \
>                          asm volatile("mftbl %0" : "=r" (rval)); rval;})
> #else
> #define mftb()          ({unsigned long rval;   \
>                          asm volatile("mfspr %0, %1" : \
>                                       "=r" (rval) : "i" (SPRN_TBRL)); rval;})
> #endif /* !CONFIG_PPC_CELL */
>
> There are 3 implementations of mftb() in
> arch/powerpc/include/asm/vdso/timebase.h.
>
> The last two cases have only one instruction, so it's obviously cheaper
> than get_random_int,

Just because it's one instruction doesn't mean it's obviously cheaper.
On some CPUs mftb takes tens of cycles, and can also stall the pipeline.

That said, get_random_u32() does look pretty complicated: it takes a
lock and so on. It's also silly to call get_random_u32() for 4 bits of
randomness.

My initial impression was that mftb() is too predictable to be useful
against a determined attacker. But looking closer I see that
choose_random_kstack_offset() XORs the value we pass with the existing
value. So that makes me less worried about using mftb().
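
To illustrate the XOR point, here is a minimal userspace sketch. It is a
model under stated assumptions, not the kernel code: kstack_offset_model,
choose_offset_model() and next_offset_model() are hypothetical names, a
plain static variable stands in for the per-CPU state, and the 0x3FF cap
is assumed to mirror the 10-bit limit discussed in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU offset state (assumption). */
static uint32_t kstack_offset_model;

/*
 * Model of the "choose" step: the new sample is XORed into the
 * existing state rather than replacing it.
 */
static void choose_offset_model(uint32_t rand)
{
	kstack_offset_model ^= rand;
}

/*
 * Model of the value consumed at the next syscall entry. The 0x3FF
 * cap is an assumption mirroring the 10-bit limit mentioned above.
 */
static uint32_t next_offset_model(void)
{
	return kstack_offset_model & 0x3FF;
}
```

Starting from a zero state, folding in 0xA5 and then 0x3C leaves 0x99, so
knowing only the most recent timebase sample is not enough to recover the
offset; the attacker would also need the accumulated history.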

We could additionally call choose_random_kstack_offset(get_random_int())
less regularly, e.g. during context switch. But I guess that's too
infrequent to actually make any difference.

But limiting it to 4 bits of randomness seems insufficient. It seems
like we should allow the full 6 (10) bits, and anyone turning this
option on should probably also consider increasing their stack size.
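
One way to read the 4 versus 6 (10) figures, sketched below in userspace
C. This assumes the stack pointer is kept 16-byte aligned, so the low 4
bits of any offset are discarded; that alignment value and the helper
names distinct_offsets() and entropy_bits() are assumptions made for
illustration, not something taken from the patch.

```c
#include <stdint.h>

/*
 * Count the distinct stack offsets that survive after masking a raw
 * value with `mask` and rounding down to `align` bytes. Assumes mask
 * has the form 2^n - 1 and align is a non-zero power of two.
 */
static unsigned int distinct_offsets(uint32_t mask, uint32_t align)
{
	unsigned int count = 0;
	uint32_t v;

	if (align == 0)
		return 0;

	/* Every aligned value in [0, mask] is one reachable offset. */
	for (v = 0; v <= mask; v++)
		if (v % align == 0)
			count++;

	return count;
}

/* Integer log2 of a power-of-two count, i.e. bits of entropy. */
static unsigned int entropy_bits(unsigned int n)
{
	unsigned int bits = 0;

	while (n > 1) {
		n >>= 1;
		bits++;
	}
	return bits;
}
```

Under that assumption, a 0xFF mask yields 16 distinct offsets (4 bits),
while a 0x3FF mask yields 64 (6 bits out of the 10 passed in).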

Also did you check the help text about stack-protector under
HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET?

cheers