
Message-ID: <CAKv+Gu88sHCmq8t9nAe3Dqx_Sg_+pArt6ev_wDrKLBYSOy2PuA@mail.gmail.com>
Date:   Thu, 20 Jul 2017 18:30:10 +0100
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     James Morse <james.morse@....com>,
        Laura Abbott <labbott@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Laura Abbott <labbott@...oraproject.org>
Cc:     Kernel Hardening <kernel-hardening@...ts.openwall.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Takahiro Akashi <akashi.takahiro@...aro.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Martin <dave.martin@....com>,
        Will Deacon <will.deacon@....com>,
        Kees Cook <keescook@...omium.org>
Subject: Re: [kernel-hardening] Re: [RFC PATCH 6/6] arm64: add VMAP_STACK and
 detect out-of-bounds SP

On 20 July 2017 at 09:56, Ard Biesheuvel <ard.biesheuvel@...aro.org> wrote:
> On 20 July 2017 at 09:36, James Morse <james.morse@....com> wrote:
>> Hi Ard,
>>
>> On 20/07/17 06:35, Ard Biesheuvel wrote:
>>> On 20 July 2017 at 00:32, Laura Abbott <labbott@...hat.com> wrote:
>>>> I didn't notice any performance impact but I also wasn't trying that
>>>> hard. I did try this with a different configuration and ran into
>>>> stackspace errors almost immediately:
>>>>
>>>> [    0.358026] smp: Brought up 1 node, 8 CPUs
>>>> [    0.359359] SMP: Total of 8 processors activated.
>>>> [    0.359542] CPU features: detected feature: 32-bit EL0 Support
>>>> [    0.361781] Insufficient stack space to handle exception!
>>
>> [...]
>>
>>>> [    0.367382] Task stack: [0xffffff8008e80000..0xffffff8008e84000]
>>>> [    0.367519] IRQ stack:  [0xffffffc03bf62000..0xffffffc03bf66000]
>>>
>>> The IRQ stack is not 16K aligned ...
>>
>>>> [    0.367687] ESR: 0x00000000 -- Unknown/Uncategorized
>>>> [    0.367868] FAR: 0x0000000000000000
>>>> [    0.368059] Kernel panic - not syncing: kernel stack overflow
>>>> [    0.368252] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.12.0-00018-ge9cf49d604ef-dirty #23
>>>> [    0.368427] Hardware name: linux,dummy-virt (DT)
>>>> [    0.368612] Call trace:
>>>> [    0.368774] [<ffffff8008087fd8>] dump_backtrace+0x0/0x228
>>>> [    0.368979] [<ffffff80080882c8>] show_stack+0x10/0x20
>>>> [    0.369270] [<ffffff80084602dc>] dump_stack+0x88/0xac
>>>> [    0.369459] [<ffffff800816328c>] panic+0x120/0x278
>>>> [    0.369582] [<ffffff8008088b40>] handle_bad_stack+0xd0/0xd8
>>>> [    0.369799] [<ffffff80080bfb94>] __do_softirq+0x74/0x210
>>>> [    0.370560] SMP: stopping secondary CPUs
>>>> [    0.384269] Rebooting in 5 seconds..
>>>>
>>>> The config is based on what I use for booting my Hikey android
>>>> board. I haven't been able to narrow down exactly which
>>>> set of configs set this off.
>>>>
>>>
>>> ... so for some reason, the percpu atom size change fails to take effect here.
>>
>> I'm not completely up to speed with these series, so this may be noise:
>>
>> When we added the IRQ stack Jungseok Lee discovered that alignment greater than
>> PAGE_SIZE only applies to CPU0. Secondary CPUs read the per-cpu init data into a
>> page-aligned area, but any greater alignment requirement is lost.
>>
>> Because of this the irqstack was only 16-byte aligned, and struct thread_info had
>> to be discovered without depending on stack alignment.
>>
>
> We [attempted to] address that by increasing the per-CPU atom size to
> THREAD_ALIGN if CONFIG_VMAP_STACK=y, but as I am typing this, I wonder
> if that percolates all the way down to the actual vmap() calls. I will
> investigate ...

The issue is easily reproducible in QEMU as well, when building from
the same config. I tracked it down to CONFIG_NUMA=y, which sets
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y, affecting the placement of
the static per-CPU data (including the IRQ stack).
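
For reference, the misalignment can be checked by hand from the stack
bases in the panic log quoted above. The following is only a userspace
sketch, not code from the series: the 16K value is taken from the "not
16K aligned" remark, and the names STACK_ALIGN_SKETCH,
stack_misalignment() and stack_is_aligned() are made up for
illustration.

```c
#include <stdint.h>

/* Assumed alignment: 16K, as in the "not 16K aligned" remark above. */
#define STACK_ALIGN_SKETCH 0x4000ULL

/* Number of bytes by which 'base' sits above the previous 16K boundary. */
static uint64_t stack_misalignment(uint64_t base)
{
	return base & (STACK_ALIGN_SKETCH - 1);
}

/* Non-zero if 'base' lies exactly on a 16K boundary. */
static int stack_is_aligned(uint64_t base)
{
	return stack_misalignment(base) == 0;
}
```

Fed the values from the log, the task stack base 0xffffff8008e80000
passes this check, while the IRQ stack base 0xffffffc03bf62000 is
0x2000 bytes past a 16K boundary.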

However, what I hadn't realised is that the first chunk is referenced
via the linear mapping, so we will need to [vm]allocate the per-CPU
IRQ stacks explicitly, and record the address in a per-CPU pointer
variable instead.
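
To illustrate the shape of that change, here is a rough userspace
analogue, not the actual patch. It uses C11 aligned_alloc() where the
kernel would use a vmalloc-style allocation, and a plain array indexed
by CPU where the kernel would use a per-CPU pointer variable. All names
(NR_CPUS_SKETCH, IRQ_STACK_SIZE_SKETCH, irq_stack_ptr,
init_irq_stacks(), on_irq_stack()) are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH        8        /* assumed CPU count, as in the log */
#define IRQ_STACK_SIZE_SKETCH 0x4000UL /* assumed 16K IRQ stack */

/* Stand-in for a per-CPU pointer: one stack base per CPU. */
static void *irq_stack_ptr[NR_CPUS_SKETCH];

/*
 * Allocate each IRQ stack explicitly, with size-aligned placement,
 * instead of relying on where static per-CPU data ends up.
 * Returns 0 on success, -1 if any allocation fails.
 */
static int init_irq_stacks(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		irq_stack_ptr[cpu] = aligned_alloc(IRQ_STACK_SIZE_SKETCH,
						   IRQ_STACK_SIZE_SKETCH);
		if (!irq_stack_ptr[cpu]) {
			/* Undo the allocations made so far. */
			while (cpu-- > 0) {
				free(irq_stack_ptr[cpu]);
				irq_stack_ptr[cpu] = NULL;
			}
			return -1;
		}
	}
	return 0;
}

/* Non-zero if 'sp' falls inside the IRQ stack recorded for 'cpu'. */
static int on_irq_stack(int cpu, uintptr_t sp)
{
	uintptr_t low = (uintptr_t)irq_stack_ptr[cpu];

	return sp >= low && sp < low + IRQ_STACK_SIZE_SKETCH;
}
```

The point of the indirection is that the alignment is requested from
the allocator for each stack individually, so it no longer depends on
how the first per-CPU chunk is placed.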

I have updated my branch accordingly.