lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <6F5FEFFD-0A9A-4181-8D15-5FC323632BA6@amacapital.net>
Date:   Wed, 11 Jul 2018 14:34:10 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Jann Horn <jannh@...gle.com>
Cc:     yu-cheng.yu@...el.com, the arch/x86 maintainers <x86@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        kernel list <linux-kernel@...r.kernel.org>,
        linux-doc@...r.kernel.org, Linux-MM <linux-mm@...ck.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        Arnd Bergmann <arnd@...db.de>, bsingharora@...il.com,
        Cyrill Gorcunov <gorcunov@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Florian Weimer <fweimer@...hat.com>, hjl.tools@...il.com,
        Jonathan Corbet <corbet@....net>, keescook@...omiun.org,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
        Peter Zijlstra <peterz@...radead.org>,
        ravi.v.shankar@...el.com, vedvyas.shanbhogue@...el.com
Subject: Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support



> On Jul 11, 2018, at 2:10 PM, Jann Horn <jannh@...gle.com> wrote:
> 
>> On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu <yu-cheng.yu@...el.com> wrote:
>> 
>> This patch adds basic shadow stack enabling/disabling routines.
>> A task's shadow stack is allocated from memory with VM_SHSTK
>> flag set and read-only protection.  The shadow stack is
>> allocated to a fixed size.
>> 
>> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@...el.com>
> [...]
>> diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
>> new file mode 100644
>> index 000000000000..96bf69db7da7
>> --- /dev/null
>> +++ b/arch/x86/kernel/cet.c
> [...]
>> +static unsigned long shstk_mmap(unsigned long addr, unsigned long len)
>> +{
>> +       struct mm_struct *mm = current->mm;
>> +       unsigned long populate;
>> +
>> +       down_write(&mm->mmap_sem);
>> +       addr = do_mmap(NULL, addr, len, PROT_READ,
>> +                      MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK,
>> +                      0, &populate, NULL);
>> +       up_write(&mm->mmap_sem);
>> +
>> +       if (populate)
>> +               mm_populate(addr, populate);
>> +
>> +       return addr;
>> +}
> 
> How does this interact with UFFDIO_REGISTER?
> 
> Is there an explicit design decision on whether FOLL_FORCE should be
> able to write to shadow stacks? I'm guessing the answer is "yes,
> FOLL_FORCE should be able to write to shadow stacks"? It might make
> sense to add documentation for this.

FOLL_FORCE should be able to write them, IMO. Otherwise we’ll need a whole new debugging API.

By the time an attacker can do FOLL_FORCE writes, the attacker can directly modify *text*, and CET is useless.  We should probably audit all uses of FOLL_FORCE and remove as many as we can get away with.

> 
> Should the kernel enforce that two shadow stacks must have a guard
> page between them so that they can not be directly adjacent, so that
> if you have too much recursion, you can't end up corrupting an
> adjacent shadow stack?

I think the answer is a qualified “no”. I would like to instead enforce a general guard page on all mmaps that don’t use MAP_FORCE. We *might* need to exempt any mmap with an address hint for compatibility.

My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing.

Hmm. Linux should maybe add something like Windows’ “reserved” virtual memory. It’s basically a way to ask for a VA range that explicitly contains nothing and can be subsequently be turned into something useful with the equivalent of MAP_FORCE.

> 
>> +int cet_setup_shstk(void)
>> +{
>> +       unsigned long addr, size;
>> +
>> +       if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
>> +               return -EOPNOTSUPP;
>> +
>> +       size = in_ia32_syscall() ? SHSTK_SIZE_32:SHSTK_SIZE_64;
>> +       addr = shstk_mmap(0, size);
>> +
>> +       /*
>> +        * Return actual error from do_mmap().
>> +        */
>> +       if (addr >= TASK_SIZE_MAX)
>> +               return addr;
>> +
>> +       set_shstk_ptr(addr + size - sizeof(u64));
>> +       current->thread.cet.shstk_base = addr;
>> +       current->thread.cet.shstk_size = size;
>> +       current->thread.cet.shstk_enabled = 1;
>> +       return 0;
>> +}
> [...]
>> +void cet_disable_free_shstk(struct task_struct *tsk)
>> +{
>> +       if (!cpu_feature_enabled(X86_FEATURE_SHSTK) ||
>> +           !tsk->thread.cet.shstk_enabled)
>> +               return;
>> +
>> +       if (tsk == current)
>> +               cet_disable_shstk();
>> +
>> +       /*
>> +        * Free only when tsk is current or shares mm
>> +        * with current but has its own shstk.
>> +        */
>> +       if (tsk->mm && (tsk->mm == current->mm) &&
>> +           (tsk->thread.cet.shstk_base)) {
>> +               vm_munmap(tsk->thread.cet.shstk_base,
>> +                         tsk->thread.cet.shstk_size);
>> +               tsk->thread.cet.shstk_base = 0;
>> +               tsk->thread.cet.shstk_size = 0;
>> +       }
>> +
>> +       tsk->thread.cet.shstk_enabled = 0;
>> +}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ