linux-kernel - Re: [PATCH v7 1/4] syscalls: Restore address limit after a syscall

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170425062324.pdpi5v7ypobw74ki@gmail.com>
Date:   Tue, 25 Apr 2017 08:23:24 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Kees Cook <keescook@...omium.org>
Cc:     Thomas Garnier <thgarnie@...gle.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Arnd Bergmann <arnd@...db.de>,
        Dave Hansen <dave.hansen@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Howells <dhowells@...hat.com>,
        René Nyffenegger <mail@...enyffenegger.ch>,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Oleg Nesterov <oleg@...hat.com>,
        Stephen Smalley <sds@...ho.nsa.gov>,
        Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
        Ingo Molnar <mingo@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Rik van Riel <riel@...hat.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        Brian Gerst <brgerst@...il.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Russell King <linux@...linux.org.uk>,
        Will Deacon <will.deacon@....com>,
        Catalin Marinas <catalin.marinas@....com>,
        Mark Rutland <mark.rutland@....com>,
        James Morse <james.morse@....com>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "kernel-hardening@...ts.openwall.com" 
        <kernel-hardening@...ts.openwall.com>
Subject: Re: [PATCH v7 1/4] syscalls: Restore address limit after a syscall


* Kees Cook <keescook@...omium.org> wrote:

> On Mon, Apr 10, 2017 at 9:44 AM, Thomas Garnier <thgarnie@...gle.com> wrote:
> > This patch ensures a syscall does not return to user-mode with a kernel
> > address limit. If that happened, a process can corrupt kernel-mode
> > memory and elevate privileges.
> >
> > For example, it would mitigation this bug:
> >
> > - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
> >
> > The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> > added so each architecture can optimize this change.
> >
> > Signed-off-by: Thomas Garnier <thgarnie@...gle.com>
> > Tested-by: Kees Cook <keescook@...omium.org>
> 
> Ingo, I think this series is ready. Can you pull it? (And if not, what
> should next steps be?)

I have some feedback for other patches in this series, plus for this one as well:

> > +/*
> > + * Called before coming back to user-mode. Returning to user-mode with an
> > + * address limit different than USER_DS can allow to overwrite kernel memory.
> > + */
> > +static inline void verify_pre_usermode_state(void) {
> > +       BUG_ON(!segment_eq(get_fs(), USER_DS));
> > +}

That's not standard kernel coding style.

Also, patch titles should start with a verb - 75% of the series doesn't.

> > +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> > +#define __CHECK_USER_CALLER() \
> > +       bool user_caller = segment_eq(get_fs(), USER_DS)
> > +#define __VERIFY_PRE_USERMODE_STATE() \
> > +       if (user_caller) verify_pre_usermode_state()
> > +#else
> > +#define __CHECK_USER_CALLER()
> > +#define __VERIFY_PRE_USERMODE_STATE()
> > +asmlinkage void address_limit_check_failed(void);
> > +#endif

> > +#ifdef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE

That Kconfig name is way too long.

Plus please don't put logical operations into Kconfig names.

> > +/*
> > + * This function is called when an architecture specific implementation detected
> > + * an invalid address limit. The generic user-mode state checker will finish on
> > + * the appropriate BUG_ON.
> > + */
> > +asmlinkage void address_limit_check_failed(void)
> > +{
> > +       verify_pre_usermode_state();
> > +       panic("address_limit_check_failed called with a valid user-mode state");
> > +}
> > +#endif

Awful naming all around:

	verify_pre_usermode_state()
	address_limit_check_failed()

Both names start with very common names that makes one read these again and again. 
(And yes, there's lots of bad names in the kernel, but we should not follow bad 
examples.)

Best practice for such functionality is to use a common prefix that is both easy 
to recognize and easy to skip. For example we could use 'addr_limit_check' as the 
prefix:

	addr_limit_check_failed()
	addr_limit_check_syscall()

No need to over-specify it that it's a "pre" check - it's obvious from existing 
implementation and should be documented in the function itself for new 
implementations.

Harmonize the Kconfig namespace to the common prefix as well, i.e. use something 
like:

	CONFIG_ADDR_LIMIT_CHECK

No need to add 'ARCH' I think - an architecture that enables this should get it 
unconditionally.

etc.

It's all cobbled together I'm afraid and will need more iterations.

Thanks,

	Ingo