Date:   Tue, 14 Aug 2018 11:54:17 -0700
From:   Kees Cook <keescook@...omium.org>
To:     "Yuanxiaofeng (XiAn)" <yuanxiaofeng1@...wei.com>
Cc:     Matthew Wilcox <willy@...radead.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] usercopy: optimize stack check flow when the
 page-spanning test is disabled

(Please use contextual quoting in replies... mixing contextual with
top-posting becomes very hard to read...)

On Tue, Aug 14, 2018 at 6:02 AM, Yuanxiaofeng (XiAn)
<yuanxiaofeng1@...wei.com> wrote:
> On Tue, Aug 14, 2018 at 8:35PM Matthew Wilcox wrote:
>> On Tue, Aug 14, 2018 at 08:17:31PM +0800, Xiaofeng Yuan wrote:
>>> check_heap_object() checks for objects that span multiple pages and
>>> for slab objects. When the page-spanning test is disabled,
>>> check_heap_object() is redundant for multi-page objects. However,
>>> kernel stacks span multiple pages under certain conditions:
>>> CONFIG_ARCH_THREAD_STACK_ALLOCATOR is not defined and
>>> THREAD_SIZE >= PAGE_SIZE. In that case, we can skip
>>> check_heap_object() for kernel stacks to improve performance.
>>> Similarly, virtually-mapped stacks can also skip check_heap_object(),
>>> because virt_addr_valid() returns false for them, so
>>> check_heap_object() returns early anyway.
>>
>> Why not just check_stack_object() first, then check_heap_object() second?

Most of the dynamically-sized copies (i.e. those that trigger
__check_object_size() being used at all) come out of the heap. Stack
copies tend to be a fixed size. That said, the stack check is pretty
cheap: if the object isn't bounded by task_stack_page(current) ...
+THREAD_SIZE, it kicks out immediately. The frame-walking will only
happen if it IS actually stack (and once finished will short-circuit
all remaining tests).
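
For illustration, reordering things the way Willy suggests might look
roughly like this (a simplified sketch built from the mm/usercopy.c
helpers discussed in this thread -- not a tested patch, and the
zero-size/bogus-address handling is abbreviated):

void __check_object_size(const void *ptr, unsigned long n, bool to_user)
{
        if (!n)
                return;

        /* Reject NULL-with-size and address-space wraparound first. */
        check_bogus_address((const unsigned long)ptr, n, to_user);

        /*
         * Stack first: the bounds test is cheap, and a good stack
         * object short-circuits all the remaining checks.
         */
        switch (check_stack_object(ptr, n)) {
        case NOT_STACK:
                /* Not on the current task's stack; keep checking. */
                break;
        case GOOD_FRAME:
        case GOOD_STACK:
                return;
        default:
                usercopy_abort("process stack", NULL, to_user, 0, n);
        }

        /* Then the slab / page-span checks. */
        check_heap_object(ptr, n, to_user);

        /* Finally, make sure we're not exposing kernel text. */
        check_kernel_text_object((const unsigned long)ptr, n, to_user);
}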

> 1. When THREAD_SIZE is less than PAGE_SIZE, the stack is allocated with kmem_cache_alloc_node(); it is slab memory and will go through __check_heap_object().

Correct, though if an architecture supports stack frame analysis, this
is a narrower check than the bulk heap object check. (i.e. it may
have sub-object granularity to determine if a copy spans a stack
frame.) This supports the idea of just doing the stack check first,
though.
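
To make the frame-granularity point concrete, check_stack_object() is
roughly the following (paraphrasing from memory, so treat the details
as approximate):

static int check_stack_object(const void *obj, unsigned long len)
{
        const void * const stack = task_stack_page(current);
        const void * const stackend = stack + THREAD_SIZE;
        int ret;

        /* Object is not on the stack at all. */
        if (obj + len <= stack || stackend <= obj)
                return NOT_STACK;

        /* Object partially overlaps the stack: always reject. */
        if (obj < stack || stackend < obj + len)
                return BAD_STACK;

        /*
         * Architectures with frame walking can narrow this down to a
         * single frame (GOOD_FRAME or BAD_STACK); the generic fallback
         * reports nothing and we accept the whole stack.
         */
        ret = arch_within_stack_frames(stack, stackend, obj, len);
        if (ret)
                return ret;

        return GOOD_STACK;
}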

> 2. When CONFIG_HARDENED_USERCOPY_PAGESPAN is enabled, multi-page stacks will get some checking in check_page_span().

PAGESPAN checking is buggy for a lot of reasons, unfortunately. It
should generally stay disabled unless someone is working on getting
rid of allocations that _should_ have marked themselves as spanning
pages. It's unclear if this is even a solvable problem in the kernel
right now due to how networking code manages skbs.

> So, I added some restrictions to make sure the useful checks are not skipped.

It'd be nice to find some workloads that visibly change when the
heap/stack check order is swapped. I think the known worst case
(small-packet UDP flooding) wouldn't get worse, since both checks will
be performed in either case.

(Maybe we should also short-circuit early in heap checks if it IS a
valid heap object: no reason to go do the kernel text check after
that...)
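
Sketching that idea: check_heap_object() currently returns void, so it
would need to grow a return value; something like the below, where
NOT_HEAP and GOOD_HEAP are names invented just for this sketch:

        /*
         * Hypothetical: if the copy is entirely inside a known-good
         * slab or page allocation, it cannot overlap kernel text, so
         * skip the text check.
         */
        if (check_heap_object(ptr, n, to_user) == GOOD_HEAP)
                return;

        check_kernel_text_object((const unsigned long)ptr, n, to_user);

which would replace the tail of __check_object_size() in the sketch
further up.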

-Kees

-- 
Kees Cook
Pixel Security
