lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wgKTcOx1hhWAGJ-g9_9o7xiGJ9v9n2RskBSCkaUMBxDkw@mail.gmail.com>
Date:   Tue, 27 Dec 2022 11:10:31 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        Kostya Serebryany <kcc@...gle.com>,
        Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Andrey Konovalov <andreyknvl@...il.com>,
        Alexander Potapenko <glider@...gle.com>,
        Taras Madan <tarasmadan@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        "H . J . Lu" <hjl.tools@...il.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Bharata B Rao <bharata@....com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        Ashok Raj <ashok.raj@...el.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCHv13 05/16] x86/uaccess: Provide untagged_addr() and remove
 tags before address check

On Mon, Dec 26, 2022 at 7:08 PM Kirill A. Shutemov
<kirill.shutemov@...ux.intel.com> wrote:
>
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -21,6 +22,37 @@ static inline bool pagefault_disabled(void);
>  # define WARN_ON_IN_IRQ()
>  #endif
>
> +#ifdef CONFIG_X86_64

I think this should be CONFIG_ADDRESS_MASKING or something like that.

This is not a "64 vs 32-bit feature". This is something else.

Even if you then were to select it unconditionally for 64-bit kernels
(but why would you?) it reads better if the #ifdef's make sense.

> +#define __untagged_addr(mm, addr)      ({                              \
> +       u64 __addr = (__force u64)(addr);                               \
> +       s64 sign = (s64)__addr >> 63;                                   \
> +       __addr &= READ_ONCE((mm)->context.untag_mask) | sign;           \

Now the READ_ONCE() doesn't make much sense. There shouldn't be any
data races on that thing.

Plus:

> +#define untagged_addr(addr) __untagged_addr(current->mm, addr)

I think this should at least allow caching it in 'current' without the
mm indirection.

In fact, it might be even better off as a per-cpu variable.

Because it is now in somewhat crititcal code sections:

> -#define get_user(x,ptr) ({ might_fault(); do_get_user_call(get_user,x,ptr); })
> +#define get_user(x,ptr)                                                        \
> +({                                                                     \
> +       might_fault();                                                  \
> +       do_get_user_call(get_user,x,untagged_ptr(ptr)); \
> +})

This is disgusting and wrong.

The whole reason we do do_get_user_call() as a function call is
because we *don't* want to do this kind of stuff at the call sites. We
used to inline it all, but with all the clac/stac and access_ok
checks, it all just ended up ballooning so much that it was much
better to make it a special function call with particular calling
conventions.

That untagged_ptr() should be done in that asm function, not in every call site.

Now, the sad part is that we got *rid* of all this kind of crap not
that long ago when Christoph cleaned up the old legacy set_fs() mess,
and we were able to make the task limit be a constant (ok, be _two_
constants, depending on LA57). So we'd have to re-introduce that nasty
"look up task size dynamically". See commit 47058bb54b57 ("x86: remove
address space overrides using set_fs()") for the removal that would
have to be re-instated.

But see above about "maybe it should be a per-cpu variable" - and
making that ALTERNATIVE th8ing even nastier.

Another alternative mght be to *only* test the sign bit in the
get_user/put_user functions, and just take the fault instead. Right
now we warn about non-canonical addresses because it implies somebody
might have missed an access_ok(), but we'd just mark those
get_user/put_user accesses special.

That would get this all entirely off the critical path. Most other
address masking is for relatively rare things (ie mmap/munmap), but
the user accesses are hot.

Hmm?

             Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ