lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 15 Oct 2020 14:47:28 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     hpa@...or.com
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86 <x86@...nel.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Numfor Mbiziwo-Tiapo <nums@...gle.com>
Subject: Re: [PATCH v2] x86/insn, tools/x86: Fix some potential undefined behavior.

On Thu, Oct 15, 2020 at 2:35 PM <hpa@...or.com> wrote:
>
> On October 15, 2020 9:12:16 AM PDT, Ian Rogers <irogers@...gle.com> wrote:
> >From: Numfor Mbiziwo-Tiapo <nums@...gle.com>
> >
> >Don't perform unaligned loads in __get_next and __peek_nbyte_next as
> >these are forms of undefined behavior.
> >
> >These problems were identified using the undefined behavior sanitizer
> >(ubsan) with the tools version of the code and perf test. Part of this
> >patch was previously posted here:
> >https://lore.kernel.org/lkml/20190724184512.162887-4-nums@google.com/
> >
> >v2. removes the validate_next check and merges the 2 changes into one
> >as
> >requested by Masami Hiramatsu <mhiramat@...nel.org>
> >
> >Signed-off-by: Ian Rogers <irogers@...gle.com>
> >Signed-off-by: Numfor Mbiziwo-Tiapo <nums@...gle.com>
> >---
> > arch/x86/lib/insn.c       | 4 ++--
> > tools/arch/x86/lib/insn.c | 4 ++--
> > 2 files changed, 4 insertions(+), 4 deletions(-)
> >
> >diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
> >index 404279563891..be88ab250146 100644
> >--- a/arch/x86/lib/insn.c
> >+++ b/arch/x86/lib/insn.c
> >@@ -20,10 +20,10 @@
> >       ((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
> >
> > #define __get_next(t, insn)   \
> >-      ({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
> >+      ({ t r; memcpy(&r, insn->next_byte, sizeof(t)); insn->next_byte +=
> >sizeof(t); r; })
> >
> > #define __peek_nbyte_next(t, insn, n) \
> >-      ({ t r = *(t*)((insn)->next_byte + n); r; })
> >+      ({ t r; memcpy(&r, (insn)->next_byte + n, sizeof(t)); r; })
> >
> > #define get_next(t, insn)     \
> >       ({ if (unlikely(!validate_next(t, insn, 0))) goto err_out;
> >__get_next(t, insn); })
> >diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
> >index 0151dfc6da61..92358c71a59e 100644
> >--- a/tools/arch/x86/lib/insn.c
> >+++ b/tools/arch/x86/lib/insn.c
> >@@ -20,10 +20,10 @@
> >       ((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr)
> >
> > #define __get_next(t, insn)   \
> >-      ({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
> >+      ({ t r; memcpy(&r, insn->next_byte, sizeof(t)); insn->next_byte +=
> >sizeof(t); r; })
> >
> > #define __peek_nbyte_next(t, insn, n) \
> >-      ({ t r = *(t*)((insn)->next_byte + n); r; })
> >+      ({ t r; memcpy(&r, (insn)->next_byte + n, sizeof(t)); r; })
> >
> > #define get_next(t, insn)     \
> >       ({ if (unlikely(!validate_next(t, insn, 0))) goto err_out;
> >__get_next(t, insn); })
>
> Wait, what?
>
> You are taking about x86-specific code, and on x86 unaligned memory accesses are supported, well-defined, and ubiquitous.

On why this is undefined behavior:
https://lore.kernel.org/lkml/CAP-5=fU2XBoOa2=00VCuWYqsLUzMSMzUXY63ZJt9rz-NJ+vYwA@mail.gmail.com/

> This is B.S. at best, and unless the compiler turns the memcpy() right back into what you started with, deleterious for performance.

On performance, the memcpys are fixed size and so lowered to loads on
x86 by any reasonable compiler. See the thread above.

> If you have a *very* good reason for this kind of churn, wrap it in the unaligned access macros, but using memcpy() is insane. All you are doing is making the code worse.

The decoder is a shared code and using unaligned macros makes life
hard for the other users of the code. Memcpy is the "standard"
workaround for this kind of undefined behavior.
https://lore.kernel.org/lkml/e4269cb2-d8e6-da26-6afd-a9df72d4be36@intel.com/

For motivation, beyond just having perf be sanitizer clean, see discussion here:
https://lore.kernel.org/lkml/CAP-5=fUoSGy3NAzTSbF3YLEPABSs7oPsxLkCx36XkEzzm341yw@mail.gmail.com/
https://lore.kernel.org/lkml/160208761921.7002.1321765913567405137.tip-bot2@tip-bot2/

Thanks,
Ian

> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

Powered by blists - more mailing lists