[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20201024162103.a479e06af9bbc6c83ffea1a7@kernel.org>
Date: Sat, 24 Oct 2020 16:21:03 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>,
Masami Hiramatsu <mhiramat@...nel.org>,
x86-ml <x86@...nel.org>, Joerg Roedel <jroedel@...e.de>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] Have insn decoder functions return success/failure
On Fri, 23 Oct 2020 17:12:49 -0700
Andy Lutomirski <luto@...nel.org> wrote:
> On Fri, Oct 23, 2020 at 4:27 PM Borislav Petkov <bp@...en8.de> wrote:
> >
> > On Fri, Oct 23, 2020 at 07:47:04PM +0900, Masami Hiramatsu wrote:
> > > Thanks! I look forward to it.
> >
> > Ok, here's a first stab, it is a single big diff and totally untested
> > but it should show what I mean. I've made some notes while converting,
> > as I went along.
> >
> > Have a look at insn_decode() and its call sites: they are almost trivial
> > now because caller needs simply to do:
> >
> > if (insn_decode(insn, buffer, ...))
> >
> > and not care about any helper functions.
> >
> > For some of the call sites it still makes sense to do a piecemeal insn
> > decoding and I've left them this way but they can be converted too, if
> > one wants.
> >
> > In any case, just have a look please and lemme know if that looks OKish.
> > I'll do the actual splitting and testing afterwards.
> >
> > And what Andy wants can't be done with the decoder because it already
> > gets a fixed size buffer and length - it doesn't do the fetching. The
> > caller does.
> >
> > What you wanna do:
> >
> > > len = min(15, remaining bytes in page);
> > > fetch len bytes;
> > > insn_init();
> > > ret = insn_decode_fully();
> >
> > <--- you can't always know here whether the insn is valid if you don't
> > have all the bytes. But you can always fetch *all* bytes and then give
> > it to the decoder for checking.
> >
> > Also, this doesn't make any sense: try insn decode on a subset of bytes
> > and then if it fails, try it on the whole set of bytes. Why even try the
> > subset - it will almost always fail.
>
> I disagree. A real CPU does exactly what I'm describing. If I stick
> 0xcc at the end of a page and a make the next page not-present, I get
> #BP, not #PF. But if I stick 0x0F at the end of a page and mark the
> next page not-present, I get #PF. If we're trying to decode an
> instruction in user memory, we can kludge it by trying to fetch 15
> bytes and handling -EFAULT by fetching fewer bytes, but that's gross
> and doesn't really have the right semantics. What we actually want is
> to fetch up to the page boundary and try to decode it. If it's a
> valid instruction or if it's definitely invalid, we're done.
> Otherwise we fetch across the page boundary.
>
> Eventually we should wrap this whole mess up in an insn_decode_user()
> helper that does the right thing. And we can then make that helper
> extra fancy by getting PKRU and EPT-hacker-execute-only right, whereas
> we currently get these cases wrong.
+1. To handle the user-space (untrusted) instruction, we need to
take more care about page boundary and presense. Also less side-effect
is perferrable.
Thank you,
--
Masami Hiramatsu <mhiramat@...nel.org>
Powered by blists - more mailing lists