Message-ID: <a1980256cf3d2ea8e91707fcda7cf141b27a212d.camel@kernel.crashing.org>
Date: Mon, 03 Sep 2018 10:48:05 +1000
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Jiri Kosina <jikos@...nel.org>
Cc: Jürgen Groß <jgross@...e.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Michael Ellerman <mpe@...erman.id.au>,
Will Deacon <will.deacon@....com>
Subject: Re: Access to non-RAM pages
On Sat, 2018-09-01 at 11:06 -0700, Linus Torvalds wrote:
> [ Adding a few new people to the cc.
>
> The issue is the worry about software-speculative accesses (ie
> things like CONFIG_DCACHE_WORD_ACCESS - not talking about the hw
> speculation now) accessing past RAM into possibly contiguous IO ]
>
> On Sat, Sep 1, 2018 at 10:27 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > If you have a machine with RAM that touches IO, you need to disable
> > the last page, exactly the same way we disable and mark reserved the
> > first page at zero.
So I missed the departure of that train ... stupid question, with
CONFIG_DCACHE_WORD_ACCESS, if that can be unaligned (I assume it can),
what prevents it from crossing into a non-mapped page (not even IO) and
causing an oops ? Looking at a random user in fs/dcache.c it's not a
uaccess-style read with recovery.... Or am I missing something obvious
here ?
IE, should we "reserve" the last page of any memory region (maybe mark
it read-only) to avoid this along with avoiding leakage into IO space ?
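
To make the concern concrete, here is a minimal sketch in plain C (not kernel code) of when a word-sized load spills into the following page. The 4 KiB page size and the names SKETCH_PAGE_SIZE and load_crosses_page() are assumptions made up for illustration; nothing here is taken from fs/dcache.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/*
 * Hypothetical helper: returns true if a load of `len` bytes starting at
 * `addr` touches at least one byte of the next page.
 */
static bool load_crosses_page(uintptr_t addr, size_t len)
{
	/* Offset of the first byte within its page. */
	uintptr_t offset = addr & (SKETCH_PAGE_SIZE - 1);

	/* The load spills over when it extends past the end of the page. */
	return offset + len > SKETCH_PAGE_SIZE;
}
```

An aligned 8-byte load can never cross, but an unaligned one that starts within the last 7 bytes of a page does, which is the case where the neighbouring page (unmapped, or IO) matters.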
> > I thought we already did that.
>
> We don't seem to do that.
>
> And it's not just the last page, it's _any_ last page in a region that
> bumps up to IO. That's actually much more common in the low 4G area on
> PC's, I suspect, although the reserved BIOS ranges always tend to be
> there.
What makes IO more "wrong" than oopsing due to the page not being
mapped ?
> I suspect it should be trivial to do - maybe in
> e820__memblock_setup()? That's where we already trim partial pages
> etc.
>
> In fact, I think this might be done as an extension of commit
> 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
> memblock.reserved"), except making sure that non-RAM regions mark one
> page _previous_ as reserved too.
>
> I assume memory hotplug might have the same issue, and checking
> whether ARM64 and powerpc perhaps might have already done something
> like this (or might need to add it).
>
> We discussed long ago the case of user space mapping IO in user space,
> and decided we didn't care. But the kernel should probably explicitly
> make sure we don't either, even if I can't recall having ever seen a
> machine that actually maps IO contiguously to RAM. The layout always
> tends to end up having holes anyway.
Can't we put the safety in generic memblock ? IE, don't hand out an
allocation that contains the last page of a "block" and handle that last
page in the memblock->buddy transition rather than in arch specific
code ?
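
As a rough illustration of the idea, the sketch below splits the last page off a memory region so that it is never handed out. It is standalone C, not the memblock API: struct sketch_region, sketch_split_guard_page() and the 4 KiB page size are all hypothetical, and it assumes page-aligned regions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for a contiguous "block" of RAM. */
struct sketch_region {
	uint64_t base;	/* page-aligned start address */
	uint64_t size;	/* page-aligned length in bytes */
};

/*
 * Split `in` into a usable part and a one-page guard at the end.
 * Returns false if the region is too small to give up a page.
 */
static bool sketch_split_guard_page(const struct sketch_region *in,
				    struct sketch_region *usable,
				    struct sketch_region *guard)
{
	if (in->size < 2 * SKETCH_PAGE_SIZE)
		return false;

	/* Everything except the last page may be allocated. */
	usable->base = in->base;
	usable->size = in->size - SKETCH_PAGE_SIZE;

	/* The last page stays reserved and is never handed out. */
	guard->base = in->base + usable->size;
	guard->size = SKETCH_PAGE_SIZE;
	return true;
}
```

With this split, a word access that starts near the end of the usable part lands in the guard page, which is still RAM, rather than in whatever follows the region.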
Cheers,
Ben.