linux-kernel - Re: [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder


Message-ID: <533C6F6E.4080601@linaro.org>
Date:	Wed, 02 Apr 2014 13:13:34 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Johannes Weiner <hannes@...xchg.org>
CC:	Dave Hansen <dave@...1.net>, "H. Peter Anvin" <hpa@...or.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Android Kernel Team <kernel-team@...roid.com>,
	Robert Love <rlove@...gle.com>, Mel Gorman <mel@....ul.ie>,
	Hugh Dickins <hughd@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Dmitry Adamushko <dmitry.adamushko@...il.com>,
	Neil Brown <neilb@...e.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Mike Hommey <mh@...ndium.org>, Taras Glek <tglek@...illa.com>,
	Jan Kara <jack@...e.cz>,
	KOSAKI Motohiro <kosaki.motohiro@...il.com>,
	Michel Lespinasse <walken@...gle.com>,
	Minchan Kim <minchan@...nel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder

On 04/02/2014 12:47 PM, Johannes Weiner wrote:
> On Wed, Apr 02, 2014 at 12:01:00PM -0700, John Stultz wrote:
>> On Wed, Apr 2, 2014 at 10:58 AM, Johannes Weiner <hannes@...xchg.org> wrote:
>>> On Wed, Apr 02, 2014 at 10:40:16AM -0700, John Stultz wrote:
>>>> That point beside, I think the other problem with the page-cleaning
>>>> volatility approach is that there are other awkward side effects. For
>>>> example: Say an application marks a range as volatile. One page in the
>>>> range is then purged. The application, due to a bug or otherwise,
>>>> reads the volatile range. This causes the page to be zero-filled in,
>>>> and the application silently uses the corrupted data (which isn't
>>>> great). More problematic though, is that by faulting the page in,
>>>> they've in effect lost the purge state for that page. When the
>>>> application then goes to mark the range as non-volatile, all pages are
>>>> present, so we'd return that no pages were purged.  From an
>>>> application perspective this is pretty ugly.
>>>>
>>>> Johannes: Any thoughts on this potential issue with your proposal? Am
>>>> I missing something else?
>>> No, this is accurate.  However, I don't really see how this is
>>> different than any other use-after-free bug.  If you access malloc
>>> memory after free(), you might receive a SIGSEGV, you might see random
>>> data, you might corrupt somebody else's data.  This certainly isn't
>>> nice, but it's not exactly new behavior, is it?
>> The part that troubles me is that I see the purged state as kernel
>> data being corrupted by userland in this case. The kernel will tell
>> userspace that no pages were purged, even though they were. Only
>> because userspace made an errant read of a page, and got garbage data
>> back.
> That sounds overly dramatic to me.  First of all, this data still
> reflects accurately the actions of userspace in this situation.  And
> secondly, the kernel does not rely on this data to be meaningful from
> a userspace perspective to function correctly.
<insert dramatic-chipmunk video w/ text overlay "errant read corrupted
volatile page purge state!!!!1">

Maybe you're right, but I feel this is the sort of thing application
developers would be surprised and annoyed by.


> It's really nothing but a use-after-free bug that has consequences for
> no-one but the faulty application.  The thing that IS new is that even
> a read is enough to corrupt your data in this case.
>
> MADV_REVIVE could return 0 if all pages in the specified range were
> present, -Esomething if otherwise.  That would be semantically sound
> even if userspace messes up.

So it's semantically just a combined mincore+dirty operation, and
nothing more?
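
To make the zero-fill semantics under discussion concrete, here is a
minimal userspace sketch in C. It is a toy model only, not kernel code
and not anyone's actual patch: every toy_* name is hypothetical, the
page count and page size are arbitrary, and a return of -1 stands in
for the unspecified "-Esomething" from the proposed MADV_REVIVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPAGES  4    /* arbitrary toy range length, in pages */
#define TOY_PAGE_SZ 16   /* arbitrary toy page size, in bytes */

/* Hypothetical model: purge state is nothing but page presence. */
struct toy_range {
    unsigned char data[TOY_NPAGES][TOY_PAGE_SZ];
    bool present[TOY_NPAGES];
    bool is_volatile;
};

void toy_init(struct toy_range *r, unsigned char fill)
{
    memset(r->data, fill, sizeof(r->data));
    for (size_t i = 0; i < TOY_NPAGES; i++)
        r->present[i] = true;
    r->is_volatile = false;
}

void toy_mark_volatile(struct toy_range *r)
{
    r->is_volatile = true;
}

/* Reclaim may drop a page only while the range is volatile. */
bool toy_purge(struct toy_range *r, size_t page)
{
    assert(page < TOY_NPAGES);
    if (!r->is_volatile || !r->present[page])
        return false;
    r->present[page] = false;
    return true;
}

/* A read of a purged page faults in zeroes and makes it present again. */
unsigned char toy_read(struct toy_range *r, size_t page, size_t off)
{
    assert(page < TOY_NPAGES && off < TOY_PAGE_SZ);
    if (!r->present[page]) {
        memset(r->data[page], 0, TOY_PAGE_SZ);
        r->present[page] = true;   /* the purge is no longer visible */
    }
    return r->data[page][off];
}

/* Stand-in for MADV_REVIVE: 0 if all pages are present, -1 otherwise. */
int toy_revive(struct toy_range *r)
{
    int ret = 0;

    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (!r->present[i])
            ret = -1;
    r->is_volatile = false;
    return ret;
}
```

In this model, mark + purge + revive reports the loss (-1), while
mark + purge + stray read + revive returns 0 even though the page now
holds zeroes rather than the original data. That is the case the
thread is arguing about.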

What are other folks thinking about this? Although I don't particularly
like it, I probably could go along with Johannes' approach, forgoing
SIGBUS for zero-fill and adopting semantics that are, in my mind, a
bit stranger. This would allow for ashmem-style behavior w/ the
additional write-clears-volatile-state and read-clears-purged-state
constraints (which I don't think would be problematic for Android, but
am not totally sure).

But I do worry that these semantics are easier for kernel-mm-developers
to grasp, but are much much harder for application developers to
understand.

Additionally, unless we could really leave access-after-volatile as
totally undefined behavior, this would lock us into O(page) behavior and
would remove the possibility of the O(log(ranges)) behavior Minchan and I
were able to get (admittedly with more complicated code, but something
I was hoping we'd be able to get back to after the base semantics and
interface behavior were understood and merged). Since applications will
have bugs and will access after volatile, we won't be able to get away
with that sort of behavioral flexibility.
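
For contrast, a rough sketch of what range-granular tracking could look
like is below. This is an illustration under stated assumptions, not
the code from the earlier patch sets: the struct vrange layout and both
function names are hypothetical, and a sorted array with binary search
merely stands in for whatever tree structure a real implementation
would use. The point is that the purged flag lives in the range record,
so a lookup costs O(log(ranges)) and does not depend on inspecting each
page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one volatile range covering [start, end). */
struct vrange {
    size_t start;
    size_t end;
    bool purged;   /* set when any page in the range was purged */
};

/*
 * Binary search over an array of ranges that is sorted by start and
 * non-overlapping. Returns the range containing addr, or NULL.
 */
struct vrange *vrange_find(struct vrange *v, size_t n, size_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < v[mid].start)
            hi = mid;
        else if (addr >= v[mid].end)
            lo = mid + 1;
        else
            return &v[mid];
    }
    return NULL;
}

/*
 * Report and clear the purged flag of the range containing addr.
 * Returns 1 if it was purged, 0 if not, -1 if addr is in no range.
 * No per-page state is consulted.
 */
int vrange_unmark(struct vrange *v, size_t n, size_t addr)
{
    struct vrange *r = vrange_find(v, n, addr);
    bool was_purged;

    if (!r)
        return -1;
    was_purged = r->purged;
    r->purged = false;
    return was_purged ? 1 : 0;
}
```

Because the flag is kept per range rather than derived from page
presence, a stray read that faults a page back in would not change what
vrange_unmark() reports in this sketch.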

thanks
-john