[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez2cuqe_VYhhaqw8Hcyswv47cmz2XmkqNdvkXEhokMVaXg@mail.gmail.com>
Date: Tue, 13 Aug 2019 17:29:09 +0200
From: Jann Horn <jannh@...gle.com>
To: Michal Hocko <mhocko@...nel.org>,
Daniel Gruss <daniel.gruss@...k.tugraz.at>,
"Joel Fernandes (Google)" <joel@...lfernandes.org>
Cc: kernel list <linux-kernel@...r.kernel.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Brendan Gregg <bgregg@...flix.com>,
Catalin Marinas <catalin.marinas@....com>,
Christian Hansen <chansen3@...co.com>,
Daniel Colascione <dancol@...gle.com>, fmayer@...gle.com,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Joel Fernandes <joelaf@...gle.com>,
Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
kernel-team <kernel-team@...roid.com>,
Linux API <linux-api@...r.kernel.org>,
linux-doc@...r.kernel.org,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Mike Rapoport <rppt@...ux.ibm.com>,
Minchan Kim <minchan@...nel.org>, namhyung@...gle.com,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Robin Murphy <robin.murphy@....com>,
Roman Gushchin <guro@...com>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Suren Baghdasaryan <surenb@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Todd Kjos <tkjos@...gle.com>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Vlastimil Babka <vbabka@...e.cz>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH v5 1/6] mm/page_idle: Add per-pid idle page tracking using
virtual index
On Tue, Aug 13, 2019 at 12:09 PM Michal Hocko <mhocko@...nel.org> wrote:
> On Mon 12-08-19 20:14:38, Jann Horn wrote:
> > On Wed, Aug 7, 2019 at 7:16 PM Joel Fernandes (Google)
> > <joel@...lfernandes.org> wrote:
> > > The page_idle tracking feature currently requires looking up the pagemap
> > > for a process followed by interacting with /sys/kernel/mm/page_idle.
> > > Looking up PFN from pagemap in Android devices is not supported by
> > > unprivileged process and requires SYS_ADMIN and gives 0 for the PFN.
> > >
> > > This patch adds support to directly interact with page_idle tracking at
> > > the PID level by introducing a /proc/<pid>/page_idle file. It follows
> > > the exact same semantics as the global /sys/kernel/mm/page_idle, but now
> > > looking up PFN through pagemap is not needed since the interface uses
> > > virtual frame numbers, and at the same time also does not require
> > > SYS_ADMIN.
> > >
> > > In Android, we are using this for the heap profiler (heapprofd) which
> > > profiles and pin points code paths which allocates and leaves memory
> > > idle for long periods of time. This method solves the security issue
> > > with userspace learning the PFN, and while at it is also shown to yield
> > > better results than the pagemap lookup, the theory being that the window
> > > where the address space can change is reduced by eliminating the
> > > intermediate pagemap look up stage. In virtual address indexing, the
> > > process's mmap_sem is held for the duration of the access.
> >
> > What happens when you use this interface on shared pages, like memory
> > inherited from the zygote, library file mappings and so on? If two
> > profilers ran concurrently for two different processes that both map
> > the same libraries, would they end up messing up each other's data?
>
> Yup PageIdle state is shared. That is the page_idle semantic even now
> IIRC.
>
> > Can this be used to observe which library pages other processes are
> > accessing, even if you don't have access to those processes, as long
> > as you can map the same libraries? I realize that there are already a
> > bunch of ways to do that with side channels and such; but if you're
> > adding an interface that allows this by design, it seems to me like
> > something that should be gated behind some sort of privilege check.
>
> Hmm, you need to be priviledged to get the pfn now and without that you
> cannot get to any page so the new interface is weakening the rules.
> Maybe we should limit setting the idle state to processes with the write
> status. Or do you think that even observing idle status is useful for
> practical side channel attacks? If yes, is that a problem of the
> profiler which does potentially dangerous things?
I suppose read-only access isn't a real problem as long as the
profiler isn't writing the idle state in a very tight loop... but I
don't see a usecase where you'd actually want that? As far as I can
tell, if you can't write the idle state, being able to read it is
pretty much useless.
If the profiler only wants to profile process-private memory, then
that should be implementable in a safe way in principle, I think, but
since Joel said that they want to profile CoW memory as well, I think
that's inherently somewhat dangerous.
Powered by blists - more mailing lists