[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20070412174201.065068b2.akpm@linux-foundation.org>
Date: Thu, 12 Apr 2007 17:42:01 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Nick Piggin <nickpiggin@...oo.com.au>
Cc: William Lee Irwin III <wli@...omorphy.com>,
Matt Mackall <mpm@...enic.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/13] maps: pagemap, kpagemap, and related cleanups
On Fri, 13 Apr 2007 10:15:24 +1000 Nick Piggin <nickpiggin@...oo.com.au> wrote:
> >>+ ((char *)page)[1] = PAGE_SHIFT;
> >
> >
> > OK.
>
> Shouldn't we just expose page size and endianness by other means? (another file or
> syscall).
I don't think so - this file exposes fairly deep kernel internals and
that's unavoidable, really - it's *supposed* to do that. It is explicitly
designed for monitoring kernel behaviour.
So it needs special handling by userspace. Keeping the number of files
which need such special handling to a minimum will keep the number of
applications which are exposed to kernel changes to a minimum.
> >>+ for (; i < 2 * chunk / KPMSIZE; i += 2, pfn++) {
> >>+ ppage = pfn_to_page(pfn);
> >>+ if (!ppage) {
> >>+ page[i] = 0;
> >>+ page[i + 1] = 0;
> >>+ } else {
> >>+ page[i] = ppage->flags;
> >>+ page[i + 1] = atomic_read(&ppage->_count);
> >>+ }
> >>+ }
> >
> >
> > Not a good idea to expose raw flags in this manner - it changes at the drop
> > of a hat. We'd need to also expose the kernel's PG_foo-to-bitnumber
> > mapping to make this viable.
>
> I don't think it is viable because that makes the flags part of the
> userspace ABI.
It *will* be viable. If the application wants to know if a page is dirty,
it looks up "PG_dirty" in /proc/pg_foo-to-bitnumber and uses PG_dirty's
numerical offset when inspecting fields in /proc/kpagemap. If correctly
designed, such a monitoring application will be able to report upon page
flags which we haven't even thought up yet.
> I wonder what they are needed for.
Poking deeply into the kernel to provide information about kernel state.
There are real-world needs for this, and the people who develop tools to
process this information will have decent kernel understanding and will
know that the file's contents may alter across kernel versions. It sure
beats poking around in /dev/kmem.
I doubt if there's a sensible way in which we can prettify this interface
without losing information. But we should aim to make it as robust as
possible agaisnt future kenrel changes, of course.
And we should satisfy ourselves that all the required information has been
made available. The fact that it will satisfy the Oracle requirement is
encouraging.
Matt, these changes make the new field in /proc/pid/smaps redundant, don't
they?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists