Message-Id: <20130215154235.0fb36f53.akpm@linux-foundation.org>
Date:	Fri, 15 Feb 2013 15:42:35 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Rusty Russell <rusty@...tcorp.com.au>,
	LKML <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...e.de>,
	Stewart Smith <stewart@...mingspork.com>, linux-mm@...ck.org,
	linux-arch@...r.kernel.org
Subject: Re: [patch 1/2] mm: fincore()

On Fri, 15 Feb 2013 18:13:04 -0500
Johannes Weiner <hannes@...xchg.org> wrote:

> On Fri, Feb 15, 2013 at 01:27:38PM -0800, Andrew Morton wrote:
> > On Fri, 15 Feb 2013 01:34:50 -0500
> > Johannes Weiner <hannes@...xchg.org> wrote:
> > 
> > > + * The status is returned in a vector of bytes.  The least significant
> > > + * bit of each byte is 1 if the referenced page is in memory, otherwise
> > > + * it is zero.
> > 
> > Also, this is going to be dreadfully inefficient for some obvious cases.
> > 
> > We could address that by returning the info in some more efficient
> > representation.  That will be run-length encoded in some fashion.
> > 
> > The obvious way would be to populate an array of
> > 
> > struct page_status {
> > 	u32 present:1;
> > 	u32 count:31;
> > };
> > 
> > or whatever.
> 
> I'm having a hard time seeing how this could be extended to more
> status bits without stifling the optimization too much.

See other email: add a syscall arg which specifies the boolean status
which we're searching for.

>  If we just
> add more status bits to one page_status, the likelihood of long runs
> where all bits are in agreement decreases.  But as the optimization
> becomes less and less effective, we are stuck with an interface that
> is more PITA than just using mmap and mincore again.
> 
> The user has to supply a worst-case-sized vector with one struct
> page_status per page in the range, but the per-page item will be
> bigger than with the byte vector because of the additional run length
> variable.

Yes, we'd need to tell the kernel how much storage is available for the
structures.

> However, one struct page_status per run leaves you with a worst case
> of one syscall per page in the range.

Yes.
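
A minimal userspace sketch of the run-length idea discussed above, assuming the proposed struct page_status with u32 spelled as uint32_t. The helper rle_encode(), its mask argument (standing in for the "which boolean status" syscall argument), and the cap/done parameters (standing in for "how much storage is available") are hypothetical names for illustration, not part of any kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Run-length record as proposed in the thread (u32 written as uint32_t). */
struct page_status {
	uint32_t present:1;	/* selected status bit, shared by the whole run */
	uint32_t count:31;	/* number of consecutive pages in the run */
};

#define RUN_MAX 0x7fffffffu	/* largest value the 31-bit count can hold */

/*
 * Hypothetical helper, not a kernel interface.  Encodes vec[0..npages)
 * into at most cap runs, testing only the bit selected by mask.  Returns
 * the number of runs written and stores the number of pages covered in
 * *done, so a caller can resume at vec + *done when the output fills up.
 */
static size_t rle_encode(const unsigned char *vec, size_t npages,
			 unsigned char mask, struct page_status *out,
			 size_t cap, size_t *done)
{
	size_t i = 0, runs = 0;

	while (i < npages && runs < cap) {
		uint32_t bit = (vec[i] & mask) != 0;
		uint32_t len = 0;

		/* Extend the run while the selected bit stays the same. */
		while (i < npages && len < RUN_MAX &&
		       (uint32_t)((vec[i] & mask) != 0) == bit) {
			i++;
			len++;
		}
		out[runs].present = bit;
		out[runs].count = len;
		runs++;
	}
	*done = i;
	return runs;
}
```

With cap set to 1, each call covers exactly one run, so a range whose pages alternate between present and absent needs one call per page, which is the worst case conceded above.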

> I dunno.  The byte vector might not be optimal but its worst cases
> seem more attractive, is just as extensible, and dead simple to use.

But I think "which pages from this 4TB file are in core" will not be an
uncommon usage, and writing a gig of memory to find three pages is just
awful.
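
The arithmetic behind that complaint can be sketched as follows, assuming 4 KiB pages and reading "4TB" as 4 TiB (neither is stated in the thread). The function names are hypothetical. One byte per page gives 2^42 / 2^12 = 2^30 bytes, i.e. a gig, while three resident pages split the range into at most seven runs.

```c
#include <stdint.h>

/* Bytes needed for a one-byte-per-page vector, rounding up to whole pages. */
static uint64_t byte_vector_bytes(uint64_t file_bytes, uint64_t page_size)
{
	return (file_bytes + page_size - 1) / page_size;
}

/*
 * Upper bound on the number of runs when `resident` pages are in core:
 * if every resident page is isolated, absent and present runs alternate,
 * giving resident present runs and at most resident + 1 absent runs.
 */
static uint64_t max_runs(uint64_t resident)
{
	return 2 * resident + 1;
}
```

Under these assumptions byte_vector_bytes(1ULL << 42, 4096) is 1 GiB, whereas max_runs(3) is 7, so at 4 bytes per run-length record the same answer fits in 28 bytes.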

I wonder what the most common usage would be (one should know this
before merging the syscall :)).  I guess "is this relatively-small
range of the file in core" and/or "which pages from this
relatively-small range of the file will I need to read", etc.

The syscall should handle the common usages very well.  But it
shouldn't handle uncommon usages very badly!