Date:	Fri, 15 Feb 2013 18:13:04 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Rusty Russell <rusty@...tcorp.com.au>,
	LKML <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...e.de>,
	Stewart Smith <stewart@...mingspork.com>, linux-mm@...ck.org,
	linux-arch@...r.kernel.org
Subject: Re: [patch 1/2] mm: fincore()

On Fri, Feb 15, 2013 at 01:27:38PM -0800, Andrew Morton wrote:
> On Fri, 15 Feb 2013 01:34:50 -0500
> Johannes Weiner <hannes@...xchg.org> wrote:
> 
> > + * The status is returned in a vector of bytes.  The least significant
> > + * bit of each byte is 1 if the referenced page is in memory, otherwise
> > + * it is zero.
> 
> Also, this is going to be dreadfully inefficient for some obvious cases.
> 
> We could address that by returning the info in some more efficient
> representation.  That will be run-length encoded in some fashion.
> 
> The obvious way would be to populate an array of
> 
> struct page_status {
> 	u32 present:1;
> 	u32 count:31;
> };
> 
> or whatever.

I'm having a hard time seeing how this could be extended to more
status bits without stifling the optimization too much.  If we just
add more status bits to one page_status, the likelihood of long runs
where all bits are in agreement decreases.  But as the optimization
becomes less and less effective, we are stuck with an interface that
is more of a PITA than just using mmap and mincore again.

The user has to supply a worst-case-sized vector with one struct
page_status per page in the range, but the per-page item will be
bigger than with the byte vector because of the additional run length
variable.
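
To make the size argument concrete, here is a rough userspace sketch
that compresses a mincore-style byte vector into runs of the quoted
struct page_status.  The helper name rle_encode() is made up for
illustration and u32 is spelled uint32_t; nothing here is from the
patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Run-length item from the quoted proposal, with u32 spelled uint32_t. */
struct page_status {
	uint32_t present:1;
	uint32_t count:31;
};

/*
 * Hypothetical helper: compress a byte vector (bit 0 set means the page
 * is resident) into runs.  Returns the number of runs written.  The
 * caller must size 'runs' for the worst case of one run per page.
 */
static size_t rle_encode(const unsigned char *vec, size_t nr_pages,
			 struct page_status *runs)
{
	size_t nr_runs = 0;
	size_t i;

	assert(vec && runs);

	for (i = 0; i < nr_pages; i++) {
		unsigned int present = vec[i] & 1;

		/* Extend the current run unless the state flips or count is full. */
		if (nr_runs && runs[nr_runs - 1].present == present &&
		    runs[nr_runs - 1].count < 0x7fffffff) {
			runs[nr_runs - 1].count++;
		} else {
			runs[nr_runs].present = present;
			runs[nr_runs].count = 1;
			nr_runs++;
		}
	}
	return nr_runs;
}
```

With alternating resident and non-resident pages this emits one run per
page, so the buffer has to hold nr_pages items that are typically four
bytes each, against one byte per page for the plain vector.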

> Another way would be to define the syscall so it returns "number of
> pages present/absent starting at offset `start'".  In other words, one
> call to fincore() will return a single `struct page_status'.  Userspace
> can then walk through the file and generate the full picture, if needed.
> 
> This also gets inefficient in obvious cases, but it's not as obviously
> bad?

Any run-length encoding will have a problem with multiple status bits,
I guess.

Maybe with a mask of bits the user is interested in?

struct page_status {
	unsigned long states;
	unsigned long count;
};

int fincore(int fd, loff_t start, loff_t len,
            unsigned long states_mask,
            struct page_status *status);

However, one struct page_status per run leaves you with a worst case
of one syscall per page in the range.
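
A rough userspace simulation of that walk, to show how the mask and the
worst case interact.  mock_fincore() is a stand-in operating on an
in-memory array, not a real syscall, and the PAGE_RESIDENT and
PAGE_DIRTY bits are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

struct page_status {
	unsigned long states;
	unsigned long count;
};

#define PAGE_RESIDENT 0x1UL	/* hypothetical status bits */
#define PAGE_DIRTY    0x2UL

/*
 * Stand-in for the proposed call: 'cache' plays the role of the file's
 * page cache state, one word per page.  Reports the run of pages that
 * starts at 'start' and shares the same masked state.
 */
static int mock_fincore(const unsigned long *cache, size_t nr_pages,
			size_t start, unsigned long states_mask,
			struct page_status *status)
{
	size_t i;

	assert(cache && status);

	if (start >= nr_pages)
		return -1;

	status->states = cache[start] & states_mask;
	for (i = start + 1; i < nr_pages; i++)
		if ((cache[i] & states_mask) != status->states)
			break;
	status->count = i - start;
	return 0;
}

/* Walk the whole range and return how many calls it took. */
static size_t count_calls(const unsigned long *cache, size_t nr_pages,
			  unsigned long states_mask)
{
	struct page_status st;
	size_t pos = 0, calls = 0;

	while (pos < nr_pages &&
	       !mock_fincore(cache, nr_pages, pos, states_mask, &st)) {
		pos += st.count;
		calls++;
	}
	return calls;
}
```

For a range where every page is resident but every other page is also
dirty, masking for PAGE_RESIDENT alone covers it in one call, while
asking for both bits degrades to one call per page.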

I dunno.  The byte vector might not be optimal, but its worst cases
seem more attractive, it is just as extensible, and it is dead simple
to use.
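
For comparison, this is roughly what consuming a byte vector looks like
today via the existing mmap() + mincore() route.  It is only a sketch:
the helper name resident_pages() is made up, and a byte vector returned
by fincore() would presumably be scanned with the same loop, minus the
mapping.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the resident pages in the first 'len'
 * bytes of 'fd' using mmap() and mincore().  Returns -1 on error.
 */
static long resident_pages(int fd, size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t nr_pages, i;
	unsigned char *vec;
	long resident = -1;
	void *map;

	if (!len)
		return 0;
	if (page_size <= 0)
		return -1;

	/* One status byte per page, rounded up to cover a partial page. */
	nr_pages = (len + (size_t)page_size - 1) / (size_t)page_size;
	vec = malloc(nr_pages);
	if (!vec)
		return -1;

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	if (mincore(map, len, vec) == 0) {
		resident = 0;
		for (i = 0; i < nr_pages; i++)
			resident += vec[i] & 1;	/* bit 0: page is in memory */
	}
	munmap(map, len);
out:
	free(vec);
	return resident;
}
```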