Date:	Tue, 28 Apr 2009 18:31:09 -0500
From:	Matt Mackall <mpm@...enic.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	fengguang.wu@...el.com, linux-kernel@...r.kernel.org,
	kosaki.motohiro@...fujitsu.com, andi@...stfloor.org,
	adobriyan@...il.com, linux-mm@...ck.org
Subject: Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags

On Tue, 2009-04-28 at 16:02 -0700, Andrew Morton wrote:
> On Tue, 28 Apr 2009 17:46:34 -0500
> Matt Mackall <mpm@...enic.com> wrote:
> 
> > > > +/* a helper function _not_ intended for more general uses */
> > > > +static inline int page_cap_writeback_dirty(struct page *page)
> > > > +{
> > > > +	struct address_space *mapping;
> > > > +
> > > > +	if (!PageSlab(page))
> > > > +		mapping = page_mapping(page);
> > > > +	else
> > > > +		mapping = NULL;
> > > > +
> > > > +	return mapping && mapping_cap_writeback_dirty(mapping);
> > > > +}
> > > 
> > > If the page isn't locked then page->mapping can be concurrently removed
> > > and freed.  This actually happened to me in real-life testing several
> > > years ago.
> > 
> > We certainly don't want to be taking locks per page to build the flags
> > data here. As we don't have any pretense of being atomic, it's ok if we
> > can find a way to do the test that's inaccurate when a race occurs, so
> > long as it doesn't dereference null.
> > 
> > But if there's not an obvious way to do that, we should probably just
> > drop this flag bit for this iteration.
> 
> trylock_page() could be used here, perhaps.
> 
> Then again, why _not_ just do lock_page()?  After all, few pages are
> ever locked.  There will be latency if the caller stumbles across a
> page which is under read I/O, but so be it?

As I mentioned a bit ago, it's not an unreasonable use case to want to do
this for every page in the system, back to back, so per-page overhead
matters. And the odds of stalling on a locked page while visiting 1M pages
under load are probably not negligible.

Our lock primitives are pretty low overhead in the fast path, but every
cycle counts. The new tests and branches this code already adds are a
bit worrisome, but on balance probably worth it.
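
For concreteness, a rough sketch of the trylock_page() variant discussed
above -- hypothetical, not part of the patch -- where a page we can't lock
cheaply is simply reported with the bit clear, so the walk never blocks and
never touches a mapping that might be freed under us:

#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/backing-dev.h>

/*
 * Hypothetical sketch, not the patch: dereference page->mapping only
 * while holding the page lock, and treat any page we can't lock
 * cheaply as "flag clear" so the walk never blocks.
 */
static inline int page_cap_writeback_dirty(struct page *page)
{
	struct address_space *mapping;
	int ret = 0;

	if (PageSlab(page))
		return 0;

	/* page under I/O (or otherwise locked): skip it, best-effort */
	if (!trylock_page(page))
		return 0;

	mapping = page_mapping(page);
	if (mapping && mapping_cap_writeback_dirty(mapping))
		ret = 1;

	unlock_page(page);
	return ret;
}

The trade-off is one trylock/unlock pair per page on top of the existing
tests, in exchange for never dereferencing a mapping that isn't pinned by
the page lock.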

-- 
http://selenic.com : development and support for Mercurial and Linux

