lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 30 May 2013 12:43:44 -0500
From:	Seth Jennings <sjenning@...ux.vnet.ibm.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Dan Magenheimer <dan.magenheimer@...cle.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Nitin Gupta <ngupta@...are.org>,
	Minchan Kim <minchan@...nel.org>,
	Konrad Wilk <konrad.wilk@...cle.com>,
	Robert Jennings <rcj@...ux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@...ibm.com>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <jweiner@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Dave Hansen <dave@...1.net>, Joe Perches <joe@...ches.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Cody P Schafer <cody@...ux.vnet.ibm.com>,
	Hugh Dickens <hughd@...gle.com>,
	Paul Mackerras <paulus@...ba.org>,
	Heesub Shin <heesub.shin@...sung.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, devel@...verdev.osuosl.org
Subject: Re: [PATCHv12 2/4] zbud: add to mm/

On Wed, May 29, 2013 at 02:29:04PM -0700, Andrew Morton wrote:
> On Wed, 29 May 2013 14:09:02 -0700 (PDT) Dan Magenheimer <dan.magenheimer@...cle.com> wrote:
> 
> > > memory_failure() is merely an example of a general problem: code which
> > > reads from the memmap[] array and expects its elements to be of type
> > > `struct page'.  Other examples might be memory hotplugging, memory leak
> > > checkers etc.  I have vague memories of out-of-tree patches
> > > (bigphysarea?) doing this as well.
> > > 
> > > It's a general problem to which we need a general solution.
> > 
> > <Obi-tmem Kenobe slowly materializes... "use the force, Luke!">
> > 
> > One could reasonably argue that any code that makes incorrect
> > assumptions about the contents of a struct page structure is buggy
> > and should be fixed.
> 
> Well it has type "struct page" and all code has a right to expect the
> contents to match that type.
> 
> >  Isn't the "general solution" already described
> > in the following comment, excerpted from include/linux/mm.h, which
> > implies that "scribbling on existing pageframes" [carefully], is fine?
> > (And, if not, shouldn't that comment be fixed, or am I misreading
> > it?)
> > 
> > <start excerpt>
> >  * For the non-reserved pages, page_count(page) denotes a reference count.
> >  *   page_count() == 0 means the page is free. page->lru is then used for
> >  *   freelist management in the buddy allocator.
> >  *   page_count() > 0  means the page has been allocated.
> 
> Well kinda maybe.  How all the random memmap-peekers handle this I do
> not know.  Setting PageReserved is a big hammer which should keep other
> little paws out of there, although I guess it's abusive of whatever
> PageReserved is supposed to mean.
> 
> It's what we used to call a variant record.  The tag is page.flags and
> the protocol is, umm,
> 
> PageReserved: doesn't refer to a page at all - don't touch
> PageSlab: belongs to slab or slub
> !PageSlab: regular kernel/user/pagecache page

In the !PageSlab case, the page _count has to be considered to determine if the
page is a free page or if it is an allocated non-slab page.

So looking at the fields that need to remained untouched in the struct page for
the memmap-peekers, they are
- page->flags
- page->_count

Is this correct?

> 
> Are there any more?
> 
> So what to do here?  How about
> 
> - Position the zbud fields within struct page via the preferred
>   means: editing its definition.
> 
> - Decide upon and document the means by which the zbud variant is tagged

I'm not sure if there is going to be a way to tag zbud pages in particular
without using a page flag.  However, if we can tag it as a non-slab allocated
kernel page with no userspace mappings, that could be sufficient.  I think this
can be done with:

!PageSlab(p) && page_count(p) > 0 && page_mapcount(p) <= 0

An alternative is to set PG_slab for zbud pages then we get all the same
treatment as slab pages, which is basically what we want. Setting PG_slab
also conveys that no assumption can be made about the contents of _mapcount.

However, a memmap-peeker could call slab functions on the page which obviously
won't be under the control of the slab allocator. Afaict though, it doesn't
seem that any of them do this since there aren't any functions in the slab
allocator API that take raw struct pages.  The worst I've seen is calling
shrink_slab in an effort to get the slab allocator to free up the page.

In summary, I think that maintaining a positive page->_count and setting
PG_slab on zbud pages should provide safety against existing memmap-peekers.

Do you agree?

Seth

> 
> - Demonstrate how this is safe against existing memmap-peekers
> 
> - Do all this without consuming another page flag :)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ